April 2026

How to Test Your Site for AI Visibility (Fast)

You deploy. The site loads. No errors. Two days later: zero AI traffic, no citations, no visibility. We see this constantly — a React or Vite app returns 200 OK, the HTML is 4 KB, contains 12 words, and is mostly script tags. The browser renders fine. AI crawlers get nothing.


The Real Problem

If your initial HTML is empty, your AI visibility is zero. No partial credit. Not "mostly indexed." Not "partially extracted." Zero.

The trap is that everything looks fine. Status code 200. Lighthouse green. The page renders perfectly in your browser. Search Console eventually shows the URL as "Discovered." But ChatGPT, Claude, and Perplexity never quote you. Why? Because they read a different document than you do.

A real failing page we audited last week:

  • HTML size: 4.2 KB
  • Visible text: 12 words
  • DOM: 1 empty <div id="root"> + 6 script tags
  • Browser-rendered version: 13,547 words, 77.5 KB

Same URL. Two completely different documents. AI only ever sees the first one.

How AI Crawlers Actually Work

AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, and the rest) do exactly three things, reliably:

  1. Request the URL
  2. Read the raw HTML response
  3. Extract text, links, and structure
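The three steps above can be sketched in a few lines. This is a minimal illustration using Python's stdlib html.parser, not any crawler's actual implementation; the point it demonstrates is the constraint itself: only what is in the raw response can be extracted.

```python
# Sketch: what an AI crawler can get from your *raw* HTML response.
# No JavaScript runs, so anything injected after load simply doesn't exist.
from html.parser import HTMLParser


class RawExtractor(HTMLParser):
    """Collects visible text, links, and headings from raw HTML."""

    SKIP = {"script", "style", "noscript"}
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.text_parts, self.links, self.headings = [], [], []
        self._skip_depth = 0
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        if tag in self.HEADINGS:
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        if tag in self.HEADINGS:
            self._in_heading = False

    def handle_data(self, data):
        if self._skip_depth:
            return  # text inside <script>/<style> is not visible content
        text = data.strip()
        if text:
            self.text_parts.append(text)
            if self._in_heading:
                self.headings.append(text)


def extract(raw_html: str):
    p = RawExtractor()
    p.feed(raw_html)
    return " ".join(p.text_parts), p.links, p.headings


# A script-shell page yields nothing:
shell = '<html><body><div id="root"></div><script src="/a.js"></script></body></html>'
text, links, headings = extract(shell)
print(len(text.split()))  # 0 words: this is all an AI crawler gets
```

Run the same function against your curl output and the gap becomes concrete: an empty root div extracts to zero words, zero headings, zero links.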

They do not wait for:

  • React hydration
  • API calls or async data fetches
  • Client-side routing or rendering
  • Lazy-loaded components
  • Retries on failure

So your site has two versions. The browser version is 300 KB+ of fully-painted DOM after JavaScript runs. The initial HTML is 4–10 KB, an empty root div, no meaningful text. AI systems only see the second one.

For deeper context on why this happens, see How AI Crawlers Read Your Website and What AI Crawlers Actually Extract.

What Most Guides Get Wrong

Most SEO advice is written for Googlebot circa 2020. That advice is misleading for AI:

"Google can render JS, so SPAs are fine."

Even for Google, rendering is delayed, deprioritized, and often skipped (see Why "Google Renders JavaScript" Is Misleading). AI crawlers don't render at all.

"Client-side rendering is fine."

Fine for users. Invisible to AI. CSR pages with no SSR fallback are zero-citation by default.

"Just improve content quality."

Content quality doesn't matter if the HTML response doesn't contain the content. You can't optimize what isn't there.

If the HTML doesn't contain content, AI doesn't see content. That's the whole rule.

What We See in Production

Four repeatable failure modes. We see all four every week across React, Vite, and Lovable apps.

1. Script shell pages (most common)

Signals: HTML < 10 KB, visible text < 50 chars, ~80% of the DOM is script tags.

Outcome: Zero AI extraction. UI renders fine in the browser, curl returns only scripts. Covered in detail in Script Shell Pages and Your HTML Is Only 4KB.

2. Partial deploy failures

Signals: Bundle 404s or a CDN blocks the JS chunk. HTML size unchanged (~5 KB). Console shows a bundle error. Page is visually blank.

Outcome: Users see a broken UI, AI sees a blank page, status code is still 200. See Why Your Site Randomly Breaks After Deploy.

3. Hydration-only content

Signals: HTML contains layout/nav only — no paragraph text, no headings. All content loads via API after mount.

Outcome: AI gets structure with no content, ignores the page. See Hydration Crashes.

4. Silent regressions after deploy

Signals: A page used to ship 2,500 words. After a deploy, HTML drops 120 KB → 8 KB. Visible text drops 90%.

Outcome: AI visibility disappears overnight. No alert fires unless you track it. Guard tracks this as major text drop (>40%) and major DOM drop (>50%).

Quick Test: What Do Bots Actually See? (~30 seconds)

Most people guess. Don't.

Run this test and look at the actual response your site returns to bots.

1. Fetch your page as Googlebot

Use your terminal:

curl -A "Googlebot" https://yourdomain.com

Look for:

  • Real visible text (not just <div id="root">)
  • Meaningful content in the HTML
  • Page size (should not be tiny)

2. Compare bot vs browser

Now test what a real browser gets:

curl -A "Mozilla/5.0" https://yourdomain.com

If these responses are different, Google is indexing a different page than your users see.

Stop guessing — measure it.

Real example: 253 words vs 13,547

We see this constantly. Here's a real example from production: Googlebot saw 253 words and 2 KB of HTML. A browser saw 13,547 words and 77.5 KB. Same URL — completely different content.

[Screenshot: bot vs browser comparison — 253 words for Googlebot vs 13,547 words for a rendered browser on the same URL]

If your HTML doesn't contain the content, Google doesn't either.

Compare Googlebot vs browser on your site → HTTP Debug Tool

3. Check for common failure signals

We see this all the time in production:

  • HTML under ~1KB → usually empty shell
  • Visible text under ~200 characters → thin or missing content
  • Missing <title> or <h1> → weak or broken page
  • Large difference between bot vs browser HTML → rendering issue
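The first three signals are mechanical enough to script against a single curl response. A hedged sketch follows (thresholds copied from the list above; the function name and heuristics are illustrative, not a DataJelly API — the fourth signal needs two responses and is covered in the comparison step):

```python
import re


def visible_text(raw_html: str) -> str:
    """Crude visible-text estimate: drop script/style bodies, then all tags."""
    no_scripts = re.sub(r"(?is)<(script|style)[^>]*>.*?</\1>", "", raw_html)
    return " ".join(re.sub(r"(?s)<[^>]+>", " ", no_scripts).split())


def failure_signals(raw_html: str) -> list[str]:
    """Flag the common failure signals in one raw HTML response."""
    issues = []
    if len(raw_html.encode()) < 1024:
        issues.append("HTML under ~1KB: usually an empty shell")
    if len(visible_text(raw_html)) < 200:
        issues.append("visible text under ~200 chars: thin or missing content")
    if not re.search(r"(?i)<title[^>]*>", raw_html):
        issues.append("missing <title>: weak or broken page")
    if not re.search(r"(?i)<h1[\s>]", raw_html):
        issues.append("missing <h1>: weak or broken page")
    return issues


shell = '<html><body><div id="root"></div><script src="/a.js"></script></body></html>'
for issue in failure_signals(shell):
    print(issue)  # all four single-response signals fire on a script shell
```

Pipe your curl output into a file and feed it to `failure_signals`; an empty list means the page at least clears the floor.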

Use the DataJelly Visibility Test (Recommended)

You can run this without touching curl. It shows you:

  • Raw HTML returned to bots (Googlebot, Bing, GPTBot, etc.)
  • Fully rendered browser version
  • Side-by-side differences in word count, HTML size, links, and content
Run Visibility Test — Free

What this test tells you (no guessing)

After running this, you'll know:

  • Whether your HTML is actually indexable
  • Whether bots are seeing partial content
  • Whether rendering is breaking in production

This is the difference between "I think SEO is set up" and "I know what Google is indexing."

If you don't understand why this happens, read: Why Google Can't See Your SPA

If this test fails

You have three real options:

  • SSR: works if you can keep it stable in production
  • Prerendering: breaks with dynamic content and scale
  • Edge Rendering: reflects real production output without app changes

If you do nothing, you will not rank consistently. Learn how Edge Rendering works →

This issue doesn't show up in Lighthouse. It shows up in rankings.

Run the Test · Ask a Question

Practical Checklist (Fast Testing Workflow)

The whole workflow takes under 5 minutes per page. Run it after every deploy.

1. Fetch raw HTML as an AI bot

curl -H "User-Agent: GPTBot" https://yourdomain.com

Check immediately:

  • HTML size (target: > 30 KB)
  • Real paragraph text (not "Loading…" or empty divs)
  • Headings, links, structured data

2. Compare to a browser response

curl -H "User-Agent: Mozilla/5.0" https://yourdomain.com

If the two responses are wildly different in size or word count, your bot version is incomplete. Run a side-by-side at /seo-tools/http-debug.
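If you want a number instead of eyeballing two terminal dumps, a small comparison script does the job. This is an illustrative sketch (the 2x word-count cutoff is my assumption), not the HTTP Debug Tool itself:

```python
import re


def response_stats(html: str) -> dict:
    """Byte size and visible word count of one raw HTML response."""
    no_scripts = re.sub(r"(?is)<(script|style)[^>]*>.*?</\1>", "", html)
    text = " ".join(re.sub(r"(?s)<[^>]+>", " ", no_scripts).split())
    return {"bytes": len(html.encode()), "words": len(text.split())}


def looks_complete(bot_html: str, browser_html: str, max_ratio: float = 2.0) -> bool:
    """True when the bot response carries roughly as many words as the browser one."""
    bot = response_stats(bot_html)
    browser = response_stats(browser_html)
    word_ratio = browser["words"] / max(bot["words"], 1)
    return word_ratio <= max_ratio


shell = '<div id="root"></div><script src="/a.js"></script>'
rich = "<p>" + "word " * 400 + "</p>"
print(looks_complete(shell, rich))  # False: the bot version is missing content
```

Save each curl response to a file, read both in, and a `False` here is the "wildly different" case the step above describes.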

3. Inspect the actual HTML shape

A failing SPA looks like this:

<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<title>Acme — modern SaaS</title>
</head>
<body>
<div id="root"></div>
<script type="module" src="/assets/index-a3f7.js"></script>
<script src="https://cdn.example.com/analytics.js"></script>
<!-- ...4 more script tags... -->
</body>
</html>

If your response looks like that, AI sees nothing. Check for a real <h1>, body text, and meta tags.
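That inspection can be automated too. A rough script-shell detector follows; the two-script and 50-character cutoffs are illustrative assumptions tuned to the signals described earlier, not fixed rules:

```python
import re


def is_script_shell(raw_html: str) -> bool:
    """True when the markup is mostly script tags with almost no visible text."""
    script_count = len(re.findall(r"(?i)<script\b", raw_html))
    stripped = re.sub(r"(?is)<(script|style)[^>]*>.*?</\1>", "", raw_html)
    text = " ".join(re.sub(r"(?s)<[^>]+>", " ", stripped).split())
    return script_count >= 2 and len(text) < 50


# The failing SPA shape shown above trips the detector:
failing = (
    '<!doctype html><html><head><title>Acme</title></head>'
    '<body><div id="root"></div>'
    '<script src="/a.js"></script><script src="/b.js"></script></body></html>'
)
print(is_script_shell(failing))  # True
```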

4. Disable JavaScript in your browser

DevTools → Command Palette → "Disable JavaScript" → reload. If the page goes blank or shows a spinner, your AI visibility is broken. Bots experience that exact view.

5. Look for script-heavy responses

Open the response. If you see multiple large script tags and no inline content, you have a script shell page. Run Page Validator to score bot-readiness automatically.

6. Track HTML size and text length over time

Diff between deploys:

  • HTML size drop > 50% → broken
  • Visible text drop > 40% → major issue
  • Headings count drops to 0 → critical
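A minimal deploy-diff sketch using the thresholds above (the metric dict shape is my assumption, not Guard's actual format):

```python
def deploy_diff(baseline: dict, current: dict) -> list[str]:
    """Compare two deploys; each dict has html_bytes, text_chars, headings."""
    alerts = []

    def drop(before, after):
        return 0.0 if before == 0 else (before - after) / before

    if drop(baseline["html_bytes"], current["html_bytes"]) > 0.5:
        alerts.append("BROKEN: HTML size dropped more than 50%")
    if drop(baseline["text_chars"], current["text_chars"]) > 0.4:
        alerts.append("MAJOR: visible text dropped more than 40%")
    if baseline["headings"] > 0 and current["headings"] == 0:
        alerts.append("CRITICAL: headings count fell to 0")
    return alerts


# The "2,500 words → 8 KB shell" regression described earlier fires all three:
alerts = deploy_diff(
    {"html_bytes": 120_000, "text_chars": 15_000, "headings": 8},
    {"html_bytes": 8_000, "text_chars": 900, "headings": 0},
)
print(alerts)
```

Collect the three numbers after every deploy, persist the previous run as the baseline, and any non-empty alert list is your signal to roll back.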

7. Repeat after every deploy

This breaks in production when bundles fail, configs drift, or a CDN blocks an asset. If you're not checking after deploys, you're blind. Automate it with Guard.

[Screenshot placeholder: side-by-side terminal output of curl -H "User-Agent: GPTBot" vs browser fetch on the same URL]

Real Thresholds (Not Theoretical)

These map directly to production failures we see every week:

Metric                 | Healthy      | At-risk        | Broken
HTML size              | 30–200 KB    | 10–30 KB       | < 10 KB
Visible text           | > 500 words  | 200–500 words  | < 200 chars
Headings (incl. H1)    | ≥ 3          | 1–2            | 0
Text drop vs baseline  | < 10%        | 10–40%         | > 40%
HTML drop vs baseline  | < 20%        | 20–50%         | > 50%
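The table translates directly into a classifier. A sketch with the threshold values taken from the table (the function itself and the tier precedence are illustrative, not part of DataJelly):

```python
def classify(html_kb: float, words: int, text_chars: int, headings: int) -> str:
    """Tier one page against the absolute thresholds: healthy / at-risk / broken."""
    # Any single "broken" signal wins; otherwise any "at-risk" signal.
    if html_kb < 10 or text_chars < 200 or headings == 0:
        return "broken"
    if html_kb < 30 or words < 500 or headings <= 2:
        return "at-risk"
    return "healthy"


print(classify(html_kb=120, words=1500, text_chars=9000, headings=8))  # healthy
print(classify(html_kb=5, words=12, text_chars=60, headings=0))        # broken
```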

Prerender vs SSR vs Edge

If your test fails, you have three real options. Most teams pick the wrong one.

Prerender

Works when: pages are static and rarely change.

Breaks when: content updates frequently or invalidation fails. We see snapshots showing two-month-old pricing. See Hidden Costs of Prerendering.

SSR (Next.js)

Works: HTML contains full content.

Costs: full app rewrite, server complexity, slower TTFB. Often a 6–12 week project.

Edge proxy (DataJelly)

Behavior: bots get fully-rendered HTML snapshots; AI crawlers get clean Markdown.

Result: 80–200 KB HTML, full content, no rewrite. Works with React, Vite, Lovable.

Full breakdown: Prerender vs SSR vs Edge Rendering.

If your content is not in the initial HTML, it does not exist for AI.

Not after hydration. Not after API calls. Only what's in the first response counts. Most modern SPAs fail this test by default.

The DataJelly Approach

DataJelly fixes this without touching your app. Edge serves fully-rendered HTML snapshots to bots and clean AI Markdown to GPTBot/ClaudeBot/PerplexityBot. Guard monitors the real signals — HTML size, visible text, DOM changes, rendering failures — so when a deploy breaks your visibility, you know in minutes, not weeks.

  • Edge proxy delivers 80–200 KB rendered HTML to bots
  • AI Markdown for GPTBot, ClaudeBot, PerplexityBot
  • Guard tracks size/text drops across deploys
  • Works with React, Vite, and Lovable SPAs — no rewrites
Run the Visibility Test — Free · Start 7-Day Free Trial · Ask a Question

Related Diagnostic Tools

Visibility Test

Compare bot vs browser HTML side-by-side

Page Validator

Check bot-readiness and HTML completeness

HTTP Bot Comparison

Compare GPTBot vs browser responses

Site Crawler

Audit HTML quality across all routes


Related Reading

Why "Google Renders JavaScript" Is Misleading

Google's rendering is delayed, deprioritized, and often skipped. AI doesn't render at all.

What AI Crawlers Actually Extract

GPTBot, ClaudeBot, PerplexityBot don't render JS. Here's what they pull from your HTML.

Script Shell Pages

When your HTML is one div and 15 script tags, this is what bots see.

Your HTML Is Only 4KB

Why a 4KB shell isn't SEO-compatible — and what a healthy response looks like.

Hydration Crashes: The Silent Killer

Why pages render in your browser but ship blank HTML to bots.

Why Your Site Randomly Breaks After Deploy

Bundle 404s and config drift produce 200 OKs with empty content.

Hidden Costs of Prerendering

Stale snapshots and broken invalidation. Why prerendering trades one problem for another.

Prerender vs SSR vs Edge Rendering

What actually works for SEO and AI with real production data.



© 2026 DataJelly. All rights reserved. Built with love for the modern web.