How to Test Your Site for AI Visibility (Fast)
You deploy. The site loads. No errors. Two days later: zero AI traffic, no citations, no visibility. We see this constantly — a React or Vite app returns 200 OK, the HTML is 4 KB, contains 12 words, and is mostly script tags. The browser renders fine. AI crawlers get nothing.
The Real Problem
If your initial HTML is empty, your AI visibility is zero. No partial credit. Not "mostly indexed." Not "partially extracted." Zero.
The trap is that everything looks fine. Status code 200. Lighthouse green. The page renders perfectly in your browser. Search Console eventually shows the URL as "Discovered." But ChatGPT, Claude, and Perplexity never quote you. Why? Because they read a different document than you do.
A real failing page we audited last week:
- HTML size: 4.2 KB
- Visible text: 12 words
- DOM: 1 empty <div id="root"> + 6 script tags
- Browser-rendered version: 13,547 words, 77.5 KB
Same URL. Two completely different documents. AI only ever sees the first one.
How AI Crawlers Actually Work
AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, and the rest) do exactly three things, reliably:
- Request the URL
- Read the raw HTML response
- Extract text, links, and structure
They do not wait for:
- React hydration
- API calls or async data fetches
- Client-side routing or rendering
- Lazy-loaded components
- Retries on failure
So your site has two versions. The browser version is 300 KB+ of fully-painted DOM after JavaScript runs. The initial HTML is 4–10 KB, an empty root div, no meaningful text. AI systems only see the second one.
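If you want to see that gap on your own site, one rough check is to compare the byte size of the raw response with the DOM after JavaScript runs. This is a sketch, assuming curl plus a local Chrome/Chromium install (the binary name varies by platform) and yourdomain.com as a placeholder:

```bash
# Raw HTML, exactly as a non-rendering crawler receives it
curl -s https://yourdomain.com | wc -c

# DOM after JavaScript executes (requires a local Chrome/Chromium binary)
google-chrome --headless --dump-dom https://yourdomain.com | wc -c
```

On a script shell page the first number is a few KB and the second is an order of magnitude larger.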
For deeper context on why this happens, see How AI Crawlers Read Your Website and What AI Crawlers Actually Extract.
What Most Guides Get Wrong
Most SEO advice is written for Googlebot circa 2020. That advice is misleading for AI:
"Google can render JS, so SPAs are fine."
Even Google's rendering is delayed and often skipped entirely (see Why "Google Renders JavaScript" Is Misleading). AI crawlers don't render at all.
"Client-side rendering is fine."
Fine for users. Invisible to AI. CSR pages with no SSR fallback are zero-citation by default.
"Just improve content quality."
Content quality doesn't matter if the HTML response doesn't contain the content. You can't optimize what isn't there.
If the HTML doesn't contain content, AI doesn't see content. That's the whole rule.
What We See in Production
Four repeatable failure modes. We see all four every week across React, Vite, and Lovable apps.
Script shell pages (most common)
Signals: HTML < 10 KB, visible text < 50 chars, ~80% of the DOM is script tags.
Outcome: Zero AI extraction. UI renders fine in the browser, curl returns only scripts. Covered in detail in Script Shell Pages and Your HTML Is Only 4KB.
Partial deploy failures
Signals: Bundle 404s or a CDN blocks the JS chunk. HTML size unchanged (~5 KB). Console shows a bundle error. Page is visually blank.
Outcome: Users see a broken UI, AI sees a blank page, status code is still 200. See Why Your Site Randomly Breaks After Deploy.
Hydration-only content
Signals: HTML contains layout/nav only — no paragraph text, no headings. All content loads via API after mount.
Outcome: AI gets structure with no content, ignores the page. See Hydration Crashes.
Silent regressions after deploy
Signals: A page used to ship 2,500 words. After a deploy, HTML drops 120 KB → 8 KB. Visible text drops 90%.
Outcome: AI visibility disappears overnight. No alert fires unless you track it. Guard tracks this as major text drop (>40%) and major DOM drop (>50%).
Quick Test: What Do Bots Actually See?
Most people guess. Don't.
Run this test and look at the actual response your site returns to bots.
Fetch your page as Googlebot
Use your terminal:
curl -A "Googlebot" https://yourdomain.comLook for:
- Real visible text (not just
<div id="root">) - Meaningful content in the HTML
- Page size (should not be tiny)
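To put rough numbers on those checks, strip the tags and count what is left. A crude sketch (sed-based tag stripping is approximate, and inline script contents can inflate the word count slightly; the URL is a placeholder):

```bash
URL="https://yourdomain.com"   # placeholder: your page

# Total bytes in the raw HTML
curl -sA "Googlebot" "$URL" | wc -c

# Rough visible word count after stripping tags
curl -sA "Googlebot" "$URL" | sed 's/<[^>]*>/ /g' | wc -w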
Compare bot vs browser
Now test what a real browser gets:
curl -A "Mozilla/5.0" https://yourdomain.comIf these responses are different, Google is indexing a different page than your users see.
Stop guessing — measure it.
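One way to quantify the difference, again with yourdomain.com as a placeholder:

```bash
URL="https://yourdomain.com"

BOT=$(curl -sA "Googlebot" "$URL" | wc -c | tr -d ' ')
BROWSER=$(curl -sA "Mozilla/5.0" "$URL" | wc -c | tr -d ' ')

echo "Googlebot UA: $BOT bytes"
echo "Browser UA:   $BROWSER bytes"
# A large gap means the server (or a prerender layer) returns different
# documents depending on the User-Agent; either way, worth investigating.
```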
Real example: 253 words vs 13,547
We see this constantly. Here's a real example from production: Googlebot saw 253 words and 2 KB of HTML. A browser saw 13,547 words and 77.5 KB. Same URL — completely different content.

If your HTML doesn't contain the content, Google doesn't either.
Compare Googlebot vs browser on your site → HTTP Debug Tool
Check for common failure signals
We see this all the time in production:
- HTML under ~1KB → usually empty shell
- Visible text under ~200 characters → thin or missing content
- Missing <title> or <h1> → weak or broken page
- Large difference between bot vs browser HTML → rendering issue
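A quick script version of those signals, using the thresholds above. Treat it as a smoke test, not a crawler simulation; the URL is a placeholder and the tag stripping is approximate:

```bash
#!/usr/bin/env bash
# Rough smoke test for the failure signals listed above.
URL="${1:-https://yourdomain.com}"

HTML=$(curl -sA "Googlebot" "$URL")
SIZE=${#HTML}
TEXT_CHARS=$(printf '%s' "$HTML" | sed 's/<[^>]*>//g' | tr -d '[:space:]' | wc -c | tr -d ' ')

[ "$SIZE" -lt 1024 ]      && echo "WARN: HTML under ~1 KB (likely empty shell)"
[ "$TEXT_CHARS" -lt 200 ] && echo "WARN: visible text under ~200 characters"
echo "$HTML" | grep -qi '<title' || echo "WARN: missing <title>"
echo "$HTML" | grep -qi '<h1'    || echo "WARN: missing <h1>"
```

Wire a check like this into CI and it catches an empty shell before anyone notices the traffic drop.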
Use the DataJelly Visibility Test (Recommended)
You can run this without touching curl. It shows you:
- Raw HTML returned to bots (Googlebot, Bing, GPTBot, etc.)
- Fully rendered browser version
- Side-by-side differences in word count, HTML size, links, and content
What this test tells you (no guessing)
After running this, you'll know:
- Whether your HTML is actually indexable
- Whether bots are seeing partial content
- Whether rendering is breaking in production
This is the difference between "I think SEO is set up" and "I know what Google is indexing."
If you don't understand why this happens, read: Why Google Can't See Your SPA
If this test fails
You have three real options:
- SSR: works if you can keep it stable in production
- Prerendering: breaks with dynamic content and scale
- Edge Rendering: reflects real production output without app changes
If you do nothing, you will not rank consistently. Learn how Edge Rendering works →
This issue doesn't show up in Lighthouse. It shows up in rankings.
Practical Checklist (Fast Testing Workflow)
The whole workflow takes under 5 minutes per page. Run it after every deploy.
1. Fetch raw HTML as an AI bot
curl -H "User-Agent: GPTBot" https://yourdomain.comCheck immediately:
- HTML size (target: > 30 KB)
- Real paragraph text (not "Loading…" or empty divs)
- Headings, links, structured data
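For the structural part, grep gives a serviceable first pass. A sketch using crude string matches rather than a real DOM parse (counts can be slightly off on minified markup; the URL is a placeholder):

```bash
URL="https://yourdomain.com"
HTML=$(curl -sA "GPTBot" "$URL")

echo "h1/h2 tags:     $(echo "$HTML" | grep -oiE '<h[12][ >]' | wc -l | tr -d ' ')"
echo "links:          $(echo "$HTML" | grep -oi '<a ' | wc -l | tr -d ' ')"
echo "JSON-LD blocks: $(echo "$HTML" | grep -oi 'application/ld+json' | wc -l | tr -d ' ')"
```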
2. Compare to a browser response
curl -H "User-Agent: Mozilla/5.0" https://yourdomain.comIf the two responses are wildly different in size or word count, your bot version is incomplete. Run a side-by-side at /seo-tools/http-debug.
3. Inspect the actual HTML shape
A failing SPA looks like this:
```html
<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Acme — modern SaaS</title>
  </head>
  <body>
    <div id="root"></div>
    <script type="module" src="/assets/index-a3f7.js"></script>
    <script src="https://cdn.example.com/analytics.js"></script>
    <!-- ...4 more script tags... -->
  </body>
</html>
```
If your response looks like that, AI sees nothing. Check for a real <h1>, body text, and meta tags.
4. Disable JavaScript in your browser
DevTools → Command Palette → "Disable JavaScript" → reload. If the page goes blank or shows a spinner, your AI visibility is broken. Bots experience that exact view.
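If you prefer the terminal, a text-mode browser gives roughly the same no-JavaScript view, since it never executes scripts (assuming lynx is installed; any non-rendering text browser works):

```bash
# lynx does not run JavaScript, so this approximates what a non-rendering crawler extracts
lynx -dump https://yourdomain.com | head -40
```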
5. Look for script-heavy responses
Open the response. If you see multiple large script tags and no inline content, you have a script shell page. Run Page Validator to score bot-readiness automatically.
6. Track HTML size and text length over time
Diff between deploys:
- HTML size drop > 50% → broken
- Visible text drop > 40% → major issue
- Headings count drops to 0 → critical
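A minimal version of that diff, assuming a hypothetical baseline file written on each deploy; the 50% and 40% thresholds mirror the list above:

```bash
#!/usr/bin/env bash
# Compare HTML size and rough word count against the previous deploy's baseline.
URL="${1:-https://yourdomain.com}"
BASELINE=".ai-visibility-baseline"   # hypothetical file name

SIZE=$(curl -sA "GPTBot" "$URL" | wc -c | tr -d ' ')
WORDS=$(curl -sA "GPTBot" "$URL" | sed 's/<[^>]*>/ /g' | wc -w | tr -d ' ')

if [ -f "$BASELINE" ]; then
  read -r OLD_SIZE OLD_WORDS < "$BASELINE"
  [ "$SIZE"  -lt $(( OLD_SIZE / 2 ))         ] && echo "BROKEN: HTML size dropped more than 50%"
  [ "$WORDS" -lt $(( OLD_WORDS * 60 / 100 )) ] && echo "MAJOR: visible text dropped more than 40%"
fi

printf '%s %s\n' "$SIZE" "$WORDS" > "$BASELINE"
```

Run it from CI after each deploy and the baseline rolls forward automatically.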
7. Repeat after every deploy
This breaks in production when bundles fail, configs drift, or a CDN blocks an asset. If you're not checking after deploys, you're blind. Automate it with Guard.
[Screenshot placeholder: side-by-side terminal output of curl -H "User-Agent: GPTBot" vs browser fetch on the same URL]
Real Thresholds (Not Theoretical)
These map directly to production failures we see every week:
| Metric | Healthy | At-risk | Broken |
|---|---|---|---|
| HTML size | 30–200 KB | 10–30 KB | < 10 KB |
| Visible text | > 500 words | 200–500 words | < 200 chars |
| Headings (incl. H1) | ≥ 3 | 1–2 | 0 |
| Text drop vs baseline | < 10% | 10–40% | > 40% |
| HTML drop vs baseline | < 20% | 20–50% | > 50% |
Prerender vs SSR vs Edge
If your test fails, you have three real options. Most teams pick the wrong one.
Prerender
Works when: pages are static and rarely change.
Breaks when: content updates frequently or invalidation fails. We see snapshots showing two-month-old pricing. See Hidden Costs of Prerendering.
SSR (Next.js)
Works: HTML contains full content.
Costs: full app rewrite, server complexity, slower TTFB. Often a 6–12 week project.
Edge proxy (DataJelly)
Behavior: bots get fully-rendered HTML snapshots; AI crawlers get clean Markdown.
Result: 80–200 KB HTML, full content, no rewrite. Works with React, Vite, Lovable.
Full breakdown: Prerender vs SSR vs Edge Rendering.
If your content is not in the initial HTML, it does not exist for AI.
Not after hydration. Not after API calls. Only what's in the first response counts. Most modern SPAs fail this test by default.
The DataJelly Approach
DataJelly fixes this without touching your app. Edge serves fully-rendered HTML snapshots to bots and clean AI Markdown to GPTBot/ClaudeBot/PerplexityBot. Guard monitors the real signals — HTML size, visible text, DOM changes, rendering failures — so when a deploy breaks your visibility, you know in minutes, not weeks.
- Edge proxy delivers 80–200 KB rendered HTML to bots
- AI Markdown for GPTBot, ClaudeBot, PerplexityBot
- Guard tracks size/text drops across deploys
- Works with React, Vite, and Lovable SPAs — no rewrites