[Crawl-Date: 2026-04-24]
[Source: DataJelly Visibility Layer]
[URL: https://datajelly.com/blog/test-site-ai-visibility-fast]
---
title: How to Test Your Site for AI Visibility (Fast) | DataJelly
description: AI crawlers don't run JavaScript. If your raw HTML is empty, AI visibility is zero — no partial credit. A fast, repeatable workflow to test your site.
url: https://datajelly.com/blog/test-site-ai-visibility-fast
canonical: https://datajelly.com/blog/test-site-ai-visibility-fast
og_title: DataJelly - The Visibility Layer for Modern Apps
og_description: Rich social previews for Slack &amp; Twitter. AI-readable content for ChatGPT &amp; Perplexity. Zero-code setup.
og_image: https://datajelly.com/datajelly-og-image.png
twitter_card: summary_large_image
twitter_image: https://datajelly.com/datajelly-og-image.png
---

# How to Test Your Site for AI Visibility (Fast)
> AI crawlers don't run JavaScript. If your raw HTML is empty, AI visibility is zero — no partial credit. A fast, repeatable workflow to test your site.

---

## The Real Problem

If your initial HTML is empty, your AI visibility is zero. **No partial credit.** Not "mostly indexed." Not "partially extracted." Zero.

The trap is that everything looks fine. Status code 200. Lighthouse green. The page renders perfectly in your browser. Search Console eventually shows the URL as "Discovered." But ChatGPT, Claude, and Perplexity never quote you. Why? Because they read a different document than you do.

A real failing page we audited last week:

- HTML size: **4.2 KB**
- Visible text: **12 words**
- DOM: 1 empty `<div id="root">` + 6 script tags
- Browser-rendered version: **13,547 words, 77.5 KB**

Same URL. Two completely different documents. AI only ever sees the first one.

## How AI Crawlers Actually Work

AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, and the rest) do exactly three things, reliably:

1. Request the URL
2. Read the raw HTML response
3. Extract text, links, and structure

They do **not** wait for:

- React hydration
- API calls or async data fetches
- Client-side routing or rendering
- Lazy-loaded components
- Retries on failure

So your site has two versions. The browser version is 300 KB+ of fully-painted DOM after JavaScript runs. The initial HTML is 4–10 KB, an empty root div, no meaningful text. **AI systems only see the second one.**
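To make the gap concrete, here's a rough sketch of the extraction step, assuming a crude sed-based tag stripper (real extractors are more sophisticated, but a zero-content shell fails either way; the sample HTML below is hypothetical):

```shell
# Approximate what a crawler extracts: drop script/style blocks,
# replace the remaining tags with spaces, then count the words left over.
extract_text() {
  sed -e 's/<script[^>]*>[^<]*<\/script>//g' \
      -e 's/<style[^>]*>[^<]*<\/style>//g' \
      -e 's/<[^>]*>/ /g' | tr -s '[:space:]' ' '
}

# A typical script-shell response: one root div, zero visible text.
shell_page='<!doctype html><html><body><div id="root"></div><script src="/assets/index.js"></script></body></html>'

printf '%s' "$shell_page" | extract_text | wc -w   # word count: 0
```

Run the same pipeline against a content page and the word count jumps into the hundreds; that difference is the entire AI-visibility story.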

For deeper context on why this happens, see [How AI Crawlers Read Your Website](https://datajelly.com/blog/how-ai-crawlers-read-your-website) and [What AI Crawlers Actually Extract](https://datajelly.com/blog/ai-crawlers-extract).

## What Most Guides Get Wrong

Most SEO advice is written for Googlebot circa 2020. That advice is misleading for AI:

"Google can render JS, so SPAs are fine."

Even Google's rendering is delayed and skipped (see Why "Google Renders JavaScript" Is Misleading). AI crawlers don't render at all.

"Client-side rendering is fine."

Fine for users. Invisible to AI. CSR pages with no SSR fallback are zero-citation by default.

"Just improve content quality."

Content quality doesn't matter if the HTML response doesn't contain the content. You can't optimize what isn't there.

**If the HTML doesn't contain content, AI doesn't see content.** That's the whole rule.

## What We See in Production

Four repeatable failure modes. We see all four every week across React, Vite, and Lovable apps.

### 1. Script shell pages (most common)

**Signals:** HTML < 10 KB, visible text < 50 chars, ~80% of the DOM is script tags.

**Outcome:** Zero AI extraction. UI renders fine in the browser, `curl` returns only scripts. Covered in detail in [Script Shell Pages](https://datajelly.com/blog/script-shell-pages) and [Your HTML Is Only 4KB](https://datajelly.com/blog/html-only-4kb).

### 2. Partial deploy failures

**Signals:** Bundle 404s or a CDN blocks the JS chunk. HTML size unchanged (~5 KB). Console shows a bundle error. Page is visually blank.

**Outcome:** Users see a broken UI, AI sees a blank page, status code is still 200. See [Why Your Site Randomly Breaks After Deploy](https://datajelly.com/blog/site-breaks-after-deploy-silent).

### 3. Hydration-only content

**Signals:** HTML contains layout/nav only — no paragraph text, no headings. All content loads via API after mount.

**Outcome:** AI gets structure with no content, ignores the page. See [Hydration Crashes](https://datajelly.com/blog/hydration-crashes-silent-killer).

### 4. Silent regressions after deploy

**Signals:** A page used to ship 2,500 words. After a deploy, HTML drops 120 KB → 8 KB. Visible text drops 90%.

**Outcome:** AI visibility disappears overnight. No alert fires unless you track it. Guard tracks this as *major text drop (>40%)* and *major DOM drop (>50%)*.

## Quick Test: What Do Bots Actually See?

Most people guess. Don't. Run this test (it takes about 30 seconds) and look at the actual response your site returns to bots.

### 1. Fetch your page as Googlebot

Use your terminal:

`curl -A "Googlebot" https://yourdomain.com`

Look for:

- Real visible text (not just `<div id="root">`)
- Meaningful content in the HTML
- Page size (should not be tiny)
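Those checks can be scripted. A minimal sketch, assuming you first save the bot response to a file (`bot.html` is a placeholder name):

```shell
# Summarize a saved response: raw byte size plus visible word count.
# Tag stripping via sed is crude, but it's enough to spot an empty shell.
summarize_html() {
  f="$1"
  bytes=$(wc -c < "$f" | tr -d ' ')
  words=$(sed 's/<[^>]*>/ /g' "$f" | wc -w | tr -d ' ')
  echo "bytes=$bytes words=$words"
}

# Usage (after: curl -A "Googlebot" https://yourdomain.com -o bot.html):
#   summarize_html bot.html
```

A script-shell page typically reports single-digit words; a healthy content page reports hundreds.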

### 2. Compare bot vs browser

Now test what a real browser gets:

`curl -A "Mozilla/5.0" https://yourdomain.com`

If these responses are different, Google is indexing a different page than your users see.

Stop guessing — measure it.
### Real example: 253 words vs 13,547

We see this constantly. Here's a real example from production: Googlebot saw 253 words and 2 KB of HTML. A browser saw 13,547 words and 77.5 KB. Same URL — completely different content.
![Bot vs browser comparison showing 253 words for Googlebot vs 13,547 words for a rendered browser on the same URL](https://datajelly.com/assets/bot-comparison-proof-BSBvKXDf.png)
If your HTML doesn't contain the content, Google doesn't either.
[Compare Googlebot vs browser on your site → HTTP Debug Tool](https://datajelly.com/seo-tools/http-debug)

### 3. Check for common failure signals

We see this all the time in production:

- HTML under ~1 KB → usually an empty shell
- Visible text under ~200 characters → thin or missing content
- Missing `<title>` or `<h1>` → weak or broken page
- Large difference between bot and browser HTML → rendering issue
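Assuming the bot response is saved to a file, all four signals can be checked mechanically. A sketch with thresholds mirroring the list above:

```shell
# Print a warning for each failure signal found in a saved HTML response.
check_signals() {
  f="$1"
  bytes=$(wc -c < "$f" | tr -d ' ')
  # Visible characters: strip tags, drop whitespace, count what's left.
  chars=$(sed 's/<[^>]*>//g' "$f" | tr -d '[:space:]' | wc -c | tr -d ' ')
  [ "$bytes" -lt 1024 ] && echo "WARN: HTML under ~1KB (empty shell?)"
  [ "$chars" -lt 200 ]  && echo "WARN: visible text under ~200 characters"
  grep -qi '<title' "$f" || echo "WARN: missing <title>"
  grep -qi '<h1'    "$f" || echo "WARN: missing <h1>"
  return 0
}
```

Run it against the bot response and the browser response separately; a script-shell page typically trips all four warnings at once.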
### Use the DataJelly Visibility Test (Recommended)

You can run this without touching curl. It shows you:

- Raw HTML returned to bots (Googlebot, Bing, GPTBot, etc.)
- Fully rendered browser version
- Side-by-side differences in word count, HTML size, links, and content

[Run Visibility Test — Free](https://datajelly.com/#visibility-test)
### What this test tells you (no guessing)

After running this, you'll know:

- Whether your HTML is actually indexable
- Whether bots are seeing partial content
- Whether rendering is breaking in production

This is the difference between *"I think SEO is set up"* and **"I know what Google is indexing."**

If you don't understand why this happens, read: [Why Google Can't See Your SPA](https://datajelly.com/blog/why-google-cant-see-your-spa)
### If this test fails

You have three real options:

- **SSR:** works if you can keep it stable in production.
- **Prerendering:** breaks with dynamic content and scale.
- **Edge Rendering:** reflects real production output without app changes.

If you do nothing, you will not rank consistently. [Learn how Edge Rendering works →](https://datajelly.com/products/edge)

This issue doesn't show up in Lighthouse. It shows up in rankings.

[Run the Test](https://datajelly.com/#visibility-test) [Ask a Question](https://datajelly.com/contact)

## Practical Checklist (Fast Testing Workflow)

The whole workflow takes under 5 minutes per page. Run it after every deploy.
### 1. Fetch raw HTML as an AI bot

`curl -H "User-Agent: GPTBot" https://yourdomain.com`

Check immediately:

- HTML size (target: > 30 KB)
- Real paragraph text (not "Loading…" or empty divs)
- Headings, links, structured data
### 2. Compare to a browser response

`curl -H "User-Agent: Mozilla/5.0" https://yourdomain.com`

If the two responses are wildly different in size or word count, your bot version is incomplete. Run a side-by-side at [/seo-tools/http-debug](https://datajelly.com/seo-tools/http-debug).
### 3. Inspect the actual HTML shape

A failing SPA looks like this:

```html
<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Acme — modern SaaS</title>
  </head>
  <body>
    <div id="root"></div>
    <script type="module" src="/assets/index-a3f7.js"></script>
    <script src="https://cdn.example.com/analytics.js"></script>
    <!-- ...4 more script tags... -->
  </body>
</html>
```

If your response looks like that, AI sees nothing. Check for a real `<h1>`, body text, and meta tags.
### 4. Disable JavaScript in your browser

DevTools → Command Palette → "Disable JavaScript" → reload. If the page goes blank or shows a spinner, your AI visibility is broken. Bots experience that exact view.
### 5. Look for script-heavy responses

Open the response. If you see multiple large script tags and no inline content, you have a script shell page. Run [Page Validator](https://datajelly.com/seo-tools/page-validator) to score bot-readiness automatically.
### 6. Track HTML size and text length over time

Diff between deploys:

- HTML size drop > 50% → broken
- Visible text drop > 40% → major issue
- Heading count drops to 0 → critical
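The drop thresholds are plain percentage arithmetic; a sketch (the 120 KB to 8 KB regression from earlier works out to a 93% drop):

```shell
# Percent decrease between a baseline measurement and the current one.
pct_drop() {
  old=$1; new=$2
  echo $(( (old - new) * 100 / old ))
}

html_drop=$(pct_drop 122880 8192)   # 120 KB baseline vs 8 KB current: 93
[ "$html_drop" -gt 50 ] && echo "BROKEN: HTML size drop > 50%"
```

Feed it the byte and word counts you already collect per deploy and alert on the 40%/50% lines.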
### 7. Repeat after every deploy

This breaks in production when bundles fail, configs drift, or a CDN blocks an asset. If you're not checking after deploys, you're blind. Automate it with [Guard](https://datajelly.com/products/guard).

[Screenshot placeholder: side-by-side terminal output of `curl -H "User-Agent: GPTBot"` vs browser fetch on the same URL]

## Real Thresholds (Not Theoretical)

These map directly to production failures we see every week:

| Metric | Healthy | At-risk | Broken |
| --- | --- | --- | --- |
| HTML size | 30–200 KB | 10–30 KB | < 10 KB |
| Visible text | > 500 words | 200–500 words | < 200 chars |
| Headings (incl. H1) | ≥ 3 | 1–2 | 0 |
| Text drop vs baseline | < 10% | 10–40% | > 40% |
| HTML drop vs baseline | < 20% | 20–50% | > 50% |
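The HTML-size row of the table maps to a one-line classifier. A sketch (bytes in, label out; sizes above 200 KB are lumped into "healthy" here for simplicity):

```shell
# Classify raw HTML byte size against the thresholds in the table above.
classify_html_size() {
  kb=$(( $1 / 1024 ))
  if   [ "$kb" -ge 30 ]; then echo "healthy"
  elif [ "$kb" -ge 10 ]; then echo "at-risk"
  else                        echo "broken"
  fi
}

classify_html_size 4096    # a 4 KB shell: broken
```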
## Prerender vs SSR vs Edge

If your test fails, you have three real options. Most teams pick the wrong one.
### Prerender

**Works when:** pages are static and rarely change.

**Breaks when:** content updates frequently or invalidation fails. We see snapshots showing two-month-old pricing. See [Hidden Costs of Prerendering](https://datajelly.com/blog/hidden-costs-of-prerendering) .
### SSR (Next.js)

**Works:** HTML contains full content.

**Costs:** full app rewrite, server complexity, slower TTFB. Often a 6–12 week project.
### Edge proxy (DataJelly)

**Behavior:** bots get fully-rendered HTML snapshots; AI crawlers get clean Markdown.

**Result:** 80–200 KB HTML, full content, no rewrite. Works with React, Vite, Lovable.

Full breakdown: [Prerender vs SSR vs Edge Rendering](https://datajelly.com/blog/prerender-vs-ssr-vs-edge-rendering) .

If your content is not in the initial HTML, it does not exist for AI.

Not after hydration. Not after API calls. Only what's in the first response counts. Most modern SPAs fail this test by default.
## The DataJelly Approach

DataJelly fixes this without touching your app. [Edge](https://datajelly.com/products/edge) serves fully-rendered HTML snapshots to bots and clean AI Markdown to GPTBot/ClaudeBot/PerplexityBot. [Guard](https://datajelly.com/products/guard) monitors the real signals — HTML size, visible text, DOM changes, rendering failures — so when a deploy breaks your visibility, you know in minutes, not weeks.

- Edge proxy delivers 80–200 KB rendered HTML to bots
- AI Markdown for GPTBot, ClaudeBot, PerplexityBot
- Guard tracks size/text drops across deploys
- Works with React, Vite, and Lovable SPAs — no rewrites

[Run the Visibility Test — Free](https://datajelly.com/visibility-test) [Start 14-Day Free Trial](https://app.datajelly.com/signup) [Ask a Question](https://datajelly.com/contact)
## Related Diagnostic Tools

- [Visibility Test](https://datajelly.com/visibility-test): compare bot vs browser HTML side-by-side
- [Page Validator](https://datajelly.com/seo-tools/page-validator): check bot-readiness and HTML completeness
- [HTTP Bot Comparison](https://datajelly.com/seo-tools/http-debug): compare GPTBot vs browser responses
- [Site Crawler](https://datajelly.com/seo-tools/site-crawler): audit HTML quality across all routes

## FAQ
### How do I know if AI crawlers can see my content?

Run `curl` with a bot user agent (e.g. GPTBot or ClaudeBot). If the raw HTML response doesn't contain real paragraph text, headings, and links, AI systems can't see your content. The browser-rendered version doesn't matter; only the initial HTML response does.

### What HTML size indicates a problem?

Pages under ~10 KB are almost always broken for AI visibility. Healthy content pages are typically 30 KB to 200 KB. Failing SPAs often ship 4–6 KB shells with one root div and 15+ script tags; those are invisible to GPTBot, ClaudeBot, and PerplexityBot.

### Do AI crawlers execute JavaScript?

Not reliably. Assume they do not. GPTBot, ClaudeBot, and PerplexityBot read the raw HTML response and stop. They don't wait for hydration, don't fire API calls, and don't retry on failure. If your content isn't in the first response, it doesn't exist for them.

### What's the clearest failure signal?

Low visible text (under ~200 characters) combined with script-heavy HTML. If 80%+ of the response body is script tags and the visible text is under 50 words, the page is a script shell, invisible to AI.

### Why does my site look fine but still fail?

Browsers execute JavaScript fully, so users see the rendered UI. AI crawlers stop at the raw HTML response. Same URL, two completely different documents, and AI only ever sees the empty one.

### Can I fix this without rewriting my app?

Yes. Use an edge proxy that detects bots and serves fully rendered HTML snapshots to them while regular users keep hitting your live SPA. No SSR migration, no framework rewrite; it works with React, Vite, and Lovable apps as-is.

### How do I detect regressions over time?

Track HTML size and visible text length per URL across deploys. A 40%+ drop in text or a 50%+ drop in HTML size is a hard failure signal. Most teams find out about regressions weeks later through traffic loss; monitoring catches them in minutes.
## Related Reading

[Why "Google Renders JavaScript" Is Misleading
Google's rendering is delayed, deprioritized, and often skipped. AI doesn't render at all.](https://datajelly.com/blog/google-renders-javascript-misleading) [What AI Crawlers Actually Extract
GPTBot, ClaudeBot, PerplexityBot don't render JS. Here's what they pull from your HTML.](https://datajelly.com/blog/ai-crawlers-extract) [Script Shell Pages
When your HTML is one div and 15 script tags, this is what bots see.](https://datajelly.com/blog/script-shell-pages) [Your HTML Is Only 4KB
Why a 4KB shell isn't SEO-compatible — and what a healthy response looks like.](https://datajelly.com/blog/html-only-4kb) [Hydration Crashes: The Silent Killer
Why pages render in your browser but ship blank HTML to bots.](https://datajelly.com/blog/hydration-crashes-silent-killer) [Why Your Site Randomly Breaks After Deploy
Bundle 404s and config drift produce 200 OKs with empty content.](https://datajelly.com/blog/site-breaks-after-deploy-silent) [Hidden Costs of Prerendering
Stale snapshots and broken invalidation. Why prerendering trades one problem for another.](https://datajelly.com/blog/hidden-costs-of-prerendering) [Prerender vs SSR vs Edge Rendering
What actually works for SEO and AI with real production data.](https://datajelly.com/blog/prerender-vs-ssr-vs-edge-rendering)

## Structured Data (JSON-LD)
```json
{"@context":"https://schema.org","@type":"FAQPage","mainEntity":[{"@type":"Question","name":"How do I know if AI crawlers can see my content?","acceptedAnswer":{"@type":"Answer","text":"Run curl with a bot user agent (e.g. GPTBot or ClaudeBot). If the raw HTML response doesn\u0027t contain real paragraph text, headings, and links, AI systems can\u0027t see your content. The browser-rendered version doesn\u0027t matter \u2014 only the initial HTML response does."}},{"@type":"Question","name":"What HTML size indicates a problem?","acceptedAnswer":{"@type":"Answer","text":"Pages under ~10 KB are almost always broken for AI visibility. Healthy content pages are typically 30 KB to 200 KB. We see failing SPAs that ship 4\u20136 KB shells with one root div and 15\u002B script tags \u2014 those are invisible to GPTBot, ClaudeBot, and PerplexityBot."}},{"@type":"Question","name":"Do AI crawlers execute JavaScript?","acceptedAnswer":{"@type":"Answer","text":"Not reliably. Assume they do not. GPTBot, ClaudeBot, and PerplexityBot read the raw HTML response and stop. They don\u0027t wait for hydration, don\u0027t fire API calls, and don\u0027t retry on failure. If your content isn\u0027t in the first response, it doesn\u0027t exist for them."}},{"@type":"Question","name":"What\u0027s the clearest failure signal?","acceptedAnswer":{"@type":"Answer","text":"Low visible text (under ~200 characters) combined with script-heavy HTML. If 80%\u002B of the response body is script tags and the visible text is under 50 words, the page is a script shell \u2014 invisible to AI."}},{"@type":"Question","name":"Why does my site look fine but still fail?","acceptedAnswer":{"@type":"Answer","text":"Browsers execute JavaScript fully and synchronously, so users see the rendered UI. AI crawlers stop at the raw HTML response. 
Same URL, two completely different documents \u2014 and AI only ever sees the empty one."}},{"@type":"Question","name":"Can I fix this without rewriting my app?","acceptedAnswer":{"@type":"Answer","text":"Yes. Use an edge proxy that detects bots and serves fully rendered HTML snapshots to them while regular users continue hitting your live SPA. No SSR migration, no framework rewrite \u2014 works with React, Vite, and Lovable apps as-is."}},{"@type":"Question","name":"How do I detect regressions over time?","acceptedAnswer":{"@type":"Answer","text":"Track HTML size and visible text length per URL across deploys. A 40%\u002B drop in text or 50%\u002B drop in HTML size is a hard failure signal. Most teams find out about regressions weeks later through traffic loss; monitoring catches them in minutes."}}]}
```


