April 2026

Page Crawled But Not Indexed: The Real Reasons (Not What Google Tells You)

"Crawled – currently not indexed" means Google fetched your page and rejected it. In most cases, it has nothing to do with "content quality."

We see this constantly:

2–5 KB HTML response
<100 words of visible text
Empty <div id="root"> + scripts

Result: Google crawls it, never indexes it, and moves on.

This isn't a content quality problem. It's a delivery failure.

The Real Problem

"Crawled – currently not indexed" means Google fetched your page and decided it wasn't worth indexing. The Google Search Console message is vague by design — it tells you almost nothing about why.

We see this constantly on React, Vite, and Lovable builds. The pattern is always the same:

Browser view: 3,000–8,000 words, full UI, interactive components
Raw HTML response: 2–5 KB, mostly <script> tags, no usable content

If your <body> doesn't contain real text on first response, Google has nothing to evaluate. It crawls the page, sees an empty shell, and drops it.

What's Actually Happening

Google does not fully render every page. It decides whether to invest in rendering based on the initial HTML response. This is the part most people miss.

Google's actual flow:

1. Fetch HTML
2. Evaluate HTML (size, structure, text content)
3. Decide whether to render JS (this is the cutoff)
4. Attempt render (best effort, not guaranteed)
5. Decide to index or drop

The real cutoff happens at steps 2–3. If your HTML looks like this:

<10KB total size
<200–300 words of visible text
Mostly <script> tags
No meaningful headings

Google often never renders the page at all. And even when it does attempt rendering:

JS errors break hydration silently
Slow bundles cause timeouts
Partial DOM = partial content = dropped page

That's enough for Google to mark the page as "crawled – currently not indexed" forever.

Concrete Failure Signals

These are the signals we see consistently across real sites. They're not theoretical — they're exactly what Guard flags during automated audits:

Signal	Threshold	Risk
HTML size	<10KB	High
Visible text	<200 chars	High
Script ratio	>70% scripts	High
Missing <title> or empty <h1>	Any	Medium
Resource errors	3+ failed JS/CSS	High
Render time	>5 seconds	Medium

Guard flags these as specific failure types: script_shell_only (blank page, no visible text), critical_bundle_failure (JS didn't load), and low_text_density (HTML exists but content is missing).
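If you want to check the raw-HTML signals from the table yourself, here is a minimal sketch using only Python's standard library. It is not Guard's implementation. The flag names are made up for illustration, 1 KB is assumed to be 1,024 bytes, and "script ratio" is assumed to mean the share of HTML characters spent on <script> tags and inline JS. Resource errors and render time need a real fetch or a browser, so they are left out.

```python
from html.parser import HTMLParser

TRACKED = ("script", "style", "title", "h1")


class SignalParser(HTMLParser):
    """Collects text and script signals from one raw HTML response."""

    def __init__(self):
        super().__init__()
        self.depth = {tag: 0 for tag in TRACKED}
        self.script_chars = 0  # characters spent on <script> tags and inline JS
        self.visible = []      # text outside <script>, <style>, and <title>
        self.title = ""
        self.h1 = ""

    def handle_starttag(self, tag, attrs):
        if tag in self.depth:
            self.depth[tag] += 1
        if tag == "script":
            self.script_chars += len(self.get_starttag_text() or "")

    def handle_endtag(self, tag):
        if self.depth.get(tag, 0) > 0:
            self.depth[tag] -= 1
        if tag == "script":
            self.script_chars += len("</script>")

    def handle_data(self, data):
        if self.depth["script"]:
            self.script_chars += len(data)
        elif self.depth["style"]:
            return
        elif self.depth["title"]:
            self.title += data
        else:
            if self.depth["h1"]:
                self.h1 += data
            self.visible.append(data)


def audit(html):
    """Return hypothetical flag names for the thresholds in the table."""
    parser = SignalParser()
    parser.feed(html)
    parser.close()
    size_bytes = len(html.encode("utf-8"))
    visible_text = " ".join("".join(parser.visible).split())
    flags = []
    if size_bytes < 10 * 1024:                              # HTML size <10KB
        flags.append("html_size")
    if len(visible_text) < 200:                             # visible text <200 chars
        flags.append("visible_text")
    if html and parser.script_chars / len(html) > 0.70:     # >70% scripts
        flags.append("script_ratio")
    if not parser.title.strip() or not parser.h1.strip():   # missing <title> / empty <h1>
        flags.append("missing_title_or_h1")
    return flags


SHELL = ('<html><head><title>My App</title></head><body>'
         '<div id="root"></div><script src="/main.js"></script></body></html>')
print(audit(SHELL))  # ['html_size', 'visible_text', 'missing_title_or_h1']
```

Feed it the body of a plain HTTP fetch, not a DOM dump from your browser's dev tools. The point is to score what arrives before any JavaScript runs.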

What Most Guides Get Wrong

Most SEO advice about "crawled but not indexed" assumes Google saw your content and decided it wasn't good enough. That's almost never the issue.

Bad advice you'll see everywhere:

"Improve your content quality"
"Add more keywords"
"Build more backlinks"
"Request re-indexing"

None of that matters if Google never saw the content. You can't optimize content that doesn't exist in the HTML response.

This is the same fundamental problem we cover in React SEO Is Broken by Default — the HTML is the problem, not the content strategy.

What We See in Production

These are not rare edge cases. These are standard failures we diagnose every week.

Empty HTML shell

HTML: 3.2KB. Body: empty root div. JS required for all content.

<html>
  <head><title>My App</title></head>
  <body>
    <div id="root"></div>
    <script src="/main.js"></script>
  </body>
</html>

Result: never indexed. Google never renders it. This happens constantly on Lovable, Vite, and static SPA builds.

JS bundle failure

HTML loads fine. /main.js returns 404 or fails to execute. React never mounts.

Result: no headings, no text, page dropped. Guard flags this as critical_bundle_failure.

Your browser retries and recovers. Googlebot doesn't.
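A rough way to catch this before a crawler does: pull every <script src> out of the raw HTML and confirm each one actually resolves. The sketch below is one possible approach, not how Guard detects critical_bundle_failure. The function names are hypothetical, and it assumes your server answers HEAD requests for static assets. It only detects bundles that fail to load; a bundle that loads but throws at runtime needs a real browser to catch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import urllib.error
import urllib.request


class ScriptSrcParser(HTMLParser):
    """Collects the src attribute of every external <script> tag."""

    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.srcs.append(src)


def http_status(url):
    """Return the HTTP status for url, or 0 if the request never completed."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code          # 404, 403, 500, ...
    except urllib.error.URLError:
        return 0                 # DNS failure, refused connection, timeout


def failed_bundles(page_url, html, get_status=http_status):
    """Return (absolute_url, status) for every script that does not load."""
    parser = ScriptSrcParser()
    parser.feed(html)
    failures = []
    for src in parser.srcs:
        absolute = urljoin(page_url, src)   # resolve /main.js against the page
        status = get_status(absolute)
        if not 200 <= status < 400:
            failures.append((absolute, status))
    return failures
```

The get_status parameter exists so you can swap in a fake during tests or plug in your own HTTP client. Run it from a network outside your own infrastructure, since a bundle that is reachable from the office can still be blocked at the CDN.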

Slow hydration timeout

TTFB: 2.5s. JS loads at ~4s. Content visible at ~6–8s.

Result: Google captures an incomplete DOM. The page might look partially rendered but has missing sections. Indexing is skipped or incomplete.

We see this a lot on sites with heavy API calls during mount — Google doesn't wait for your data fetching to finish.

Script-heavy HTML

HTML: 25KB (looks decent). Visible text: ~120 chars. 80% of the HTML is script tags.

Result: classified as low-value. Not indexed. The file size is fine — the content ratio is the problem.

Bot mismatch / bot blocking

Humans see the fully rendered page. Googlebot gets a fallback, an error state, or a 403 from Cloudflare/CDN bot protection.

Result: Google indexes the broken version (or nothing). You can check this with the HTTP Debug Tool — compare responses across different user agents.

Solutions Compared

We covered this in depth in Prerender vs SSR vs Edge Rendering, but here's the summary as it applies to indexing failures:

Prerender (build-time)

Works for small static sites. Breaks with dynamic routes.

Works for: <100 static routes
Breaks with: dynamic pages, route explosion

SSR (server-side)

Reliable when working. Expensive and complex.

Upside: full HTML every request
Risks: slow backend = partial render, hydration mismatches

Edge snapshot (runtime)

Bots get fully rendered HTML. Humans keep SPA.

No empty shells
No JS dependency
No app rewrite

The key difference: you stop relying on Google to render your app correctly. Instead, bots receive complete HTML snapshots, and AI crawlers receive clean Markdown.

Practical Checklist

Run these checks against your site right now. Most take under 60 seconds.

1. Check raw HTML size

Run:

curl -s your-url | wc -c

The number it prints is the size of the raw response in bytes.

>20KB = good
<10KB = problem
<5KB = guaranteed issue
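If you are checking many URLs, the thresholds above are easy to script. A minimal sketch, assuming 1 KB means 1,024 bytes; the "borderline" label for the 10–20KB gap is our own placeholder, since the checklist does not name that range.

```python
def classify_html_size(num_bytes):
    """Map a raw HTML byte count (e.g. from `wc -c`) to the checklist labels."""
    kb = num_bytes / 1024
    if kb < 5:
        return "guaranteed issue"
    if kb < 10:
        return "problem"
    if kb > 20:
        return "good"
    return "borderline"  # 10–20KB: not covered by the checklist, label assumed


print(classify_html_size(3277))  # a 3.2KB shell -> guaranteed issue
```

Size alone can mislead in both directions. A 25KB response that is mostly script still fails, so treat this as a first filter only.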

2. Measure visible text

Check how many words are in the raw <body> HTML (not after JS execution).

300–800+ words = safe
<200 words = likely invisible
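One way to get that word count without a browser is to parse the raw HTML and count only text inside <body> that is not inside <script> or <style>. This is a sketch with standard-library Python. The class and function names are ours, and the count is approximate: a word split across inline tags is counted twice, and CSS-hidden text is still counted.

```python
from html.parser import HTMLParser


class BodyTextParser(HTMLParser):
    """Collects text that sits inside <body> but outside <script>/<style>."""

    HIDDEN = ("script", "style")

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.hidden_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in self.HIDDEN:
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in self.HIDDEN and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if self.in_body and not self.hidden_depth:
            self.chunks.append(data)


def visible_word_count(html):
    """Count whitespace-separated words in the raw <body> text."""
    parser = BodyTextParser()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())


shell = '<html><body><div id="root"></div><script src="/main.js"></script></body></html>'
print(visible_word_count(shell))  # 0
```

Run it on the output of curl, never on HTML copied from the Elements panel, which already includes everything JavaScript added.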

3. Inspect DOM structure

Check for real content in the initial HTML:

Bad:

<div id="root"></div>
<script src="/main.js"></script>

Good:

<h1>Your Page Title</h1>
<p>Real content here...</p>

4. Check resource failures

Look for JS 404s, CSS failures, 3+ resource errors. Use the HTTP Debug Tool to catch these.

5. Compare bot vs browser response

Test with normal headers vs bot headers. If the HTML differs, you have a potential indexing risk. The Bot Test Tool does this comparison automatically.

6. Check render timing

Red flags: TTFB >2s, content >5s to appear. Google often stops before your content loads.

7. Watch HTML stability

If HTML changes drastically between crawls — DOM drops >40–50%, text disappears — Google treats the page as unreliable and won't index it consistently.
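To track this over time, store the raw HTML from each check and compare it with the previous snapshot. The sketch below counts elements and visible text characters in two snapshots and flags a drop above 40%, the low end of the range mentioned above. How Google actually measures stability is not public, so the metric and the names here are assumptions for illustration.

```python
from html.parser import HTMLParser


class SnapshotParser(HTMLParser):
    """Counts elements and visible text characters in one HTML snapshot."""

    def __init__(self):
        super().__init__()
        self.elements = 0
        self.text_chars = 0
        self._hidden = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        self.elements += 1
        if tag in ("script", "style"):
            self._hidden += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._hidden:
            self._hidden -= 1

    def handle_data(self, data):
        if not self._hidden:
            self.text_chars += len(data.strip())


def snapshot(html):
    """Return (element_count, visible_text_chars) for a raw HTML string."""
    parser = SnapshotParser()
    parser.feed(html)
    parser.close()
    return parser.elements, parser.text_chars


def drop_ratio(before, after):
    """Fraction lost between two measurements; growth counts as 0.0."""
    if before == 0:
        return 0.0
    return max(0.0, (before - after) / before)


def is_unstable(prev_html, curr_html, threshold=0.40):
    """True if elements or visible text dropped by more than the threshold."""
    prev_elements, prev_text = snapshot(prev_html)
    curr_elements, curr_text = snapshot(curr_html)
    return (drop_ratio(prev_elements, curr_elements) > threshold
            or drop_ratio(prev_text, curr_text) > threshold)
```

A typical trigger is a deploy that breaks prerendering for one route: yesterday's snapshot had full content, today's is the empty shell, and both ratios jump close to 1.0.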

The Fix

If you're seeing "crawled – currently not indexed" on an SPA or JavaScript-heavy site, you have two real choices:

Rewrite to SSR: high effort, high complexity. Most teams don't finish this.
Add a visibility layer: edge-based, no app changes. Fixes rendering at the source.

DataJelly does exactly this:

Edge proxy serves full HTML snapshots to search bots
Generates AI Markdown for AI crawlers
Leaves your React app completely unchanged
Fixes incomplete HTML at the source — not in your codebase

Quick Test: What Do Bots Actually See?

~30 seconds

Most people guess. Don't.

Run this test and look at the actual response your site returns to bots.

Fetch your page as Googlebot

Use your terminal:

curl -A "Googlebot" https://yourdomain.com

Look for:

Real visible text (not just <div id="root">)
Meaningful content in the HTML
Page size (should not be tiny)

Compare bot vs browser

Now test what a real browser gets:

curl -A "Mozilla/5.0" https://yourdomain.com

If these responses are different, Google is indexing a different page than your users see.

Stop guessing — measure it.
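If you would rather script the two curl calls and get numbers instead of eyeballing HTML, here is a minimal sketch. The function names are ours, the word count is a rough regex-based estimate, and the short user-agent strings simply mirror the curl examples above; real crawlers send longer ones.

```python
import re
import urllib.request


def fetch_html(url, user_agent):
    """Fetch url with the given User-Agent and return the decoded body.

    A 403 raises urllib.error.HTTPError, which is itself a bot-blocking signal.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=15) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def rough_word_count(html):
    """Approximate visible words: drop script/style blocks, then all tags."""
    without_code = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", without_code)
    return len(text.split())


def compare(bot_html, browser_html):
    """Summarize how the two raw responses differ."""
    return {
        "bot_bytes": len(bot_html.encode("utf-8")),
        "browser_bytes": len(browser_html.encode("utf-8")),
        "bot_words": rough_word_count(bot_html),
        "browser_words": rough_word_count(browser_html),
        "identical": bot_html == browser_html,
    }


def compare_url(url):
    """Network call: fetch the same URL with both user agents and compare."""
    return compare(fetch_html(url, "Googlebot"), fetch_html(url, "Mozilla/5.0"))
```

Calling compare_url("https://yourdomain.com") returns the byte and word counts for both responses. Keep in mind that neither curl nor this script executes JavaScript, so both sides are raw HTML. A difference here means your server or CDN is responding differently by user agent; to compare against the fully rendered page you need a headless browser or a tool that runs one.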

Real example: 253 words vs 13,547

We see this constantly. Here's a real example from production: Googlebot saw 253 words and 2 KB of HTML. A browser saw 13,547 words and 77.5 KB. Same URL — completely different content.

Figure: bot vs browser comparison showing 253 words for Googlebot vs 13,547 words for a rendered browser on the same URL.

If your HTML doesn't contain the content, Google doesn't either.

Compare Googlebot vs browser on your site → HTTP Debug Tool

Check for common failure signals

We see this all the time in production:

HTML under ~1KB → usually empty shell
Visible text under ~200 characters → thin or missing content
Missing <title> or <h1> → weak or broken page
Large difference between bot vs browser HTML → rendering issue

Use the DataJelly Visibility Test (Recommended)

You can run this without touching curl. It shows you:

Raw HTML returned to bots (Googlebot, Bing, GPTBot, etc.)
Fully rendered browser version
Side-by-side differences in word count, HTML size, links, and content

What this test tells you (no guessing)

After running this, you'll know:

Whether your HTML is actually indexable
Whether bots are seeing partial content
Whether rendering is breaking in production

This is the difference between "I think SEO is set up" and "I know what Google is indexing."

If you don't understand why this happens, read: Why Google Can't See Your SPA

If this test fails

You have three real options:

SSR: works if you can keep it stable in production
Prerendering: breaks with dynamic content and scale
Edge Rendering: reflects real production output without app changes

If you do nothing, you will not rank consistently. Learn how Edge Rendering works →

This issue doesn't show up in Lighthouse. It shows up in rankings.

Final Takeaway

"Crawled but not indexed" is a delivery failure. Not a content problem. Not a backlink problem.

If your HTML is too small, too empty, or too dependent on JavaScript — Google will crawl it forever and never index it.

Fix the HTML response first. Everything else is noise.

Frequently Asked Questions

On This Page

The Real Problem

What's Actually Happening

Concrete Failure Signals

What Most Guides Get Wrong

What We See in Production

Empty HTML shell

JS bundle failure

Slow hydration timeout

Script-heavy HTML

Bot mismatch / bot blocking

Solutions Compared

Prerender (build-time)

SSR (server-side)

Edge snapshot (runtime)

Practical Checklist

1. Check raw HTML size

2. Measure visible text

3. Inspect DOM structure

4. Check resource failures

5. Compare bot vs browser response

6. Check render timing

7. Watch HTML stability

The Fix

DataJelly does exactly this:

Quick Test: What Do Bots Actually See?

Fetch your page as Googlebot

Compare bot vs browser

Real example: 253 words vs 13,547

Check for common failure signals

Use the DataJelly Visibility Test (Recommended)

What this test tells you (no guessing)

If this test fails

Final Takeaway

Frequently Asked Questions

Why is my page crawled but not indexed?

What HTML size causes indexing problems?

Does Google always render JavaScript?

How do I verify what Google sees?

Can JS errors cause non-indexing?

Why does my site look fine but isn't indexed?

What's the fastest way to fix this?

Related Reading
