April 2026 · SEO · Sitemaps

Why Your Sitemap.xml Exists But Google Still Ignores Your Pages

Your sitemap is valid. Google found your URLs. Your pages still aren't indexed. This is a rendering problem, not a discovery problem — and we see it constantly.


The Real Problem

Your sitemap is valid. Google found your URLs. Your pages still aren't indexed.

This happens because Google crawls your URLs and gets unusable HTML back. We see this constantly:

Sitemap submitted with 100–500 URLs
Google crawls all of them
HTML response: 3–8 KB
Visible text: <50 words
Result: "Discovered – currently not indexed"

The sitemap worked. Your HTML didn't.

What's Actually Happening

Google does two completely separate things with your URLs. Most people conflate them. That's where the confusion starts.

Step 1: Discovery

Google reads your sitemap and queues URLs. This part works fine.

/
/pricing
/features
/docs/getting-started
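For reference, the routes above would appear in a sitemap.xml like this (the domain is a placeholder for your own):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yoursite.com/</loc></url>
  <url><loc>https://yoursite.com/pricing</loc></url>
  <url><loc>https://yoursite.com/features</loc></url>
  <url><loc>https://yoursite.com/docs/getting-started</loc></url>
</urlset>
```

A perfectly valid file like this is exactly the trap: it gets every URL discovered, and guarantees nothing about what those URLs return.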

Step 2: Rendering + Indexing

Google requests each URL and evaluates the raw HTML. This is where you fail.

HTML size: ~5 KB
Body: <div id="root"></div>
Content: loaded via JS
→ Page skipped

Google does not reliably execute your app like a browser. If the raw HTML doesn't contain real content, the page gets skipped. Your sitemap did its job — it pointed Google to the door. But when Google opened it, nobody was home.

What Most Guides Get Wrong

They treat indexing as a discovery problem. It's not.

Common advice that doesn't fix this:

"Submit your sitemap in Search Console"
"Request indexing for each URL"
"Wait a few days and check again"
"Make sure your sitemap is valid XML"

None of that fixes the actual problem. Here's a concrete failure we see all the time:

Real example: /pricing

/pricing loads perfectly in Chrome

Googlebot fetch returns 5 KB of HTML + script tags

Zero pricing content in the response body

→ Page never indexes. Ever.

That's not a sitemap issue. That's a rendering failure.

What We See in Production

These aren't hypothetical. We see these patterns across dozens of production sites every month.

1. "Discovered but not indexed" at scale

200 URLs in sitemap. 200 crawled. 0–10 indexed.

Root cause: HTML under 10 KB, text under 100 words per page.

2. "Indexed but no rankings"

Page appears in Search Console as indexed. Doesn't rank for anything.

Root cause: Title tag exists, but body content is missing at crawl time. Google indexed a shell.

3. SPA deep link failure

/features works via client-side navigation. Direct request returns 404.

Root cause: Server not configured to serve index.html for all routes. Google drops the URL entirely.
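The deep-link failure has a standard fix when nginx serves a static SPA build: a `try_files` fallback so unknown routes return `index.html` with a 200 instead of a 404. This is a minimal sketch, assuming an nginx setup; other servers (Apache, Caddy, hosting platforms) have equivalent rewrite rules.

```nginx
# Serve the file if it exists, otherwise fall back to the SPA shell.
# Direct requests to /features now return 200 instead of 404.
location / {
  try_files $uri $uri/ /index.html;
}
```

Note this only fixes the 404; the fallback still serves an empty shell, so the rendering problem described in this article remains.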

4. Partial hydration pages

<title> is present, but content is injected after page load.

Root cause: Google indexes the thin version — a page with a title and no body. No ranking potential.

5. Sitemap includes non-renderable routes

Your sitemap generator finds routes, but routes only exist after client navigation.

Root cause: Direct requests return empty shells. The routes are real in-browser but don't exist as standalone pages.

How to Verify This (Don't Guess)

Stop guessing whether Google can see your content. Run this in your terminal:

curl -A "Googlebot" https://yoursite.com/pricing

Look at the response and check:

HTML size < 10 KB → bad
No visible text in the body → bad
Only <script> tags → bad

Now compare with a regular request:

curl https://yoursite.com/pricing

If both responses are thin, Google sees nothing useful. Your sitemap is pointing to pages that return empty shells.
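The checks above can be scripted. This is a minimal sketch: it applies the same "bad" signals (small HTML, few visible words) to a saved response. The thresholds and the tag-stripping `sed` are rough heuristics, and the fixture file stands in for what `curl -A "Googlebot"` would return from your site.

```shell
#!/bin/sh
# Flag the thin-HTML signals from the checklist above.
check_html() {
  file="$1"
  bytes=$(wc -c < "$file")
  # Crudely drop inline scripts, then all tags, then count words.
  words=$(sed -e 's/<script[^>]*>.*<\/script>//g' -e 's/<[^>]*>//g' "$file" | wc -w)
  echo "HTML size:    $bytes bytes"
  echo "Visible text: $words words"
  if [ "$bytes" -lt 10240 ] || [ "$words" -lt 100 ]; then
    echo "VERDICT: thin -- likely skipped by Google"
  else
    echo "VERDICT: looks indexable"
  fi
}

# Fixture: the kind of SPA shell a failing /pricing often returns.
# Against a live site: curl -s -A "Googlebot" https://yoursite.com/pricing > /tmp/bot.html
cat > /tmp/bot.html <<'EOF'
<!doctype html><html><head><title>Pricing</title></head>
<body><div id="root"></div><script src="/app.js"></script></body></html>
EOF
check_html /tmp/bot.html
```

Run it against both the Googlebot response and the plain response; if both come back "thin", the problem is rendering, not discovery.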

Quick Test

Don't want to mess with curl? Use our diagnostic tools to see exactly what bots get from your pages:

Quick Test: What Do Bots Actually See?

~30 seconds

Most people guess. Don't.

Run this test and look at the actual response your site returns to bots.

Step 1: Fetch your page as Googlebot

Use your terminal:

curl -A "Googlebot" https://yourdomain.com

Look for:

  • Real visible text (not just <div id="root">)
  • Meaningful content in the HTML
  • Page size (should not be tiny)
Step 2: Compare bot vs browser

Now test what a real browser gets:

curl -A "Mozilla/5.0" https://yourdomain.com

If these responses are different, Google is indexing a different page than your users see.

Stop guessing — measure it.

Real example: 253 words vs 13,547

We see this constantly. Here's a real example from production: Googlebot saw 253 words and 2 KB of HTML. A browser saw 13,547 words and 77.5 KB. Same URL — completely different content.

[Image: bot vs browser comparison showing 253 words for Googlebot vs 13,547 words for a rendered browser on the same URL]

If your HTML doesn't contain the content, Google doesn't either.

Compare Googlebot vs browser on your site → HTTP Debug Tool
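A quick way to put numbers on that gap yourself: strip tags and count words in each response. This sketch uses fixture files to stand in for the two responses; against a live site you would feed it the output of the two `curl` commands from steps 1 and 2 instead.

```shell
#!/bin/sh
# Rough word count of visible text: drop tags, count what's left.
word_count() {
  sed 's/<[^>]*>//g' "$1" | wc -w
}

# What Googlebot got (thin shell):
cat > /tmp/bot_view.html <<'EOF'
<html><head><title>Pricing</title></head>
<body><div id="root"></div></body></html>
EOF

# What a rendered browser saw (real content):
cat > /tmp/browser_view.html <<'EOF'
<html><body><h1>Pricing</h1>
<p>Simple plans for every team. Start free, upgrade when you grow.</p>
</body></html>
EOF

echo "bot:     $(word_count /tmp/bot_view.html) words"
echo "browser: $(word_count /tmp/browser_view.html) words"
```

A large ratio between the two counts is the "253 vs 13,547" pattern in miniature.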
Step 3: Check for common failure signals

We see this all the time in production:

  • HTML under ~1 KB → usually an empty shell
  • Visible text under ~200 characters → thin or missing content
  • Missing <title> or <h1> → weak or broken page
  • Large difference between bot and browser HTML → rendering issue

Use the DataJelly Visibility Test (Recommended)

You can run this without touching curl. It shows you:

  • Raw HTML returned to bots (Googlebot, Bing, GPTBot, etc.)
  • Fully rendered browser version
  • Side-by-side differences in word count, HTML size, links, and content
Run Visibility Test — Free

What this test tells you (no guessing)

After running this, you'll know:

  • Whether your HTML is actually indexable
  • Whether bots are seeing partial content
  • Whether rendering is breaking in production

This is the difference between "I think SEO is set up" and "I know what Google is indexing."

If you don't understand why this happens, read: Why Google Can't See Your SPA

If this test fails

You have three real options:

SSR

Works if you can keep it stable in production

Prerendering

Breaks with dynamic content and scale

Edge Rendering

Reflects real production output without app changes

If you do nothing, you will not rank consistently. Learn how Edge Rendering works →

This issue doesn't show up in Lighthouse. It shows up in rankings.

Run the Test · Ask a Question

What Actually Works

There are three approaches to fixing this. They all solve the same problem: making your HTML response contain real content when bots request it.

Prerendering (build-time)

Works when:

  • Small site (<50 pages)
  • Content changes infrequently
  • Static routes only

Breaks when:

  • Dynamic pages or user content
  • Frequent deploys
  • Routes change often

SSR (Server-Side Rendering)

Works:

  • Full HTML per request
  • Gold standard for SEO
  • Dynamic content supported

Costs:

  • Complex infrastructure setup
  • May require framework migration
  • Slower iteration cycle

Edge Rendering (what actually fixes this)

Serve different content based on who's requesting:

Search bots → fully rendered HTML snapshot (50–150 KB)
AI crawlers → clean, structured Markdown
Users → your normal SPA experience

This directly fixes the failure. HTML size goes from 5 KB to 50–150 KB. Visible text goes from <50 words to 500–3,000 words. Internal links actually exist in the DOM. Your sitemap now points to pages that actually render.

Your current flow

Sitemap → URLs discovered
Google requests page → empty HTML
Page ignored

With edge rendering

Sitemap → URLs discovered
Edge proxy detects bot
Full HTML snapshot returned
→ Page indexed
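The core of that flow is a routing decision on the user-agent string. This is an illustrative sketch only: the function name and bot list are hypothetical, not DataJelly's actual implementation, and real bot detection should also verify crawler IPs, since user agents are trivially spoofed.

```shell
#!/bin/sh
# Toy version of the edge proxy's routing decision.
classify_agent() {
  case "$1" in
    *Googlebot*|*bingbot*|*GPTBot*|*DuckDuckBot*)
      echo "bot: serve rendered HTML snapshot" ;;
    *)
      echo "human: serve SPA" ;;
  esac
}

classify_agent "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
classify_agent "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"
```

Bots get the full snapshot; everyone else gets the untouched SPA, so the user experience never changes.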

Practical Checklist

Run through this before you start debugging sitemaps. If any of these fail, your sitemap isn't the problem.

HTML Quality (non-negotiable)

HTML size > 30 KB
Visible text > 300 words
Content present without JavaScript execution

Routing

Direct URL requests return 200
No SPA-only routes that 404 on direct access
No client-only navigation dependencies

Rendering Parity

Bot response matches browser content
No "empty shell" responses for any route

SEO Signals

<title> tag present in raw HTML
<h1> present in raw HTML
No noindex meta tag

Sitemap Sanity

Every URL in sitemap returns real content
No routes that require JS to exist
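The sitemap-sanity items can be checked in bulk: pull every `<loc>` out of the sitemap and fetch each URL as Googlebot. This sketch extracts URLs from a fixture file; the commented loop shows how you would point it at a live sitemap (the domain is a placeholder).

```shell
#!/bin/sh
# Pull the <loc> values out of a sitemap file.
extract_urls() {
  grep -o '<loc>[^<]*</loc>' "$1" | sed -e 's/<loc>//' -e 's|</loc>||'
}

# Fixture sitemap for illustration:
cat > /tmp/sitemap.xml <<'EOF'
<?xml version="1.0"?>
<urlset><url><loc>https://yoursite.com/</loc></url>
<url><loc>https://yoursite.com/pricing</loc></url></urlset>
EOF
extract_urls /tmp/sitemap.xml

# Against a live site (uncomment): report bot-visible HTML size per URL.
# extract_urls /tmp/sitemap.xml | while read -r url; do
#   printf '%8d bytes  %s\n' "$(curl -s -A "Googlebot" "$url" | wc -c)" "$url"
# done
```

Any URL reporting only a few KB is a candidate for the thin-HTML failure described above.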

Stop guessing. See what bots actually see.

Run a free visibility test on your site right now, or start a 7-day free trial to fix rendering across all your pages.

Run Free Visibility Test · Start 7-Day Free Trial · Ask a Question

The Bottom Line

A sitemap does not make pages indexable. It only exposes URLs.

If your HTML response is under 10 KB, under 100 words, and script-driven — Google will ignore it. Your sitemap is just a list of broken URLs.

Fix rendering, not discovery.

Related Reading

Why Google Can't See Your SPA

The rendering gap explained — why your browser and Googlebot see completely different pages.

SPA SEO Checklist: 10 Things to Fix

The production checklist for JavaScript app visibility — 10 things that actually break rankings.

React SEO Is Broken by Default

Why React ships HTML that search engines can't use — and the real fixes.

SEO Foundation Checklist

The 15-minute setup that makes everything else work.

Why Script-Based Prerendering Struggles

Build-time prerendering limitations with modern dynamic apps.

Search Engine Crawling Guide

How crawlers discover, render, and index your pages — the full pipeline.

Sitemap Validator Tool

Validate your sitemap format and check for structural errors.

Bot View Checker

See exactly what search engine bots see when they crawl your pages.

HTTP Debug Tool

Compare raw vs rendered responses across different user agents.

Site Crawler

Crawl your site to find URLs and verify every page loads correctly.

DataJelly Edge

Edge rendering that delivers complete HTML to bots without app changes.

DataJelly Guard

Continuous monitoring that catches broken pages before users do.
