April 2026 · SEO · Sitemaps

Why Your Sitemap.xml Exists But Google Still Ignores Your Pages

Your sitemap is valid. Google found your URLs. Your pages still aren't indexed. This is a rendering problem, not a discovery problem — and we see it constantly.


The Real Problem

Your sitemap is valid. Google found your URLs. Your pages still aren't indexed.

This happens because Google crawls your URLs and gets unusable HTML back. We see this constantly:

Sitemap submitted with 100–500 URLs
Google crawls all of them
HTML response: 3–8 KB
Visible text: <50 words
Result: "Discovered – currently not indexed"

The sitemap worked. Your HTML didn't.

What's Actually Happening

Google does two completely separate things with your URLs. Most people conflate them. That's where the confusion starts.

Step 1: Discovery

Google reads your sitemap and queues URLs. This part works fine.

/
/pricing
/features
/docs/getting-started
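For reference, the routes above would appear in a sitemap.xml like this (the domain is a placeholder for your own):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yoursite.com/</loc></url>
  <url><loc>https://yoursite.com/pricing</loc></url>
  <url><loc>https://yoursite.com/features</loc></url>
  <url><loc>https://yoursite.com/docs/getting-started</loc></url>
</urlset>
```

A perfectly valid file like this is exactly the trap: it gets every URL discovered, and guarantees nothing about what those URLs return.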

Step 2: Rendering + Indexing

Google requests each URL and evaluates the raw HTML. This is where you fail.

HTML size: ~5 KB
Body: <div id="root"></div>
Content: loaded via JS
→ Page skipped

Google does not reliably execute your app like a browser. If the raw HTML doesn't contain real content, the page gets skipped. Your sitemap did its job — it pointed Google to the door. But when Google opened it, nobody was home.

What Most Guides Get Wrong

They treat indexing as a discovery problem. It's not.

Common advice that doesn't fix this:

"Submit your sitemap in Search Console"
"Request indexing for each URL"
"Wait a few days and check again"
"Make sure your sitemap is valid XML"

None of that fixes the actual problem. Here's a concrete failure we see all the time:

Real example: /pricing

/pricing loads perfectly in Chrome

Googlebot fetch returns 5 KB of HTML + script tags

Zero pricing content in the response body

→ Page never indexes. Ever.

That's not a sitemap issue. That's a rendering failure.

What We See in Production

These aren't hypothetical. We see these patterns across dozens of production sites every month.

1. "Discovered but not indexed" at scale

200 URLs in sitemap. 200 crawled. 0–10 indexed.

Root cause: HTML under 10 KB, text under 100 words per page.

2. "Indexed but no rankings"

Page appears in Search Console as indexed. Doesn't rank for anything.

Root cause: Title tag exists, but body content is missing at crawl time. Google indexed a shell.

3. SPA deep link failure

/features works via client-side navigation. Direct request returns 404.

Root cause: Server not configured to serve index.html for all routes. Google drops the URL entirely.
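The deep-link failure has a standard fix when nginx serves a static SPA build: a `try_files` fallback so unknown routes return `index.html` with a 200 instead of a 404. This is a minimal sketch, assuming an nginx setup; other servers (Apache, Caddy, hosting platforms) have equivalent rewrite rules.

```nginx
# Serve the file if it exists, otherwise fall back to the SPA shell.
# Direct requests to /features now return 200 instead of 404.
location / {
  try_files $uri $uri/ /index.html;
}
```

Note this only fixes the 404; the fallback still serves an empty shell, so the rendering problem described in this article remains.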

4. Partial hydration pages

<title> is present, but content is injected after page load.

Root cause: Google indexes the thin version — a page with a title and no body. No ranking potential.

5. Sitemap includes non-renderable routes

Your sitemap generator finds routes, but routes only exist after client navigation.

Root cause: Direct requests return empty shells. The routes are real in-browser but don't exist as standalone pages.

How to Verify This (Don't Guess)

Stop guessing whether Google can see your content. Run this in your terminal:

curl -A "Googlebot" https://yoursite.com/pricing

Look at the response and check:

HTML size < 10 KB → bad
No visible text in the body → bad
Only <script> tags → bad

Now compare with a regular request:

curl https://yoursite.com/pricing

If both responses are thin, Google sees nothing useful. Your sitemap is pointing to pages that return empty shells.
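The checks above can be scripted. This is a minimal sketch: it applies the same "bad" signals (small HTML, few visible words) to a saved response. The thresholds and the tag-stripping `sed` are rough heuristics, and the fixture file stands in for what `curl -A "Googlebot"` would return from your site.

```shell
#!/bin/sh
# Flag the thin-HTML signals from the checklist above.
check_html() {
  file="$1"
  bytes=$(wc -c < "$file")
  # Crudely drop inline scripts, then all tags, then count words.
  words=$(sed -e 's/<script[^>]*>.*<\/script>//g' -e 's/<[^>]*>//g' "$file" | wc -w)
  echo "HTML size:    $bytes bytes"
  echo "Visible text: $words words"
  if [ "$bytes" -lt 10240 ] || [ "$words" -lt 100 ]; then
    echo "VERDICT: thin -- likely skipped by Google"
  else
    echo "VERDICT: looks indexable"
  fi
}

# Fixture: the kind of SPA shell a failing /pricing often returns.
# Against a live site: curl -s -A "Googlebot" https://yoursite.com/pricing > /tmp/bot.html
cat > /tmp/bot.html <<'EOF'
<!doctype html><html><head><title>Pricing</title></head>
<body><div id="root"></div><script src="/app.js"></script></body></html>
EOF
check_html /tmp/bot.html
```

Run it against both the Googlebot response and the plain response; if both come back "thin", the problem is rendering, not discovery.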

Quick Test

Don't want to mess with curl? Use our diagnostic tools to see exactly what bots get from your pages:

Quick Test: What Do Bots Actually See?

~30 seconds

Most people guess. Don't.

Run this test and look at the actual response your site returns to bots.

Step 1: Fetch your page as Googlebot

Use your terminal:

curl -A "Googlebot" https://yourdomain.com

Look for:

  • Real visible text (not just <div id="root">)
  • Meaningful content in the HTML
  • Page size (should not be tiny)
Step 2: Compare bot vs browser

Now test what a real browser gets:

curl -A "Mozilla/5.0" https://yourdomain.com

If these responses are different, Google is indexing a different page than your users see.

Stop guessing — measure it.

Real example: 253 words vs 13,547

We see this constantly. Here's a real example from production: Googlebot saw 253 words and 2 KB of HTML. A browser saw 13,547 words and 77.5 KB. Same URL — completely different content.

[Image: bot vs browser comparison showing 253 words for Googlebot vs 13,547 words for a rendered browser on the same URL]

If your HTML doesn't contain the content, Google doesn't either.

Compare Googlebot vs browser on your site → HTTP Debug Tool
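A quick way to put numbers on that gap yourself: strip tags and count words in each response. This sketch uses fixture files to stand in for the two responses; against a live site you would feed it the output of the two `curl` commands from steps 1 and 2 instead.

```shell
#!/bin/sh
# Rough word count of visible text: drop tags, count what's left.
word_count() {
  sed 's/<[^>]*>//g' "$1" | wc -w
}

# What Googlebot got (thin shell):
cat > /tmp/bot_view.html <<'EOF'
<html><head><title>Pricing</title></head>
<body><div id="root"></div></body></html>
EOF

# What a rendered browser saw (real content):
cat > /tmp/browser_view.html <<'EOF'
<html><body><h1>Pricing</h1>
<p>Simple plans for every team. Start free, upgrade when you grow.</p>
</body></html>
EOF

echo "bot:     $(word_count /tmp/bot_view.html) words"
echo "browser: $(word_count /tmp/browser_view.html) words"
```

A large ratio between the two counts is the "253 vs 13,547" pattern in miniature.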
Step 3: Check for common failure signals

We see this all the time in production:

  • HTML under ~1 KB → usually an empty shell
  • Visible text under ~200 characters → thin or missing content
  • Missing <title> or <h1> → weak or broken page
  • Large difference between bot and browser HTML → rendering issue

Use the DataJelly Visibility Test (Recommended)

You can run this without touching curl. It shows you:

  • Raw HTML returned to bots (Googlebot, Bing, GPTBot, etc.)
  • Fully rendered browser version
  • Side-by-side differences in word count, HTML size, links, and content
Run Visibility Test — Free

What this test tells you (no guessing)

After running this, you'll know:

  • Whether your HTML is actually indexable
  • Whether bots are seeing partial content
  • Whether rendering is breaking in production

This is the difference between "I think SEO is set up" and "I know what Google is indexing."

If you don't understand why this happens, read: Why Google Can't See Your SPA

If this test fails

You have three real options:

SSR

Works if you can keep it stable in production

Prerendering

Breaks with dynamic content and scale

Edge Rendering

Reflects real production output without app changes

If you do nothing, you will not rank consistently. Learn how Edge Rendering works →

This issue doesn't show up in Lighthouse. It shows up in rankings.

Run the Test · Ask a Question

What Actually Works

There are three approaches to fixing this. They all solve the same problem: making your HTML response contain real content when bots request it.

Prerendering (build-time)

Works when:

  • Small site (<50 pages)
  • Content changes infrequently
  • Static routes only

Breaks when:

  • Dynamic pages or user content
  • Frequent deploys
  • Routes change often

SSR (Server-Side Rendering)

Works:

  • Full HTML per request
  • Gold standard for SEO
  • Dynamic content supported

Costs:

  • Complex infrastructure setup
  • May require framework migration
  • Slower iteration cycle

Edge Rendering (what actually fixes this)

Serve different content based on who's requesting:

Search bots → fully rendered HTML snapshot (50–150 KB)
AI crawlers → clean, structured Markdown
Users → your normal SPA experience

This directly fixes the failure. HTML size goes from 5 KB to 50–150 KB. Visible text goes from <50 words to 500–3,000 words. Internal links actually exist in the DOM. Your sitemap now points to pages that actually render.

Your current flow

Sitemap → URLs discovered
Google requests page → empty HTML
Page ignored

With edge rendering

Sitemap → URLs discovered
Edge proxy detects bot
Full HTML snapshot returned
→ Page indexed
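The core of that flow is a routing decision on the user-agent string. This is an illustrative sketch only: the function name and bot list are hypothetical, not DataJelly's actual implementation, and real bot detection should also verify crawler IPs, since user agents are trivially spoofed.

```shell
#!/bin/sh
# Toy version of the edge proxy's routing decision.
classify_agent() {
  case "$1" in
    *Googlebot*|*bingbot*|*GPTBot*|*DuckDuckBot*)
      echo "bot: serve rendered HTML snapshot" ;;
    *)
      echo "human: serve SPA" ;;
  esac
}

classify_agent "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
classify_agent "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"
```

Bots get the full snapshot; everyone else gets the untouched SPA, so the user experience never changes.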

Practical Checklist

Run through this before you start debugging sitemaps. If any of these fail, your sitemap isn't the problem.

HTML Quality (non-negotiable)

HTML size > 30 KB
Visible text > 300 words
Content present without JavaScript execution

Routing

Direct URL requests return 200
No SPA-only routes that 404 on direct access
No client-only navigation dependencies

Rendering Parity

Bot response matches browser content
No "empty shell" responses for any route

SEO Signals

<title> tag present in raw HTML
<h1> present in raw HTML
No noindex meta tag

Sitemap Sanity

Every URL in sitemap returns real content
No routes that require JS to exist
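The sitemap-sanity items can be checked in bulk: pull every `<loc>` out of the sitemap and fetch each URL as Googlebot. This sketch extracts URLs from a fixture file; the commented loop shows how you would point it at a live sitemap (the domain is a placeholder).

```shell
#!/bin/sh
# Pull the <loc> values out of a sitemap file.
extract_urls() {
  grep -o '<loc>[^<]*</loc>' "$1" | sed -e 's/<loc>//' -e 's|</loc>||'
}

# Fixture sitemap for illustration:
cat > /tmp/sitemap.xml <<'EOF'
<?xml version="1.0"?>
<urlset><url><loc>https://yoursite.com/</loc></url>
<url><loc>https://yoursite.com/pricing</loc></url></urlset>
EOF
extract_urls /tmp/sitemap.xml

# Against a live site (uncomment): report bot-visible HTML size per URL.
# extract_urls /tmp/sitemap.xml | while read -r url; do
#   printf '%8d bytes  %s\n' "$(curl -s -A "Googlebot" "$url" | wc -c)" "$url"
# done
```

Any URL reporting only a few KB is a candidate for the thin-HTML failure described above.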

Stop guessing. See what bots actually see.

Run a free visibility test on your site right now, or start a 7-day free trial to fix rendering across all your pages.

Run Free Visibility Test · Start 7-Day Free Trial · Ask a Question

The Bottom Line

A sitemap does not make pages indexable. It only exposes URLs.

If your HTML response is under 10 KB, under 100 words, and script-driven — Google will ignore it. Your sitemap is just a list of broken URLs.

Fix rendering, not discovery.

Related Reading

Why Google Can't See Your SPA

The rendering gap explained — why your browser and Googlebot see completely different pages.

SPA SEO Checklist: 10 Things to Fix

The production checklist for JavaScript app visibility — 10 things that actually break rankings.

React SEO Is Broken by Default

Why React ships HTML that search engines can't use — and the real fixes.

SEO Foundation Checklist

The 15-minute setup that makes everything else work.

Why Script-Based Prerendering Struggles

Build-time prerendering limitations with modern dynamic apps.

Search Engine Crawling Guide

How crawlers discover, render, and index your pages — the full pipeline.

Sitemap Validator Tool

Validate your sitemap format and check for structural errors.

Bot View Checker

See exactly what search engine bots see when they crawl your pages.

HTTP Debug Tool

Compare raw vs rendered responses across different user agents.

Site Crawler

Crawl your site to find URLs and verify every page loads correctly.

DataJelly Edge

Edge rendering that delivers complete HTML to bots without app changes.

DataJelly Guard

Continuous monitoring that catches broken pages before users do.
