[Crawl-Date: 2026-04-05]
[Source: DataJelly Visibility Layer]
[URL: https://datajelly.com/blog/sitemap-exists-google-ignores-pages]
---
title: Why Your Sitemap Exists But Google Still Ignores Your Pages | DataJelly
description: Your sitemap is valid. Google found your URLs. Your pages still aren't indexed. Here's why discovery ≠ indexing — and the rendering fixes that actually work.
url: https://datajelly.com/blog/sitemap-exists-google-ignores-pages
canonical: https://datajelly.com/blog/sitemap-exists-google-ignores-pages
og_title: DataJelly - The Visibility Layer for Modern Apps
og_description: Rich social previews for Slack & Twitter. AI-readable content for ChatGPT & Perplexity. Zero-code setup.
og_image: https://datajelly.com/datajelly-og-image.png
twitter_card: summary_large_image
twitter_image: https://datajelly.com/datajelly-og-image.png
---

# Why Your Sitemap Exists But Google Still Ignores Your Pages
> Your sitemap is valid. Google found your URLs. Your pages still aren't indexed. Here's why discovery ≠ indexing — and the rendering fixes that actually work.

---

## The Real Problem

Your sitemap is valid. Google found your URLs. Your pages still aren't indexed.

This happens because Google crawls your URLs and gets unusable HTML back. We see this constantly:

- Sitemap submitted with 100–500 URLs
- Google crawls all of them
- HTML response: 3–8 KB
- Visible text: <50 words
- Result: "Discovered – currently not indexed"

The sitemap worked. Your HTML didn't.

## What's Actually Happening

Google does two completely separate things with your URLs. Most people conflate them. That's where the confusion starts.

### Step 1: Discovery

Google reads your sitemap and queues URLs. This part works fine.

- `/`
- `/pricing`
- `/features`
- `/docs/getting-started`

### Step 2: Rendering + Indexing

Google requests each URL and evaluates the raw HTML. This is where you fail.

- HTML size: ~5 KB
- Body: `<div id="root"></div>`
- Content: loaded via JS
- → Page skipped

Google does not reliably execute your app like a browser. If the raw HTML doesn't contain real content, the page gets skipped. Your sitemap did its job — it pointed Google to the door. But when Google opened it, nobody was home.

## What Most Guides Get Wrong

They treat indexing as a discovery problem. It's not.

### Common advice that doesn't fix this

- "Submit your sitemap in Search Console"
- "Request indexing for each URL"
- "Wait a few days and check again"
- "Make sure your sitemap is valid XML"

None of that fixes the actual problem. Here's a concrete failure we see all the time:

### Real example: `/pricing`

- `/pricing` loads perfectly in Chrome
- A Googlebot fetch returns 5 KB of HTML plus script tags
- Zero pricing content in the response body
- → Page never indexes. Ever.

That's not a sitemap issue. That's a rendering failure.

## What We See in Production

These aren't hypothetical. We see these patterns across dozens of production sites every month.

### 1. "Discovered but not indexed" at scale

200 URLs in sitemap. 200 crawled. 0–10 indexed.

**Root cause:** HTML under 10 KB, text under 100 words per page.

### 2. "Indexed but no rankings"

Page appears in Search Console as indexed. Doesn't rank for anything.

**Root cause:** Title tag exists, but body content is missing at crawl time. Google indexed a shell.

### 3. SPA deep link failure

`/features` works via client-side navigation. Direct request returns 404.

**Root cause:** Server not configured to serve `index.html` for all routes. Google drops the URL entirely.
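
The usual fix for this failure is a history-API fallback at the web server. A hypothetical nginx sketch (hostname and paths are illustrative) that serves `index.html` for any route that doesn't match a real file, so deep links return 200 instead of 404:

```nginx
server {
    listen 80;
    server_name example.com;        # illustrative
    root /var/www/app/dist;         # your built SPA assets

    location / {
        # Serve the file if it exists, otherwise fall back to the SPA shell
        # so direct requests to routes like /features return 200, not 404.
        try_files $uri $uri/ /index.html;
    }
}
```

Note what this does and doesn't solve: Google now gets a 200 instead of a 404, but it still gets the empty shell unless rendering is also addressed.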

### 4. Partial hydration pages

`<title>` is present, but content is injected after page load.

**Root cause:** Google indexes the thin version — a page with a title and no body. No ranking potential.

### 5. Sitemap includes non-renderable routes

Your sitemap generator finds the routes, but they only exist after client-side navigation.

**Root cause:** Direct requests return empty shells. The routes are real in-browser but don't exist as standalone pages.

## How to Verify This (Don't Guess)

Stop guessing whether Google can see your content. Run this in your terminal:

`curl -A "Googlebot" https://yoursite.com/pricing`

Look at the response and check:

- HTML size < 10 KB → **bad**
- No visible text in the body → **bad**
- Only `<script>` tags → **bad**

Now compare with a regular request:

`curl https://yoursite.com/pricing`

If both responses are thin, Google sees nothing useful. Your sitemap is pointing to pages that return empty shells.
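These checks can be wired into a quick script. A minimal sketch in Python, using the cut-offs above (10 KB, 50 visible words); the function names are our own, and each threshold is treated as an independent red flag:

```python
import re

def visible_text(html: str) -> str:
    """Strip script/style blocks and tags, leaving only visible text."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    html = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", html).strip()

def looks_like_empty_shell(html: str) -> bool:
    """Flag a response that trips either red flag from the checks above."""
    too_small = len(html.encode("utf-8")) < 10 * 1024      # HTML size < 10 KB
    too_few_words = len(visible_text(html).split()) < 50   # < 50 visible words
    return too_small or too_few_words

# A typical SPA shell: a mount point plus a script tag, no visible text.
shell = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
print(looks_like_empty_shell(shell))  # True
```

Run it against the HTML that `curl -A "Googlebot"` returned; if it prints `True`, you have the problem this article describes.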

## Quick Test: What Do Bots Actually See?

Don't want to mess with curl? Use our diagnostic tools to see exactly what bots get from your pages. It takes about 30 seconds.

Most people guess. Don't. Run this test and look at the actual response your site returns to bots.

### 1. Fetch your page as Googlebot

Use your terminal:

`curl -A "Googlebot" https://yourdomain.com`

Look for:

- Real visible text (not just `<div id="root">`)
- Meaningful content in the HTML
- Page size (should not be tiny)

### 2. Compare bot vs browser

Now test what a real browser gets:

`curl -A "Mozilla/5.0" https://yourdomain.com`

If these responses are different, Google is indexing a different page than your users see.

Stop guessing — measure it.

### Real example: 253 words vs 13,547

We see this constantly. Here's a real example from production: Googlebot saw 253 words and 2 KB of HTML. A browser saw 13,547 words and 77.5 KB. Same URL — completely different content.
[![Bot vs browser comparison showing 253 words for Googlebot vs 13,547 words for a rendered browser on the same URL](https://datajelly.com/assets/bot-comparison-proof-BSBvKXDf.png)](https://datajelly.com/assets/bot-comparison-proof-BSBvKXDf.png)

If your HTML doesn't contain the content, Google doesn't either.

[Compare Googlebot vs browser on your site → HTTP Debug Tool](https://datajelly.com/seo-tools/http-debug)
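A gap like 253 words vs 13,547 can be detected mechanically by counting visible words in the two responses. A minimal sketch, assuming a simple tag-stripping word count and an arbitrary ×5 gap threshold (both our choices, not part of any tool):

```python
import re

def word_count(html: str) -> int:
    """Count visible words after dropping script/style blocks and tags."""
    text = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    return len(text.split())

def rendering_gap(bot_html: str, browser_html: str, factor: int = 5) -> bool:
    """Flag a rendering gap: the browser sees many times more words
    than the bot did on the same URL (factor is an assumed threshold)."""
    return word_count(browser_html) > word_count(bot_html) * factor

bot = '<html><body><div id="root"></div><script src="/b.js"></script></body></html>'
browser = "<html><body>" + "<p>real content words</p>" * 100 + "</body></html>"
print(rendering_gap(bot, browser))  # True
```

Feed it the outputs of the two curl commands above: a `True` means Google is indexing a different page than your users see.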

### 3. Check for common failure signals

We see this all the time in production:

- HTML under ~1 KB → usually an empty shell
- Visible text under ~200 characters → thin or missing content
- Missing `<title>` or `<h1>` → weak or broken page
- Large difference between bot and browser HTML → rendering issue
### Use the DataJelly Visibility Test (Recommended)

You can run this without touching curl. It shows you:

- Raw HTML returned to bots (Googlebot, Bing, GPTBot, etc.)
- Fully rendered browser version
- Side-by-side differences in word count, HTML size, links, and content

[Run Visibility Test — Free](https://datajelly.com/#visibility-test)
### What this test tells you (no guessing)

After running this, you'll know:

- Whether your HTML is actually indexable
- Whether bots are seeing partial content
- Whether rendering is breaking in production

This is the difference between *"I think SEO is set up"* and **"I know what Google is indexing."**

If you don't understand why this happens, read: [Why Google Can't See Your SPA](https://datajelly.com/blog/why-google-cant-see-your-spa)
### If this test fails

You have three real options:

- **SSR**: works if you can keep it stable in production
- **Prerendering**: breaks with dynamic content and scale
- **Edge Rendering**: reflects real production output without app changes

If you do nothing, you will not rank consistently. [Learn how Edge Rendering works →](https://datajelly.com/products/edge)

This issue doesn't show up in Lighthouse. It shows up in rankings.

[Run the Test](https://datajelly.com/#visibility-test) [Ask a Question](https://datajelly.com/contact)

## What Actually Works

There are three approaches to fixing this. They all solve the same problem: making your HTML response contain real content when bots request it.

### Prerendering (build-time)

Works when:

- Small site (<50 pages)
- Content changes infrequently
- Static routes only

Breaks when:

- Dynamic pages or user content
- Frequent deploys
- Routes change often

### SSR (Server-Side Rendering)

Works:

- Full HTML per request
- Gold standard for SEO
- Dynamic content supported

Costs:

- Complex infrastructure setup
- May require framework migration
- Slower iteration cycle

### Edge Rendering (what actually fixes this)

Serve different content based on who's requesting:

- **Search bots** → fully rendered HTML snapshot (50–150 KB)
- **AI crawlers** → clean, structured Markdown
- **Users** → your normal SPA experience

This directly fixes the failure. HTML size goes from 5 KB to 50–150 KB. Visible text goes from <50 words to 500–3,000 words. Internal links actually exist in the DOM. Your sitemap now points to pages that actually render.
### ❌ Your current flow

1. Sitemap → URLs discovered
2. Google requests page → empty HTML
3. Page ignored
### ✅ With edge rendering

1. Sitemap → URLs discovered
2. Edge proxy detects bot
3. Full HTML snapshot returned
4. Page indexed
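The "edge proxy detects bot" step above is typically user-agent matching. A minimal sketch of that routing decision, assuming a hand-picked crawler list (illustrative only; a production proxy would keep a maintained list and verify crawler IP ranges rather than trust the header):

```python
import re

# Illustrative crawler tokens; not an exhaustive list.
BOT_PATTERN = re.compile(
    r"googlebot|bingbot|duckduckbot|gptbot|claudebot|perplexitybot",
    re.IGNORECASE,
)

def is_bot(user_agent: str) -> bool:
    """Decide whether the requester looks like a crawler."""
    return bool(BOT_PATTERN.search(user_agent or ""))

def choose_response(user_agent: str) -> str:
    """Route the request: bots get the rendered snapshot, users get the SPA."""
    return "snapshot.html" if is_bot(user_agent) else "spa_shell.html"

print(choose_response("Mozilla/5.0 (compatible; Googlebot/2.1)"))  # snapshot.html
print(choose_response("Mozilla/5.0 (Windows NT 10.0)"))            # spa_shell.html
```

Users keep the normal client-side experience; only the crawler path changes, which is why this works without touching the app code.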

## Practical Checklist

Run through this before you start debugging sitemaps. If any of these fail, your sitemap isn't the problem.

### HTML Quality (non-negotiable)

- HTML size > 30 KB
- Visible text > 300 words
- Content present without JavaScript execution

### Routing

- Direct URL requests return 200
- No SPA-only routes that 404 on direct access
- No client-only navigation dependencies

### Rendering Parity

- Bot response matches browser content
- No "empty shell" responses for any route

### SEO Signals

- `<title>` tag present in raw HTML
- `<h1>` present in raw HTML
- No `noindex` meta tag

### Sitemap Sanity

- Every URL in sitemap returns real content
- No routes that require JS to exist
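Most of the raw-HTML items in this checklist can be verified mechanically. A minimal sketch, assuming regex-level checks (the thresholds mirror the checklist; the function and sample page are ours):

```python
import re

def checklist(html: str) -> dict:
    """Check the raw-HTML items from the checklist against one response."""
    text = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    return {
        "html_over_30kb": len(html.encode("utf-8")) > 30 * 1024,
        "text_over_300_words": len(text.split()) > 300,
        "has_title": bool(re.search(r"(?is)<title\b[^>]*>\s*\S", html)),
        "has_h1": bool(re.search(r"(?is)<h1\b", html)),
        "no_noindex": not re.search(r"(?is)<meta[^>]+noindex", html),
    }

page = ("<html><head><title>Pricing</title></head><body><h1>Pricing</h1>"
        + "<p>plan details</p>" * 2000 + "</body></html>")
report = checklist(page)
print(all(report.values()))  # True
```

Run it over the Googlebot response for every URL in your sitemap; any `False` in the report means that URL is a candidate for "Discovered – currently not indexed."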

## Stop guessing. See what bots actually see.

Run a free visibility test on your site right now — or start a 14-day free trial to fix rendering across all your pages.

[Run Free Visibility Test](https://datajelly.com/seo-tools/visibility-test) [Start 14-Day Free Trial](https://dashboard.datajelly.com/) [Ask a Question](https://datajelly.com/contact)

## Frequently Asked Questions

### My sitemap is valid — why aren't my pages indexed?

A valid sitemap only tells Google your URLs exist. It doesn't guarantee indexing. If Google crawls the URL and gets thin or empty HTML back (common with SPAs), the page gets filed as "Discovered – currently not indexed." The sitemap did its job. Your HTML didn't.

### What does "Discovered – currently not indexed" actually mean?

It means Google found the URL (via sitemap, internal link, or external link) and crawled it, but decided not to index it. The most common cause for SPAs is that the HTML response contains no meaningful content — just a JavaScript shell. Google won't index a page with <50 words of visible text.

### Does Google execute JavaScript when crawling?

Sometimes, but not reliably. Google uses a two-phase system: first it processes raw HTML, then it queues JavaScript rendering for later — sometimes hours or days later. Critical content that only appears after JS execution may never get indexed. You cannot depend on this.

### How much HTML content does Google need to index a page?

There's no official minimum, but in practice we see consistent indexing when pages return >30 KB of HTML with >300 words of visible text. Pages under 10 KB with <100 words are almost always skipped or indexed as thin content with no ranking potential.

### Can I just resubmit my sitemap to fix indexing issues?

No. Resubmitting a sitemap only triggers re-discovery. If Google already discovered your URLs and decided not to index them because the HTML was empty, resubmitting changes nothing. You need to fix the HTML response, not the discovery mechanism.

### What's the difference between discovery and indexing?

Discovery is Google finding your URL (via sitemap, links, etc.). Indexing is Google processing the page content and adding it to search results. Your sitemap handles discovery. Rendering quality handles indexing. These are completely separate steps.

### Why do my pages work in the browser but not in Google?

Your browser executes JavaScript and renders the full page. Google's initial crawl evaluates raw HTML. If your page content is loaded entirely via JavaScript (React, Vue, Angular SPAs), the raw HTML is often just a script shell with no content. Google sees a blank page.

### How do I check what Google actually sees on my page?

Run `curl -A "Googlebot" https://yoursite.com/page` in your terminal and look at the response. If the HTML is under 10 KB with no visible text content, that's exactly what Google sees. You can also use DataJelly's Visibility Test or HTTP Bot Comparison tool for a side-by-side view.

### What is edge rendering and how does it fix this?

Edge rendering serves different responses based on who's requesting the page. When a search bot or AI crawler requests a URL, the edge proxy returns a fully rendered HTML snapshot (50–150 KB with real content). When a user requests it, they get the normal SPA. This fixes indexing without changing your app code.

### Does this affect AI crawlers too?

Yes. AI crawlers like GPTBot, ClaudeBot, and PerplexityBot do not execute JavaScript at all. If your pages require JS to render content, AI systems will never see or cite your content. Edge rendering can serve clean Markdown to these crawlers specifically.

## The Bottom Line

A sitemap does not make pages indexable. It only exposes URLs.

If your HTML response is under 10 KB, under 100 words, and script-driven — Google will ignore it. Your sitemap is just a list of broken URLs.

Fix rendering, not discovery.

## Related Reading

- [Why Google Can't See Your SPA](https://datajelly.com/blog/why-google-cant-see-your-spa): The rendering gap explained — why your browser and Googlebot see completely different pages.
- [SPA SEO Checklist: 10 Things to Fix](https://datajelly.com/blog/spa-seo-checklist): The production checklist for JavaScript app visibility — 10 things that actually break rankings.
- [React SEO Is Broken by Default](https://datajelly.com/blog/react-seo-broken-by-default): Why React ships HTML that search engines can't use — and the real fixes.
- [SEO Foundation Checklist](https://datajelly.com/blog/seo-foundation-checklist): The 15-minute setup that makes everything else work.
- [Why Script-Based Prerendering Struggles](https://datajelly.com/blog/script-based-prerendering-limits): Build-time prerendering limitations with modern dynamic apps.
- [Search Engine Crawling Guide](https://datajelly.com/guides/search-engine-crawling): How crawlers discover, render, and index your pages — the full pipeline.
- [Sitemap Validator Tool](https://datajelly.com/seo-tools/sitemap-validator): Validate your sitemap format and check for structural errors.
- [Bot View Checker](https://datajelly.com/seo-tools/bot-test): See exactly what search engine bots see when they crawl your pages.
- [HTTP Debug Tool](https://datajelly.com/seo-tools/http-debug): Compare raw vs rendered responses across different user agents.
- [Site Crawler](https://datajelly.com/seo-tools/site-crawler): Crawl your site to find URLs and verify every page loads correctly.
- [DataJelly Edge](https://datajelly.com/products/edge): Edge rendering that delivers complete HTML to bots without app changes.
- [DataJelly Guard](https://datajelly.com/products/guard): Continuous monitoring that catches broken pages before users do.

## Structured Data (JSON-LD)
```json
{"@context":"https://schema.org","@type":"FAQPage","mainEntity":[{"@type":"Question","name":"My sitemap is valid \u2014 why aren\u0027t my pages indexed?","acceptedAnswer":{"@type":"Answer","text":"A valid sitemap only tells Google your URLs exist. It doesn\u0027t guarantee indexing. If Google crawls the URL and gets thin or empty HTML back (common with SPAs), the page gets filed as \u0027Discovered \u2013 currently not indexed.\u0027 The sitemap did its job. Your HTML didn\u0027t."}},{"@type":"Question","name":"What does \u0027Discovered \u2013 currently not indexed\u0027 actually mean?","acceptedAnswer":{"@type":"Answer","text":"It means Google found the URL (via sitemap, internal link, or external link) and crawled it, but decided not to index it. The most common cause for SPAs is that the HTML response contains no meaningful content \u2014 just a JavaScript shell. Google won\u0027t index a page with \u003C50 words of visible text."}},{"@type":"Question","name":"Does Google execute JavaScript when crawling?","acceptedAnswer":{"@type":"Answer","text":"Sometimes, but not reliably. Google uses a two-phase system: first it processes raw HTML, then it queues JavaScript rendering for later \u2014 sometimes hours or days later. Critical content that only appears after JS execution may never get indexed. You cannot depend on this."}},{"@type":"Question","name":"How much HTML content does Google need to index a page?","acceptedAnswer":{"@type":"Answer","text":"There\u0027s no official minimum, but in practice we see consistent indexing when pages return \u003E30 KB of HTML with \u003E300 words of visible text. Pages under 10 KB with \u003C100 words are almost always skipped or indexed as thin content with no ranking potential."}},{"@type":"Question","name":"Can I just resubmit my sitemap to fix indexing issues?","acceptedAnswer":{"@type":"Answer","text":"No. Resubmitting a sitemap only triggers re-discovery. If Google already discovered your URLs and decided not to index them because the HTML was empty, resubmitting changes nothing. You need to fix the HTML response, not the discovery mechanism."}},{"@type":"Question","name":"What\u0027s the difference between discovery and indexing?","acceptedAnswer":{"@type":"Answer","text":"Discovery is Google finding your URL (via sitemap, links, etc.). Indexing is Google processing the page content and adding it to search results. Your sitemap handles discovery. Rendering quality handles indexing. These are completely separate steps."}},{"@type":"Question","name":"Why do my pages work in the browser but not in Google?","acceptedAnswer":{"@type":"Answer","text":"Your browser executes JavaScript and renders the full page. Google\u0027s initial crawl evaluates raw HTML. If your page content is loaded entirely via JavaScript (React, Vue, Angular SPAs), the raw HTML is often just a script shell with no content. Google sees a blank page."}},{"@type":"Question","name":"How do I check what Google actually sees on my page?","acceptedAnswer":{"@type":"Answer","text":"Run \u0027curl -A Googlebot https://yoursite.com/page\u0027 in your terminal and look at the response. If the HTML is under 10 KB with no visible text content, that\u0027s exactly what Google sees. You can also use DataJelly\u0027s Visibility Test or HTTP Bot Comparison tool for a side-by-side view."}},{"@type":"Question","name":"What is edge rendering and how does it fix this?","acceptedAnswer":{"@type":"Answer","text":"Edge rendering serves different responses based on who\u0027s requesting the page. When a search bot or AI crawler requests a URL, the edge proxy returns a fully rendered HTML snapshot (50\u2013150 KB with real content). When a user requests it, they get the normal SPA. This fixes indexing without changing your app code."}},{"@type":"Question","name":"Does this affect AI crawlers too?","acceptedAnswer":{"@type":"Answer","text":"Yes. AI crawlers like GPTBot, ClaudeBot, and PerplexityBot do not execute JavaScript at all. If your pages require JS to render content, AI systems will never see or cite your content. Edge rendering can serve clean Markdown to these crawlers specifically."}}]}
```


