How to Debug SEO Issues in a React App
You ship a React app. It returns 200 OK. It works in Chrome. Google doesn't index it. We see this constantly: the HTML response is 2–4KB, contains a root div and scripts, and zero usable text. From a crawler's perspective, the page is empty. This is not an SEO tweak problem. It's a rendering failure.
What's Actually Happening
Most React apps ship a client-rendered shell. The server response is ~2–5KB of HTML containing a root <div id="root"> and a script bundle reference. Content only exists after JS executes in the browser.
```html
<!DOCTYPE html>
<html>
  <head>
    <title>My App</title>
  </head>
  <body>
    <div id="root"></div>
    <script src="/assets/index-9f3a1.js"></script>
  </body>
</html>
```

Bots don't dependably execute your JS. Even when they do, it's delayed and inconsistent. What actually gets indexed is the initial HTML — and whatever text exists at response time.
If that HTML has:

- <200 characters of visible text
- No `<h1>`
- No internal links

…it is treated as empty. No text, no links, no structure → no indexing. Companion read: Why Google Can't See Your SPA.
Step 1: Inspect Raw HTML (Not the Rendered DOM)
The most common debugging mistake: opening DevTools and inspecting the Elements panel. That shows the post-JS DOM. Bots don't see that. They see the raw response.
Run:

```bash
curl -A "Googlebot" https://yoursite.com -o page.html
wc -c page.html                  # byte count
grep -o '<h1' page.html | wc -l  # h1 count
```

Or in browser: View Source (Cmd/Ctrl+U) — not Inspect Element.
Hard numbers to look for:

- HTML size < 5KB → failure
- Visible text < 200 chars → failure (measured below)
- No `<h1>` or paragraphs → failure
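The visible-text number is the one these commands don't give you. A rough way to approximate it from the saved response (a tag-stripping sketch, not a real DOM text extraction; it still counts inline script bodies, so treat it as an upper bound):

```bash
# Strip tags, collapse whitespace, count characters. Crude on purpose:
# under ~200 chars is a failure no matter how you measure it.
sed 's/<[^>]*>//g' page.html | tr -s '[:space:]' ' ' | wc -c
```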
If you don't see real content in raw HTML, stop. That's the root cause. Everything else is downstream of this.
Step 2: Measure Content (Real Thresholds)
Stop guessing. Use real numbers. These are the production thresholds we apply to every site we audit:
| Metric | Healthy | Suspicious | Broken |
|---|---|---|---|
| html_bytes | 20KB+ | 5–15KB | < 5KB |
| visible_text_length | 1,500+ chars | 200–800 chars | < 200 chars |
| word_count | 300+ | 50–200 | < 50 |
| internal_links | 10+ | 1–5 | 0 |
What we see in production:
- <200 chars → indexed as blank
- <1KB HTML → guaranteed no indexing
- 0 internal links → no crawl discovery
The Page Validator applies these exact thresholds for you and flags pages as blank_page or script_shell_only.
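If you want the same thresholds as a script rather than a tool, here is a minimal shell sketch using the cutoffs from the table. The text extraction is the same rough tag-strip as in Step 1, and the link count includes external links too:

```bash
#!/usr/bin/env bash
# Rough threshold check against the table above. curl plus text munging;
# treat the numbers as approximations, not a crawler's view.
URL="${1:-https://yoursite.com}"

html=$(curl -sA "Googlebot" "$URL")
text=$(printf '%s' "$html" | sed 's/<[^>]*>//g' | tr -s '[:space:]' ' ')
bytes=${#html}
chars=${#text}
words=$(printf '%s' "$text" | wc -w)
links=$(printf '%s' "$html" | grep -oE '<a [^>]*href' | wc -l)

echo "html_bytes=$bytes visible_text=$chars words=$words links=$links"
[ "$bytes" -lt 5000 ] && echo "BROKEN: html_bytes < 5KB"
[ "$chars" -lt 200 ]  && echo "BROKEN: visible_text < 200 chars"
[ "$words" -lt 50 ]   && echo "BROKEN: word_count < 50"
[ "$links" -eq 0 ]    && echo "BROKEN: no links in raw HTML"
```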
Step 3: Identify Script Shell Pages
Script shells have a clear pattern. If your raw HTML matches this, you have one:
```text
HTML size:      ~3KB
<script> tags:  15–40
<h1>:           missing
<p>:            missing
visible text:   <100 chars
root element:   <div id="root"></div>
```

The page exists. The shell loads. The content is locked behind JS execution that bots don't run. We have a full breakdown here: Script Shell Pages: When Your App Loads But Nothing Works.
Quick sanity check (Chrome):
- DevTools → Settings → Disable JavaScript
- Hard refresh the page
- If the page is blank → bots see the same blank page
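The same fingerprint from the terminal, run against the page.html saved in Step 1 (occurrence counts via grep -o, matching the commands above):

```bash
grep -o '<script' page.html | wc -l    # 15–40 on a typical shell
grep -o '<h1' page.html | wc -l        # 0 on a shell
grep -o '<p>' page.html | wc -l        # 0 on a shell
grep -o 'id="root"' page.html | wc -l  # 1: the empty mount point
```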
Step 4: Check Internal Links and Structure
Indexing isn't enough. Bots also need to discover your other pages. That happens via internal links in the raw HTML.
```bash
# Count <a href> tags in the raw response
curl -sA "Googlebot" https://yoursite.com | grep -oE '<a [^>]*href' | wc -l
```

If this returns 0, your homepage links nowhere. Bots crawl one page and leave. We see this constantly on React Router apps where navigation is rendered by JS.
Also verify (quick greps below):

- `<title>` is unique per route — not a static "My App" everywhere
- `<meta name="description">` exists and matches the page
- Open Graph tags are present in HTML, not injected by JS
- JSON-LD structured data is in the response, not appended after render
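All four can be spot-checked in the saved response with grep. These only confirm the tags exist in the HTML, not that their content is any good:

```bash
grep -o '<title>[^<]*</title>' page.html         # generic, or per-route?
grep -o 'name="description"' page.html | wc -l   # 0 = no meta description
grep -o 'property="og:' page.html | wc -l        # 0 = no Open Graph tags
grep -o 'application/ld+json' page.html | wc -l  # 0 = no JSON-LD in response
```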
Step 5: Compare Bot vs Browser
The single most useful test: fetch the page with a bot user agent, then with a browser user agent, and diff the responses. If they're meaningfully different, Google is indexing a different page than your users see.
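A minimal version of that comparison. Note that curl only varies the User-Agent header, so this catches server-side differences (UA sniffing, prerender proxies), not anything the client JS would add:

```bash
curl -sA "Googlebot" https://yoursite.com -o bot.html
curl -sA "Mozilla/5.0" https://yoursite.com -o browser.html

wc -c bot.html browser.html    # sizes should be in the same ballpark
diff -q bot.html browser.html  # "differ" can be noise (tokens, timestamps),
                               # but a large size gap is the real signal
```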
Quick Test: What Do Bots Actually See?
Most people guess. Don't.
Run this test and look at the actual response your site returns to bots.
Fetch your page as Googlebot
Use your terminal:
curl -A "Googlebot" https://yourdomain.comLook for:
- Real visible text (not just
<div id="root">) - Meaningful content in the HTML
- Page size (should not be tiny)
Compare bot vs browser
Now test what a real browser gets:
curl -A "Mozilla/5.0" https://yourdomain.comIf these responses are different, Google is indexing a different page than your users see.
Stop guessing — measure it.
Real example: 253 words vs 13,547
We see this constantly. Here's a real example from production: Googlebot saw 253 words and 2 KB of HTML. A browser saw 13,547 words and 77.5 KB. Same URL — completely different content.
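To reproduce that measurement, the bot side is plain curl; the rendered side needs a real browser, because curl never executes JS. A rough sketch assuming Chrome is installed (the binary may be chrome, chromium, or google-chrome depending on your system):

```bash
# Bot-side word count: raw HTML, no JS execution.
curl -sA "Googlebot" https://yourdomain.com | sed 's/<[^>]*>//g' | wc -w

# Rendered word count: headless Chrome runs the JS, then dumps the DOM.
google-chrome --headless --dump-dom https://yourdomain.com 2>/dev/null \
  | sed 's/<[^>]*>//g' | wc -w
```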

If your HTML doesn't contain the content, Google doesn't either.
Compare Googlebot vs browser on your site → HTTP Debug Tool

Check for common failure signals
We see this all the time in production:
- HTML under ~1KB → usually empty shell
- Visible text under ~200 characters → thin or missing content
- Missing <title> or <h1> → weak or broken page
- Large difference between bot vs browser HTML → rendering issue
Use the DataJelly Visibility Test (Recommended)
You can run this without touching curl. It shows you:
- Raw HTML returned to bots (Googlebot, Bing, GPTBot, etc.)
- Fully rendered browser version
- Side-by-side differences in word count, HTML size, links, and content
What this test tells you (no guessing)
After running this, you'll know:
- Whether your HTML is actually indexable
- Whether bots are seeing partial content
- Whether rendering is breaking in production
This is the difference between "I think SEO is set up" and "I know what Google is indexing."
If you don't understand why this happens, read: Why Google Can't See Your SPA
If this test fails
You have three real options:
- SSR: works if you can keep it stable in production
- Prerendering: breaks with dynamic content and scale
- Edge Rendering: reflects real production output without app changes
If you do nothing, you will not rank consistently. Learn how Edge Rendering works →
This issue doesn't show up in Lighthouse. It shows up in rankings.
Common Root Causes
Four patterns. We see them constantly in React, Vite, and Lovable apps.
Pure CSR (no SSR, no prerender)
Symptom: Every route returns the same shell HTML.
Fix: Serve real HTML to bots — via SSR, prerendering, or an edge proxy.
Hydration mismatch crashes
Symptom: SSR'd HTML exists but React throws during hydration. UI breaks. Console shows "Text content does not match server-rendered HTML."
Fix: Eliminate non-deterministic rendering on first paint. Read: Hydration Crashes: The Silent Killer.
API-gated content
Symptom: The app waits for /api/page-data before rendering anything. The API is slow or fails intermittently. Bots see an empty page.
Fix: Render core content from the server response, not the client API. Show fallback HTML during loading.
Meta tags injected by JS
Symptom: react-helmet or similar adds title/description after mount. Bots index the static <title> from index.html — usually generic, identical for every page.
Fix: Generate per-route HTML at build time or at the edge so meta tags are in the response.
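A hedged sketch of the build-time option, assuming a Vite-style dist/index.html and a route list known at build time. The routes and titles below are made up; real setups usually do this with the framework's HTML plugin:

```bash
# Copy the built shell once per route and stamp in a per-route <title>.
# The same sed pattern works for meta descriptions.
while IFS='|' read -r route title; do
  mkdir -p "dist${route}"
  sed "s|<title>[^<]*</title>|<title>${title}</title>|" dist/index.html \
    > "dist${route}/index.html"
done <<'ROUTES'
/pricing|Pricing · My App
/docs|Documentation · My App
ROUTES
```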
Solutions Compared: SSR vs Prerender vs Edge
Three real approaches. Each has tradeoffs.
| Approach | Works when | Breaks when |
|---|---|---|
| SSR (Next.js, Remix) | You can rewrite to a meta-framework and absorb infra cost | Higher TTFB, infra complexity, hot path scales with traffic |
| Prerendering | Routes are known upfront, content rarely changes, <100 pages | Dynamic data, growing routes, invalidation is imperfect — see Hidden Costs |
| Edge (DataJelly) | You want bots to see live HTML without rewriting your React app | No long-lived snapshot cache → no drift, no stale content |
How Edge solves React SEO
- Generates or validates HTML at request time for bots — no app rewrite
- Search bots get full HTML snapshots with real content, links, and meta tags
- AI crawlers (GPTBot, ClaudeBot, Perplexity) get clean Markdown
- Real users still get the live SPA — zero impact on UX
- Works with React, Vite, and Lovable apps out of the box
Practical Checklist
Run all eight on your live site. If even one fails, you have a measurable React SEO problem.
1. Raw HTML > 15KB on content pages. Anything under 5KB is a script shell.
2. Visible text > 500 chars in raw response. Under 200 chars = indexed as blank.
3. `<h1>` exists in raw HTML. Missing h1 = no semantic anchor for ranking.
4. Per-route `<title>` and meta description. Same title across all routes = generic indexing.
5. Internal `<a href>` links in raw HTML. 0 links = no crawl discovery beyond the homepage.
6. Page renders with JS disabled. Blank page with JS off = bots see blank too.
7. Bot vs browser HTML are similar size. Large diff = rendering inconsistency.
8. Open Graph + JSON-LD in response, not JS-injected. Late injection = social previews and rich results break.
Want this automated? The Page Validator and HTTP Bot Comparison tool run most of these for you.
React SEO problems are rendering problems.
No amount of meta-tag tuning, sitemap fiddling, or backlink work fixes a 4KB shell. If your HTML doesn't contain the content, Google doesn't either.
What DataJelly Does About This
DataJelly Edge sits in front of your existing React app. It serves complete HTML snapshots to search bots and clean Markdown to AI crawlers — without changing your application code. The goal is simple: bots see the same complete page your users see.
Works with React, Vite, and Lovable apps. No rewrite. No SSR migration. No prerender drift.