April 2026

Why Internal Links Don't Exist in Your SPA (From Google's POV)

A team ships a React marketing site. Navigation works. The site is fast. Two weeks later: 80% of pages aren't indexed, crawl depth stops at 2, organic traffic is down 60%. Nothing is "broken." The problem is that Google never saw the links.


The Real Problem

Most SPAs do not include internal links in the initial HTML. We see this constantly. The HTML response is 3–8 KB, the visible text is under 100 characters, and the anchor count is zero. Everything you see in the browser — navigation, related content, footer links — is injected after JavaScript runs.

A real failing homepage we audited:

  • HTML size: 5.2 KB
  • Visible text: 42 words
  • Anchor tags: 0
  • Browser-rendered: 47 anchors, full nav, footer, content links

Result: homepage indexed. Every child page ignored. The crawler had nowhere to go.

This is not an SEO tweak problem. It's a delivery problem. If links aren't in the document Google fetched, they don't exist in the crawl graph — full stop.

How Googlebot Actually Discovers Links

The crawl pipeline is simple, and the order matters:

  1. Fetch the HTML
  2. Extract <a href="..."> immediately
  3. Queue the discovered URLs for crawl
  4. Maybe render JavaScript later (rendering tier, separate queue, deprioritized)

Steps 1–3 happen before any rendering. If links aren't in step 1, they're not in the queue. Rendering may eventually happen — but link extraction is already done. Anything injected after hydration is found on a future crawl, if at all.
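The fetch, extract, queue order can be sketched in a few lines. This is a toy model, not Google's actual crawler: the `pages` dictionaries are hypothetical in-memory stand-ins for HTTP responses, and `crawl_depths` is an illustrative name. It only reads anchors that are present in the raw HTML, which is the behavior described above.

```python
from collections import deque
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects href values from <a> tags in raw HTML (no JavaScript is run)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawl_depths(pages, start="/"):
    """Breadth-first walk: fetch HTML, extract anchors, queue new URLs."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        collector = AnchorCollector()
        collector.feed(pages.get(url, ""))      # step 1: "fetch" the HTML
        for href in collector.hrefs:            # step 2: extract <a href>
            if href in pages and href not in depths:
                depths[href] = depths[url] + 1  # step 3: queue for crawl
                queue.append(href)
    return depths


# Hypothetical sites: an SPA shell and a server-rendered equivalent.
spa_shell = {
    "/": '<div id="root"></div><script src="/assets/index.js"></script>',
    "/pricing": "<h1>Pricing</h1>",
}
server_rendered = {
    "/": '<a href="/pricing">Pricing</a>',
    "/pricing": '<a href="/pricing/enterprise">Enterprise</a>',
    "/pricing/enterprise": "<h1>Enterprise</h1>",
}

print(crawl_depths(spa_shell))        # {'/': 0}
print(crawl_depths(server_rendered))  # {'/': 0, '/pricing': 1, '/pricing/enterprise': 2}
```

In the shell case `/pricing` exists and would respond, but the walk never reaches it because nothing in the fetched HTML points there.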

AI crawlers (GPTBot, ClaudeBot, PerplexityBot) are even simpler — they don't render at all. See How AI Crawlers Read Your Website and What AI Crawlers Actually Extract.

Google reliably extracts only this:

<a href="/pricing">Pricing</a>

It does not reliably extract any of these:

<div onClick={() => navigate("/pricing")}>Pricing</div>
<span role="link" data-href="/pricing">Pricing</span>
<button onClick={handleClick}>Pricing</button>

No anchor tag in the raw HTML = no link. The React Router <Link> component does render an anchor — but only after JS runs. That's the whole bug.
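To see the difference concretely, here is a minimal sketch that runs the four snippets above through Python's standard `html.parser` and keeps only `<a href>` values. It is an approximation of a first-pass extractor, not Google's parser, and `extract_links` is an illustrative name. The JSX snippets are fed in as plain text, the same way they would look to anything that does not execute JavaScript.

```python
from html.parser import HTMLParser


class HrefExtractor(HTMLParser):
    """Keeps the href of every <a> tag and ignores everything else."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    parser = HrefExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


snippets = [
    '<a href="/pricing">Pricing</a>',
    '<div onClick={() => navigate("/pricing")}>Pricing</div>',
    '<span role="link" data-href="/pricing">Pricing</span>',
    '<button onClick={handleClick}>Pricing</button>',
]
for snippet in snippets:
    print(extract_links(snippet), "<-", snippet)
```

Only the first snippet yields `['/pricing']`. The other three return an empty list even though all four navigate to the same place in a browser.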

What Most Guides Get Wrong

"Google renders JavaScript, so SPA links are fine."

Google's render queue is delayed (seconds to weeks), resource-limited, and inconsistent across crawls. You can't rely on it for link discovery. See Why "Google Renders JavaScript" Is Misleading.

"React Router <Link> components are SEO-safe."

They render anchor tags after hydration. The crawler has already extracted links and moved on by then. They are functionally invisible to the first-pass crawler.

"Sitemaps will fix discovery."

Sitemaps help Google find URLs but don't carry crawl signal or authority. A page only reachable via sitemap.xml ranks worse than one reachable via internal links — and many never get indexed at all.

"Nofollow / noopener don't matter here."

True, but only if you have anchor tags in the first place. We see teams optimize anchor attributes on a page that ships zero anchors.

What We See in Production

Four failure patterns. We see all of them, every week, across React, Vite, and Lovable apps.

1. Zero-link homepage

Signals: HTML 5 KB, ~40 words, 0 anchor tags. Looks like a normal SaaS marketing page in the browser.

Outcome: Homepage indexed. Nothing else discovered. The crawler reaches the front door, finds no links, leaves. You can't backfill this with a sitemap — pages with no inbound internal links almost never rank.

2. Script-shell pages

Signals: 25 KB HTML, 1.8 MB JS bundle, 80 chars of visible text. Everything meaningful is behind hydration.

Outcome: Structurally empty at crawl time. Even when Google does render, link extraction has already happened. Covered in detail in Script Shell Pages and Your HTML Is Only 4KB.

3. Partial indexing after switching to client routing

Signals: Same content, same URLs, no other change. Before migration: 120 indexed pages. After: 18.

Outcome: Links moved from HTML to JavaScript. Google dropped 85% of the link graph. We see this every time a team migrates from a server-rendered marketing site to a CSR React app and forgets that anchor tags matter.

4. Orphan URLs (no inbound links)

Signals: Routes like /pricing/enterprise or /blog/post-123 exist and respond 200, but no <a href> anywhere on the site points to them.

Outcome: Google does not guess URLs. These pages effectively don't exist for organic traffic. Use Site Crawler to find them.
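A rough way to list orphans yourself, assuming you already have the raw HTML for every route you know about: collect every `<a href>` target, then subtract that set from the set of routes. The `pages` dictionary and `find_orphans` are hypothetical names for illustration, and the check only covers site-relative hrefs that match the route keys exactly.

```python
from html.parser import HTMLParser


class LinkTargets(HTMLParser):
    """Collects the set of href targets found in <a> tags."""

    def __init__(self):
        super().__init__()
        self.targets = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.targets.add(href)


def find_orphans(pages, entry="/"):
    """Return routes that respond but have no inbound <a href> from another page."""
    linked = set()
    for url, html in pages.items():
        parser = LinkTargets()
        parser.feed(html)
        linked |= parser.targets - {url}   # a page linking to itself does not count
    return sorted(set(pages) - linked - {entry})


# Hypothetical raw HTML per route (what the server returns before JavaScript runs).
pages = {
    "/": '<a href="/pricing">Pricing</a> <a href="/blog">Blog</a>',
    "/pricing": '<a href="/">Home</a>',
    "/blog": '<a href="/">Home</a>',
    "/pricing/enterprise": "<h1>Enterprise</h1>",
    "/blog/post-123": "<h1>Post</h1>",
}

print(find_orphans(pages))  # ['/blog/post-123', '/pricing/enterprise']
```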

[Screenshot placeholder: GSC "Discovered – currently not indexed" report showing dozens of orphan URLs]

Quick Test: What Do Bots Actually See?

~30 seconds

Most people guess. Don't.

Run this test and look at the actual response your site returns to bots.

1. Fetch your page as Googlebot

Use your terminal:

curl -A "Googlebot" https://yourdomain.com

Look for:

  • Real visible text (not just <div id="root">)
  • Meaningful content in the HTML
  • Page size (a response of only a few KB usually means an empty shell)
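
2. Compare bot vs browser

Now request the same URL with a browser user agent:

curl -A "Mozilla/5.0" https://yourdomain.com

curl does not execute JavaScript, so this shows the raw HTML a browser receives, not the rendered page. If the two responses differ, Google is indexing a different page than your users see. If both are the same near-empty shell, bots are stuck with that shell while users get the JavaScript-rendered page.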
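The same comparison can be scripted. The sketch below is one possible approach using only the standard library: `fetch` sends a request with a chosen User-Agent (it does not execute JavaScript, same as curl), and `summarize` uses a crude regex-based word count that is good enough for spotting large gaps but is not a real HTML-to-text converter. All function names are illustrative.

```python
import re
import urllib.request


def fetch(url, user_agent):
    """Fetch raw HTML with a given User-Agent (no JavaScript is executed)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def summarize(html):
    """Rough byte size and word count of text outside tags, scripts, and styles."""
    without_code = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", without_code)
    return {"bytes": len(html.encode("utf-8")), "words": len(text.split())}


def compare(bot_html, browser_html):
    """Summarize both responses and report whether they are byte-identical."""
    return {
        "bot": summarize(bot_html),
        "browser": summarize(browser_html),
        "identical": bot_html == browser_html,
    }


# Offline demonstration with two hypothetical responses.
shell = '<div id="root"></div><script>window.__APP__ = {};</script>'
rendered = '<nav><a href="/pricing">Pricing</a></nav><h1>Ship faster with Acme</h1>'
report = compare(shell, rendered)
print(report["bot"]["words"], report["browser"]["words"])  # 0 5

# Against a live site (replace the placeholder domain):
# compare(fetch("https://yourdomain.com", "Googlebot"),
#         fetch("https://yourdomain.com", "Mozilla/5.0"))
```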

Stop guessing — measure it.

Real example: 253 words vs 13,547

We see this constantly. Here's a real example from production: Googlebot saw 253 words and 2 KB of HTML. A browser saw 13,547 words and 77.5 KB. Same URL — completely different content.

[Figure: bot vs browser comparison showing 253 words for Googlebot vs 13,547 words for a rendered browser on the same URL]

If your HTML doesn't contain the content, Google doesn't either.

Compare Googlebot vs browser on your site → HTTP Debug Tool

3. Check for common failure signals

We see this all the time in production:

  • HTML under ~1KB → usually empty shell
  • Visible text under ~200 characters → thin or missing content
  • Missing <title> or <h1> → weak or broken page
  • Large difference between bot vs browser HTML → rendering issue
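The first three signals can be checked mechanically. This is a minimal sketch using the thresholds from the list above; the regex-based text stripping is an approximation (it also counts text inside `<title>`), and `failure_signals` is an illustrative name rather than any tool's API.

```python
import re


def failure_signals(html):
    """Return the failure signals from the list above that this HTML triggers."""
    size = len(html.encode("utf-8"))
    no_code = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    visible = " ".join(re.sub(r"(?s)<[^>]+>", " ", no_code).split())

    signals = []
    if size < 1024:
        signals.append("HTML under ~1 KB")
    if len(visible) < 200:
        signals.append("visible text under ~200 characters")
    if not re.search(r"(?is)<title\b[^>]*>\s*[^<\s]", html):
        signals.append("missing <title>")
    if not re.search(r"(?is)<h1\b[^>]*>", html):
        signals.append("missing <h1>")
    return signals


shell = (
    '<!doctype html><html lang="en"><head><meta charset="UTF-8" />'
    "<title>Acme</title></head><body><div id=\"root\"></div>"
    '<script type="module" src="/assets/index-a3f7.js"></script></body></html>'
)
print(failure_signals(shell))
# ['HTML under ~1 KB', 'visible text under ~200 characters', 'missing <h1>']
```

The fourth signal, a large bot vs browser difference, needs two responses and is not covered here.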

Use the DataJelly Visibility Test (Recommended)

You can run this without touching curl. It shows you:

  • Raw HTML returned to bots (Googlebot, Bing, GPTBot, etc.)
  • Fully rendered browser version
  • Side-by-side differences in word count, HTML size, links, and content

What this test tells you (no guessing)

After running this, you'll know:

  • Whether your HTML is actually indexable
  • Whether bots are seeing partial content
  • Whether rendering is breaking in production

This is the difference between "I think SEO is set up" and "I know what Google is indexing."

If you don't understand why this happens, read: Why Google Can't See Your SPA

If this test fails

You have three real options:

  • SSR: works if you can keep it stable in production
  • Prerendering: breaks with dynamic content and scale
  • Edge Rendering: reflects real production output without app changes

If you do nothing, you will not rank consistently. Learn how Edge Rendering works →

This issue doesn't show up in Lighthouse. It shows up in rankings.


Practical Checklist

Five-minute audit. Run after every deploy.

1. Fetch the raw HTML as Googlebot

curl -A "Googlebot" https://yourdomain.com -o homepage.html

2. Count anchor tags

grep -oi '<a[[:space:]>]' homepage.html | wc -l

  • 20+ anchors on a marketing homepage → healthy
  • 5–15 anchors → at risk
  • 0–4 anchors → broken link discovery
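If you want this check in a deploy script rather than a terminal, the count and the bands can be expressed as below. The parser counts every `<a>` tag regardless of case or attribute layout. One assumption to flag: the bands above do not say what 16 to 19 anchors means, so this sketch labels that range "borderline". Both function names are illustrative.

```python
from html.parser import HTMLParser


class AnchorCounter(HTMLParser):
    """Counts <a> start tags in raw HTML."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.count += 1


def count_anchors(html):
    counter = AnchorCounter()
    counter.feed(html)
    return counter.count


def classify(anchor_count):
    """Map an anchor count to the bands listed above."""
    if anchor_count >= 20:
        return "healthy"
    if anchor_count >= 16:
        return "borderline"  # assumption: the source bands skip 16-19
    if anchor_count >= 5:
        return "at risk"
    return "broken link discovery"


html = '<a href="/a">A</a><A HREF="/b">B</A><abbr>not a link</abbr>'
print(count_anchors(html), classify(count_anchors(html)))  # 2 broken link discovery
```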

3. Inspect the actual HTML shape

A typical zero-link SPA shell looks like this:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Acme — modern SaaS</title>
  </head>
  <body>
    <div id="root"></div>
    <script type="module" src="/assets/index-a3f7.js"></script>
  </body>
</html>

Compare to the same homepage rendered properly (what bots should see):

<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Acme — modern SaaS</title>
  </head>
  <body>
    <header>
      <a href="/">Acme</a>
      <nav>
        <a href="/product">Product</a>
        <a href="/pricing">Pricing</a>
        <a href="/blog">Blog</a>
        <a href="/docs">Docs</a>
      </nav>
    </header>
    <main>
      <h1>Ship faster with Acme</h1>
      <p>...real content with <a href="/case-studies">case studies</a>...</p>
    </main>
    <footer>
      <a href="/about">About</a>
      <a href="/security">Security</a>
      <a href="/contact">Contact</a>
    </footer>
  </body>
</html>

4. Crawl your own site

Use Site Crawler to walk the link graph the way Google does. If max crawl depth is 1, your internal links are invisible. If most discovered URLs come from sitemap.xml rather than HTML extraction, same problem.

5. Score bot-readiness

Run Page Validator with a Googlebot user agent. It flags missing anchor tags, low text, and script-shell patterns automatically.

6. Compare bot vs browser

Open HTTP Bot Comparison to see the bot HTML next to the browser HTML on the same URL. If anchor count differs by >80%, your routing is invisible to crawlers.
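The 80% threshold is simple arithmetic once you have both counts. The sketch below assumes "differs by" means the share of browser-rendered anchors that are missing from the bot response; that interpretation, and the name `anchor_gap`, are assumptions for illustration.

```python
def anchor_gap(bot_anchors, browser_anchors):
    """Share of browser-visible anchors missing from the bot response (0.0 to 1.0)."""
    if browser_anchors == 0:
        return 0.0
    missing = max(browser_anchors - bot_anchors, 0)
    return missing / browser_anchors


def routing_invisible(bot_anchors, browser_anchors, threshold=0.8):
    """True when more than `threshold` of the rendered anchors never reach the bot."""
    return anchor_gap(bot_anchors, browser_anchors) > threshold


# The audited homepage from earlier: 0 anchors in raw HTML, 47 after rendering.
print(anchor_gap(0, 47), routing_invisible(0, 47))  # 1.0 True
```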

7. Validate sitemap coverage

A sitemap is not a substitute for internal links, but it's a backstop. Use Sitemap Validator to confirm orphan URLs are at least listed. Then read Why Your Sitemap Exists But Google Still Ignores Your Pages for why this isn't enough.

[Data placeholder: real customer crawl-depth before/after enabling Edge — typical pattern is depth 1 → depth 4+ within one crawl cycle]

Prerender vs SSR vs Edge

Three real options. Most teams pick the wrong one.

Prerendering

Works: static snapshots include links. Crawl depth restored.

Breaks: snapshots go stale, drift from real app state, dynamic pages fail silently. See Hidden Costs of Prerendering.

SSR (Next.js)

Works: server returns full HTML with anchor tags.

Costs: full app rewrite, slower TTFB (+200–500ms), complex caching, higher infra cost. Often a 6–12 week project.

Edge proxy (DataJelly)

Behavior: bots get fully-rendered HTML with full anchor graph; humans get the live SPA.

Result: crawl depth restored within days. No rewrite. Works with React, Vite, Lovable.

Full breakdown: Prerender vs SSR vs Edge Rendering.

If your links aren't in the initial HTML, your link graph doesn't exist.

Not after hydration. Not after API calls. Only what's in the first response gets crawled. Most modern SPAs fail this by default.

The DataJelly Edge Approach

DataJelly Edge sits in front of your app and serves fully-rendered HTML — with the full anchor graph — to bots. Humans hit your normal SPA. AI crawlers get clean Markdown. No app rewrite, no SSR migration, no framework switch.

  • Edge proxy delivers rendered HTML with real anchor tags to every search and AI bot
  • Crawl depth restored within one crawl cycle
  • AI Markdown for GPTBot, ClaudeBot, PerplexityBot
  • Works with React, Vite, and Lovable SPAs as-is
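Conceptually, the routing decision in the list above comes down to classifying the request's User-Agent. The sketch below is a hypothetical illustration of that idea only, not DataJelly's implementation: the token lists, the variant names, and `choose_variant` are all assumptions. It also does not verify that a crawler is genuine, which matters in practice because User-Agent strings are trivial to spoof.

```python
# Hypothetical token lists; real crawlers send longer User-Agent strings
# that contain these names.
AI_CRAWLERS = ("gptbot", "claudebot", "perplexitybot")
SEARCH_CRAWLERS = ("googlebot", "bingbot")


def choose_variant(user_agent):
    """Pick which representation of a page to serve for this User-Agent."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in AI_CRAWLERS):
        return "markdown"        # AI crawlers: clean Markdown
    if any(token in ua for token in SEARCH_CRAWLERS):
        return "rendered_html"   # search bots: HTML with real anchor tags
    return "spa"                 # everyone else: the normal SPA


print(choose_variant("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
print(choose_variant("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0 Safari/537.36"))
```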

See what bots see on your site

Run the free Visibility Test on your homepage. Get the bot HTML, anchor count, and visible text in under 30 seconds. No signup.


Related Diagnostic Tools

  • Visibility Test: compare bot vs browser HTML side-by-side
  • Page Validator: check bot-readiness and anchor count
  • HTTP Bot Comparison: diff Googlebot vs browser response
  • Site Crawler: walk your link graph the way Google does
  • Sitemap Validator: check sitemap coverage for orphan URLs
  • Link Discovery: what anchor tags exist in your raw HTML

Related Reading

Why "Google Renders JavaScript" Is Misleading

Google's render queue is delayed, deprioritized, and often skipped. You can't rely on it for link discovery.

What AI Crawlers Actually Extract

GPTBot, ClaudeBot, PerplexityBot don't render JS. Here's what they pull from raw HTML.

Why Google Can't See Your SPA

Same root cause, different angle: empty initial HTML kills both content and link discovery.

Why Your Sitemap Exists But Google Still Ignores Your Pages

Sitemaps are not a substitute for internal links. Discovery without authority doesn't rank.

Page Crawled But Not Indexed

When Google finds a URL but won't index it — usually a delivery and link-graph problem.

© 2026 DataJelly. All rights reserved. Built with love for the modern web.