I should put a disclaimer at the top of this article: SiteMapr is one of the tools that misses things on JavaScript-heavy sites. I would rather tell you that up front and explain when to use a different tool than pretend SiteMapr is right for every site. If you have crawled a single-page app with a generic sitemap generator and gotten back fifteen URLs when you have hundreds of routes, this article explains why and what to do instead.
What "JavaScript-heavy" actually means here
A site is meaningfully JavaScript-rendered if the initial HTML response from the server does not contain the page's main content or the navigation links between pages. Open your site, view the raw source (Ctrl+U, not the rendered DOM in dev tools), and look for the actual content of the page.
If you see "Loading..." or an empty div with id root and almost no other body content, your site is client-rendered. If you see the full article text or product listing in the source, your site is server-rendered or static, and most of this article does not apply to you.
The grey area is partial hydration: Next.js, Nuxt, Remix, SvelteKit and similar frameworks server-render the initial route and then take over with JS. These usually crawl fine. The trouble is with setups that never pre-render anything: Create React App without a separate static export, single-bundle Vue or React apps, dashboard-style apps that route entirely client-side.
What most sitemap crawlers do (and don't)
Most public sitemap generators, including SiteMapr, work like this:
- Fetch a URL with an HTTP client.
- Parse the returned HTML for anchor tags and collect their href attributes.
- Add discovered URLs to a queue and repeat.
That is it. No JavaScript runtime. No DOM. No waiting for fetch calls to resolve. If your anchor tags are added to the page by JavaScript at runtime, which is exactly what client-rendered SPAs do, none of those links exist in the HTML the crawler receives. The crawler sees the homepage, finds zero links, and stops.
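To make that concrete, here is roughly what that loop looks like. This is a sketch, not SiteMapr's actual code, and a real crawler uses a proper HTML parser rather than a regex, but the key point holds: it only ever sees the HTML string the server returned.

```ts
// Sketch of a plain HTML crawler: no JavaScript runtime, no DOM, no rendering.
async function naiveCrawl(startUrl: string, limit = 500): Promise<string[]> {
  const origin = new URL(startUrl).origin;
  const seen = new Set<string>([startUrl]);
  const queue = [startUrl];

  while (queue.length > 0 && seen.size < limit) {
    const url = queue.shift()!;
    const html = await (await fetch(url)).text(); // raw HTML only; scripts never run
    // Regex link extraction for brevity -- a real crawler parses the HTML properly.
    for (const match of html.matchAll(/<a[^>]+href=["']([^"']+)["']/gi)) {
      const link = new URL(match[1], url).toString().split('#')[0];
      if (link.startsWith(origin) && !seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return [...seen];
}
```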
The fix is not to make the crawler smarter (well, that is one fix, and we will get to it). The fix is usually to make sure your site is producing real HTML for crawlers in the first place, because if our crawler cannot see your routes, neither can Googlebot in many cases.
Why this matters for SEO
There is a common misconception that "Googlebot runs JavaScript so it does not matter." It does run JavaScript, but with two caveats that destroy the assumption:
- JavaScript rendering is on a separate, slower queue. Google fetches the HTML first, indexes what it sees, and later (sometimes weeks later) sends the URL for rendering. New content shows up much slower than on server-rendered sites.
- Rendering can fail silently. If your client-side code does anything Googlebot's renderer does not like (unsupported browser APIs, scripts that take too long to load, errors thrown during hydration), Google indexes the empty shell. You will not see a clear error in Search Console; you will see Crawled – currently not indexed, or you will see your pages indexed with no useful content.
So when SiteMapr cannot crawl your site, that is a leading indicator: there is a non-trivial chance Google is also struggling. Treat it as a diagnostic signal, not a tool failure.
Five strategies, in order of preference
There is no single right answer here. The right approach depends on your stack and how much rework you are willing to do.
1. Don't crawl at all — generate the sitemap from your routes
If you control the site and use a framework with a known route structure, the cleanest answer is not to crawl. It is to generate the sitemap from your own routing config or content source.
In Next.js, the app/sitemap.ts Metadata Route lets you build the sitemap from your CMS or filesystem at build time. In Nuxt, the sitemap module does the same. In Gatsby, gatsby-plugin-sitemap reads from your GraphQL data layer. In Astro, the official sitemap integration walks your routes.
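As a sketch of what that looks like in Next.js (the other frameworks have their own equivalents): getPublishedPosts and the domain below are placeholders for your own CMS call and hostname.

```ts
// app/sitemap.ts -- Next.js serves /sitemap.xml from whatever this returns.
import type { MetadataRoute } from 'next';
import { getPublishedPosts } from '@/lib/cms'; // hypothetical CMS/filesystem helper

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const posts = await getPublishedPosts();
  return [
    { url: 'https://example.com', lastModified: new Date() },
    ...posts.map((post) => ({
      url: `https://example.com/blog/${post.slug}`,
      lastModified: post.updatedAt,
    })),
  ];
}
```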
This approach is strictly more accurate than crawling, because it does not rely on the crawler successfully discovering URLs through JavaScript. The sitemap is generated from the same source of truth that produces the routes themselves. If you are building a site from scratch or rebuilding the sitemap pipeline, this is what to do.
2. Pre-render or server-render the public site
If your concern is not just the sitemap but indexing in general, the more durable fix is to make the rendered HTML available without requiring a JS runtime. Options, from least to most invasive:
- Static export at build time. Works for any framework that supports it (Next.js with output export, Gatsby, Astro, Eleventy, Hugo). You ship pre-built HTML for every route.
- Server-side rendering on request. Next.js, Nuxt, Remix in their default modes do this. The server returns rendered HTML; the client takes over after.
- A pre-rendering proxy like Prerender.io, which detects bot traffic and serves a pre-rendered version. Cheaper to add to an existing site than refactoring.
Once your pages have content in the initial HTML, any sitemap generator (including SiteMapr) will be able to crawl them.
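For a sense of scale, the static-export route is often a one-line config change. A sketch assuming Next.js; the other frameworks listed above have their own equivalents.

```js
// next.config.js -- emit plain HTML for every route at build time.
/** @type {import('next').NextConfig} */
module.exports = {
  output: 'export',
};
```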
3. Use a headless-browser-based crawler
If you cannot change the site and need to crawl it as-is, you need a tool that runs JavaScript before parsing. The serious options:
- Screaming Frog SEO Spider has a JavaScript rendering mode (Configuration > Spider > Rendering, then select JavaScript). It uses headless Chromium and is by far the most common professional tool for this.
- Sitebulb offers a similar JS rendering mode.
- Puppeteer or Playwright for a custom solution. If you have engineering hours, writing a 100-line crawler that uses Playwright to navigate and extract routes is straightforward.
These tools are slower than HTML crawlers, typically 5 to 20 times slower, and consume real CPU. That is why hosted sitemap generators do not offer JS rendering for free; the cost per crawl is dramatically higher.
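If you go the Playwright route from that list, the core of a custom crawler really is short. A minimal sketch: it assumes Playwright is installed, skips error handling and robots.txt, and uses network idle as a crude "JS has finished" signal.

```ts
import { chromium } from 'playwright';

// Crawl a client-rendered site by extracting links from the rendered DOM.
async function renderedCrawl(startUrl: string, maxPages = 200): Promise<string[]> {
  const origin = new URL(startUrl).origin;
  const browser = await chromium.launch();
  const page = await browser.newPage();
  const seen = new Set<string>([startUrl]);
  const queue = [startUrl];

  while (queue.length > 0 && seen.size < maxPages) {
    const url = queue.shift()!;
    await page.goto(url, { waitUntil: 'networkidle' });
    // Links are read from the DOM *after* client-side JS has run.
    const hrefs = await page.$$eval('a[href]', (anchors) =>
      anchors.map((a) => (a as HTMLAnchorElement).href),
    );
    for (const href of hrefs) {
      const link = href.split('#')[0];
      if (link.startsWith(origin) && !seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }

  await browser.close();
  return [...seen];
}
```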
4. Crawl the API instead of the pages
This sounds odd but works for content-driven SPAs. If your client-side app fetches content from an API endpoint that returns the list of pages (a CMS API, a search index, a JSON manifest), you can hit that endpoint directly and build the sitemap from the response.
Example: a React app that loads articles from /api/articles?page=1. The articles are not in the HTML, but they are in the JSON. A short script that paginates through the API and writes URLs to a sitemap.xml works and is more reliable than any browser-based crawl.
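A sketch of that script, assuming a hypothetical /api/articles endpoint that returns { articles, hasMore }; adjust the URLs and response shape to whatever your API actually serves.

```ts
import { writeFileSync } from 'node:fs';

// Build sitemap.xml by paginating a JSON API instead of crawling rendered pages.
async function buildSitemapFromApi() {
  const urls: string[] = [];
  for (let pageNum = 1; ; pageNum++) {
    const res = await fetch(`https://example.com/api/articles?page=${pageNum}`);
    const data = (await res.json()) as {
      articles: { slug: string }[];
      hasMore: boolean;
    };
    urls.push(...data.articles.map((a) => `https://example.com/articles/${a.slug}`));
    if (!data.hasMore) break;
  }

  const xml =
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    urls.map((u) => `  <url><loc>${u}</loc></url>`).join('\n') +
    '\n</urlset>\n';
  writeFileSync('sitemap.xml', xml);
}

buildSitemapFromApi().catch(console.error);
```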
5. Maintain the sitemap manually
For small sites (a portfolio with a dozen pages, a marketing site with a fixed set of routes), you can write the sitemap by hand or generate it from a list of routes in a config file. It is not glamorous, but for a 20-route site, "manually edit a list when you add a page" is more reliable than any tool.
What I tell people to do, in practice
The decision tree I actually use, condensed:
- You control the site and use a modern framework: Generate the sitemap from your route source (option 1). Do not crawl your own site.
- You have a legacy SPA you cannot easily refactor: Add Prerender.io or similar (the pre-rendering proxy from option 2). It solves both the sitemap problem and the broader indexing problem.
- You are auditing a client's site and need a full crawl: Use Screaming Frog with JS rendering enabled (option 3). It is the right tool for this specific job.
- It is a portfolio or small marketing site: Maintain it manually (option 5). The maintenance burden is real but tiny.
SiteMapr fits the case where you have a server-rendered or static site and want a quick, free sitemap without installing software. That is most websites. It is not all of them, and pretending otherwise wastes your time.
A note on Googlebot's actual behavior
If you only remember one thing from this article: rendered HTML is what gets indexed. Whether your framework gets there via SSR, SSG, or hydration of a pre-rendered shell matters less than the simple test of "what does the raw HTML response look like?"
When that response is empty or near-empty, you have an indexing risk regardless of what your sitemap says. Fixing the rendering is the durable solution. Fixing the sitemap is at best an incomplete one.