Web Front-end
12 minute read

React SEO Strategies and Best Practices

Vineet specializes in building data visualization interfaces and has used React extensively in his projects. His favorite drink is lassi.

React was developed to create interactive UIs that are declarative, modular, and cross-platform. Today, it is one of the more popular—if not the most popular—JavaScript frameworks for writing performant front-end applications. Developed initially to write Single Page Applications (SPAs), React is now used to create full-fledged websites and mobile applications.

If you have extensive experience in conventional web development and move to React, you will notice an increasing amount of your HTML and CSS code moving into JavaScript. This is because React doesn’t recommend directly creating or updating UI elements but instead describing the “state” of the UI. React will then update the DOM to match the state in the most efficient way.

As a result, all the changes to the UI or DOM must be made via React’s engine. Although convenient for developers, this may mean longer loading times for users and more work for search engines to find and index the content.

In this article, we will address challenges faced while building SEO-performant React apps and websites, and we will outline several strategies used to overcome them.

How Google Crawls and Indexes Webpages

Google receives over 90% of all online searches. Let’s take a closer look at its crawling and indexing process.

This snapshot taken from Google’s documentation can help us. Please note that this is a simplified block diagram. The actual Googlebot is far more sophisticated.

Diagram of Googlebot indexing a website.

Points to note:

  1. Googlebot maintains a crawl queue containing all the URLs it needs to crawl and index in the future.
  2. When the crawler is idle, it picks up the next URL in the queue, makes a request, and fetches the HTML.
  3. After parsing the HTML, Googlebot determines if it needs to fetch and execute JavaScript to render the content. If yes, the URL is added to a render queue.
  4. At a later point, the renderer fetches and executes JavaScript to render the page. It sends the rendered HTML back to the processing unit.
  5. The processing unit extracts all the URLs <a> tags mentioned on the webpage and adds them back to the crawl queue.
  6. The content is added to Google’s index.

Notice that there is a clear distinction between the Processing stage that parses HTML and the Renderer stage that executes JavaScript.

This distinction exists because executing JavaScript is expensive, given that Googlebots need to look at over 130 trillion webpages. So when Googlebot crawls a webpage, it parses the HTML immediately and then queues the JavaScript to run later. Google’s documentation mentions that a page stays in the render queue for a few seconds, though it may be longer.

It is also worth mentioning the concept of crawl budget. Google’s crawling is limited by bandwidth, time, and availability of Googlebot instances. It allocates a specific budget or resources to index each website. If you are building a large content-heavy website with thousands of pages (e.g., an e-commerce website) and these pages use a lot of JavaScript to render the content, Google may be able to read less content from your website.

Note: You can read Google’s guidelines for managing your crawl budget here.

Why Optimizing React for SEO Is Challenging

Our brief overview of Googlebot, crawling, and indexing only scratches the surface. Yet, software engineers should identify potential issues faced by search engines trying to crawl and index React pages. Now we can take a closer look at what makes React SEO challenging and what developers can do to address and overcome some of these challenges.

Empty First-pass Content

We know React applications rely heavily on JavaScript, and they often run into problems with search engines. This is because React employs an app shell model by default. The initial HTML does not contain any meaningful content, and a user or a bot must execute JavaScript to see the page’s actual content.

This approach means that Googlebot detects an empty page during the first pass. The content is seen by Google only when the page is rendered. This will delay the indexing of content when dealing with thousands of pages.

Loading Time and User Experience

Fetching, parsing, and executing JavaScript takes time. Further, JavaScript may need to make network calls to fetch the content, and the user may need to wait a while before they can view the requested information.

Google has laid out a set of web vitals related to user experience, used in its ranking criteria. A longer loading time may affect the user experience score, prompting Google to rank a site lower.

We review website performance in detail in the following section.

Page Metadata

Meta <meta> tags are helpful because they allow Google and other social media websites to show appropriate titles, thumbnails, and descriptions for pages. But these websites rely on the <head> tag of the fetched webpage to get this information. These websites do not execute JavaScript for the target page.

React renders all the content, including meta tags, on the client. Since the app shell is the same for the entire website/application, it may be hard to adapt the metadata for individual pages.

Sitemap

A sitemap is a file where you provide information about the pages, videos, and other files on your site and the relationships between them. Search engines like Google read this file to more intelligently crawl your site.

React does not have a built-in way to generate sitemaps. If you are using something like React Router to handle routing, you can find tools that can generate a sitemap though it may require some effort.

Other SEO Considerations

These considerations are related to setting up good SEO practices in general.

  1. Have an optimal URL structure to give humans and search engines a good idea about what to expect on the page.
  2. Optimizing the robots.txt file can help search bots understand how to crawl pages on your website.
  3. Use a CDN to serve all the static assets like CSS, JS, fonts etc. and use responsive images to reduce load times.

We can address many of the problems outlined above by using server-side rendering (SSR) or pre-rendering. We will review these approaches below.

Enter Isomorphic React

The dictionary definition of isomorphic is “corresponding or similar in form.”

In React terms, this means that the server has a similar form to the client. In other words, you can reuse the same React components on the server and client.

This isomorphic approach enables the server to render the React app and send the rendered version to our users and search engines so they can view the content instantly while JavaScript loads and executes in the background.

Frameworks like Next.js or Gatsby have popularized this approach. We should note that isomorphic components can look substantially different than conventional React components. For example, they can include code that runs on the server instead of the client. They can even include API secrets (although server code is stripped out before being sent to the client).

One point to note is that these frameworks abstract away a lot of complexity but also introduce an opinionated way of writing code. We will dig into the performance trade-offs further in this article.

We will also do a matrix analysis to understand the relationship between render paths and website performance. But before that, let’s go through some basics of measuring website performance.

Metrics for Website Performance

Let’s examine some of the factors search engines use to rank websites.

Apart from answering a user’s query quickly and accurately, Google believes that a good website should have the following attributes:

  • It should load quickly.
  • Users should be able to access content without too much waiting time.
  • It should become interactive to a user’s actions early.
  • It should not fetch unnecessary data or execute expensive code to prevent draining a user’s data or battery.

These features map roughly to the following metrics:

  • TTFB: Time to First Byte – The time between clicking a link and the first bit of content coming in.
  • LCP: Largest Contentful Paint – The time when the requested article becomes visible. Google recommends keeping this value under 2.5 seconds.
  • TTI: Time To Interactive – The time at which a page becomes interactive (a user can scroll, click, etc.).
  • Bundle size – The total number of bytes downloaded and code executed before the page became fully visible and interactive.

We will revisit these metrics to better understand how various rendering paths may affect each one of them.

Next, let’s understand different render paths available to React developers.

Render Paths

We can render a React application in the browser or on the server and produce varying outputs.

Two functions change significantly between client- and server-side rendered apps, namely routing and code splitting. We take a closer look at these below.

Client-side Rendering (CSR)

Client-side rendering is the default render path for a React SPA. The server will serve a shell app that doesn’t contain any content. Once the browser downloads, parses, and executes included JavaScript sources, the HTML content is populated or rendered.

The routing function is handled by the client app by managing the browser history. This means that the same HTML file is served irrespective of which route was requested, and the client updates its view state after it is rendered.

Code splitting is relatively straightforward. You can split your code using dynamic imports or React.lazy such that only needed dependencies are loaded based on route or user actions.

If the page needs to fetch data from the server to render content—say, a blog title or product description—it can do so only when the relevant components are mounted and rendered.

The user will most likely see a “Loading data” sign or indicator while the website fetches additional data.

Client-side Rendering With Bootstrapped Data (CSRB)

Consider the same scenario as CSR but instead of fetching data after the DOM is rendered, let’s say that the server sent relevant data bootstrapped inside served HTML.

We could include a node that looks something like this:

<script id="data" type="application/json">
      {"title": "My blog title", "comments":["comment 1","comment 2"]}
</script>

And parse it when the component mounts:

var data = JSON.parse(document.getElementById('data').innerHTML);

We just saved ourselves a round trip to the server. We will see the tradeoffs in a bit.

Server-side Rendering to Static Content (SSRS)

Imagine a scenario where we need to generate HTML on the fly.

For example, if we are building an online calculator and the user issues a query of the sort /calculate/34+15 (leaving out URL escaping). We need to process the query, evaluate the result, and respond with generated HTML.

Our generated HTML is quite simple in structure and we don’t need React to manage and manipulate the DOM once the generated HTML is served.

So we are just serving HTML and CSS content. You can use the renderToStaticMarkup method to achieve this.

The routing will be entirely handled by the server as it needs to recompute HTML for each result, although CDN caching can be used to serve responses faster. CSS files can also be cached by the browser for faster subsequent page loads.

Server-side Rendering with Rehydration (SSRH)

Imagine the same scenario as the one described above but this time we need a fully functional React application on the client.

We are going to perform the first render on the server and send back HTML content along with the JavaScript files. React will rehydrate the server-rendered markup and the application will behave like a CSR application from this point on.

React provides built-in methods to perform these actions.

The first request is handled by the server and subsequent renders are handled by the client. Therefore, such apps are called universal React apps (rendered on both server and client). The code to handle routing may be split (or duplicated) on client and server.

Code splitting is also a bit tricky since ReactDOMServer does not support React. lazy, so you may have to use something like Loadable Components.

It should also be noted that ReactDOMServer only performs a shallow render. In other words, although the render method for your components will be invoked, the life-cycle methods like componentDidMount will not be called. So you will need to refactor your code to provide data to your components using an alternate method.

This is where frameworks like NextJS make an appearance. They mask the complexities associated with routing and code splitting in SSRH and provide a smoother developer experience.

This approach yields mixed results when it comes to page performance as we will notice in a bit.

Pre-rendering to Static Content (PRS)

What if we could render a webpage before a user requests it? This could be done at build time or dynamically when the data changes.

We can then cache the resulting HTML content on a CDN and serve it much faster when a user requests it.

This is called pre-rendering before we render the content, pre-user request. This approach can be used for blogs and e-commerce applications since their content typically does not depend on data supplied by the user.

Pre-rendering with Rehydration (PRH)

We may want our pre-rendered HTML to be a fully functional React app when a client renders it.

After the first request is served, the application will behave like a standard React app. This mode is similar to SSRH, described above, in terms of routing and code-splitting functions.

Performance Matrix

The moment you have been waiting for has finally arrived. It’s time for a showdown. Let’s look at how each of these rendering paths affects web performance metrics and determine the winner.

In this matrix, we assign a score to each rendering path based on how well it does in a performance metric.

The score ranges from 1 to 5:

  • 1 = Unsatisfactory
  • 2 = Poor
  • 3 = Moderate
  • 4 = Good
  • 5 = Excellent
  TTFB
Time to first byte
LCP
Largest contentful paint
TTI
Time to interactive
Bundle Size
Total
CSR 5
HTML can be cached on a CDN
1
Multiple trips to the server to fetch HTML and data
2
Data fetching + JS execution delays
2
All JS dependencies need to be loaded before render
10
CSRB 4
HTML can cached given it does not depend on request data
3
Data is loaded with application
3
JS must be fetched, parsed, and executed before interactive
2
All JS dependencies need to be loaded before render
12
SSRS 3
HTML is generated on each request and not cached
5
No JS payload or async operations
5
Page is interactive immediately after first paint
5
Contains only essential static content
18
SSRH 3
HTML is generated on each request and not cached
4
First render will be faster because the server rendered the first pass
2
Slower because JS needs to hydrate DOM after first HTML parse + paint
1
Rendered HTML + JS dependencies need to be downloaded
10
PRS 5
HTML is cached on a CDN
5
No JS payload or async operations
5
Page is interactive immediately after first paint
5
Contains only essential static content
20
PRH 5
HTML is cached on a CDN
4
First render will be faster because the server rendered the first pass
2
Slower because JS needs to hydrate DOM after first HTML parse + paint
1
Rendered HTML + JS dependencies need to be downloaded
12

Key Takeaways

Pre-rendering to static content (PRS) leads to highest-performing websites while server-side rendering with hydration (SSRH) or client-side rendering (CSR) may lead to underwhelming results.

It’s also possible to adopt multiple approaches for different parts of the website. For example, these performance metrics may be critical for public-facing webpages so they can be indexed more efficiently while they may matter less once a user has logged in and sees private account data.

Each render path represents tradeoffs in where and how you want to process your data. The important thing is that an engineering team is able to clearly see and discuss these tradeoffs and choose an architecture that maximizes the happiness of their users.

Further Reading and Considerations

While I tried to cover the currently popular techniques, this is not an exhaustive analysis. I highly recommend reading this article where developers from Google discuss other advanced techniques like streaming server rendering, trisomorphic rendering, and dynamic rendering (serving different responses to crawlers and users).

Some other factors you need to consider while building content-heavy websites include the need for a good content management system (CMS) for your authors, and the ability to easily generate/modify social media previews and optimize images for varying screen sizes.

Understanding the basics

Is React good for SEO?

React is a JavaScript framework developed to build interactive and modular UIs. SEO is not a design goal of React but content websites built with React can be optimized to achieve better indexing and ranking.

Why should I care about SEO optimizing my React app?

Not all React apps need to be SEO optimized. If you are building a content-heavy website on React—for example, a real estate listings website, an e-commerce website, or a blog—you will benefit by optimizing your website for better rankings.

What is the difference between a single page application (SPA) and a website?

A single page application serves an app shell (empty HTML), which is then populated or “rendered” by JavaScript. All subsequent navigations only fetch relevant views and data while the app shell stays the same. A conventional website serves meaningful HTML content, which is then made interactive by JavaScript. All subsequent navigations load an entirely new page.

Is server-side rendering better than client-side rendering?

It depends. For example, server-side rendering product pages for customer-facing e-commerce pages may lead to better rankings and conversion rates, although SSR may not be required for seller-facing product upload pages. Most large-scale websites would benefit from a hybrid approach.