In this article, we will focus on how to optimize JavaScript-powered websites for Google (although Bing also recommends the same solution: dynamic rendering).
The content of this article includes:
1. Client-side and server-side rendering
2. How Google crawls websites
3. How to detect client-side rendered content
4. The solutions: hybrid rendering and dynamic rendering
These JavaScript frameworks open up a wide range of possibilities in terms of client-side rendering (such as letting the page be rendered by the browser instead of the server), page load capabilities, dynamic content, user interaction, and extended functionality.
A lot of questions to answer. So where should an SEO start?
Below are key guidelines for optimizing JS-powered websites, so that you can use these frameworks while keeping the search engine bots happy.
Probably the most important piece of knowledge an SEO needs when dealing with JS-powered websites is the distinction between client-side and server-side rendering.
Understanding the differences, benefits, and disadvantages of both is critical to deploying the right SEO strategy and to not getting lost when speaking with software engineers (who, ultimately, are the ones in charge of implementing that strategy).
Let’s look at how Googlebot crawls and indexes pages, described as a very basic sequential process:
1. The client (web browser) places several requests to the server in order to download all the information needed to eventually display the page. Usually, the very first request concerns the static HTML document.
2. The CSS and JS files referred to by the HTML document are then downloaded: these are the styles, scripts, and services that contribute to generating the page.
3. The JavaScript is executed and the page is rendered (in Google’s case, by its Web Rendering Service, which runs headless Chrome).
4. Caffeine (Google’s indexer) indexes the content found.
5. New links are discovered within the content for further crawling.
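To see what the very first request actually receives, before any JavaScript has run, you can fetch the static HTML yourself. Below is a minimal TypeScript sketch (assuming Node 18+ for the global fetch; the URL is a placeholder):

```typescript
// Fetch the static HTML exactly as a first-wave crawl receives it,
// before any JavaScript has been executed.
const GOOGLEBOT_UA =
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";

async function fetchStaticHtml(url: string): Promise<string> {
  const response = await fetch(url, {
    headers: { "User-Agent": GOOGLEBOT_UA },
  });
  return response.text();
}

// If your main content is missing from this output, it is rendered client-side.
fetchStaticHtml("https://example.com/").then((html) =>
  console.log(html.slice(0, 500))
);
```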
That sequence is the theory, but in the real world Google doesn’t have infinite resources and has to prioritize what it crawls.
Google is a very smart search engine with very smart crawlers.
Even so, rendering JavaScript at scale is expensive, and the way Google crawls JS-powered websites is still far from perfect, with blind spots that SEOs and software engineers need to mitigate somehow.
This is, in a nutshell, how Google actually crawls these sites: the static HTML is crawled and indexed in a first wave, while the JavaScript-generated content is rendered in a second wave, once resources become available. As Tom Greenaway explained at Google I/O 2018, rendering of JavaScript-powered websites in Google Search is deferred until Googlebot has the resources available to process that content.
The implications for SEO are huge: your content may not be discovered until one, two, or even five weeks later, and in the meantime only your content-less page will be assessed and ranked by the algorithm.
What an SEO should be most worried about at this point is this simple equation:
No content is found = Content is (probably) hardly indexable
And how would a content-less page rank? Easy to guess for any SEO.
So far so good. The next step is finding out whether the content is rendered client-side or server-side (without having to ask the software engineers).
There are several ways to do this, and to explain them we need to introduce the concept of the DOM.
The Document Object Model defines the structure of an HTML (or an XML) document, and how such documents can be accessed and manipulated.
In SEO and software engineering, we usually refer to the DOM as the final HTML document rendered by the browser, as opposed to the original static HTML document that lives on the server.
You can think of the HTML as the trunk of a tree. You can add branches, leaves, flowers, and fruits to it (that is the DOM).
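To make the analogy concrete, here is a tiny browser-side snippet (TypeScript; the “app” element ID is hypothetical) that grows a branch on the trunk. The added paragraph exists only in the DOM, never in the static HTML:

```typescript
// Runs in the browser: the server ships an empty <div id="app"> in the
// static HTML, and this script adds content to the DOM only.
const app = document.getElementById("app");

if (app) {
  const paragraph = document.createElement("p");
  paragraph.textContent = "This text lives in the DOM, not in the static HTML.";
  app.appendChild(paragraph);
}
```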
In practice, you can check the static HTML by pressing “Ctrl+U” (View Source) on any page you are looking at, and the DOM by “inspecting” the page once it has fully loaded.
For most modern websites, you will see that the two documents are quite different.
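One way to automate this comparison is to fetch both documents and diff them. Here is a rough sketch using Puppeteer as the headless browser (my choice of tool, not something prescribed here; any headless browser works, and the URL is a placeholder):

```typescript
import puppeteer from "puppeteer";

// Compare the static HTML (what "Ctrl+U" shows) with the rendered DOM
// (what "Inspect" shows). A large gap suggests client-side rendering.
async function compareHtmlToDom(url: string): Promise<void> {
  // Static HTML: a plain HTTP request, no JavaScript executed.
  const staticHtml = await (await fetch(url)).text();

  // Rendered DOM: a headless browser loads the page and runs its scripts.
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "networkidle0" });
  const renderedDom = await page.content();
  await browser.close();

  console.log(`Static HTML:  ${staticHtml.length} characters`);
  console.log(`Rendered DOM: ${renderedDom.length} characters`);
}

compareHtmlToDom("https://example.com/");
```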
Another quick check is to browse the site with JavaScript switched off, for example using a dedicated browser profile with JS disabled in the content settings. Any URL you browse with this profile will not load any JS content, so any blank spot on the page identifies a piece of content that is rendered client-side.
Provided that your website is registered in Google Search Console (and I can’t think of any good reason why it wouldn’t be), use the “Fetch as Google” tool in the old version of the console. It returns a rendering of how Googlebot sees the page next to a rendering of how a normal user sees it. Are there many differences?
Google officially stated in early 2018 that it uses an older version of Chrome (specifically version 41, which anyone can still download) in headless mode to render websites. The main implication is that a page that doesn’t render well in that version of Chrome may suffer crawling-related problems.
After all these checks, still ask your software engineers: you don’t want to leave any loose ends.
Asking a software engineer to roll back a piece of great development work because it hurts SEO can be a difficult task.
Too often, SEOs are not involved in the development process, and they are called in only when the whole infrastructure is already in place.
We SEOs should all work on improving our relationship with software engineers and make them aware of the huge implications that any innovation can have on SEO.
This is how a problem like content-less pages can be avoided from the get-go. The solution resides in two approaches: hybrid rendering and dynamic rendering.
Hybrid rendering (also known as isomorphic or universal rendering) suggests the following: the first request for a page is rendered on the server, so users and bots alike receive the complete HTML content straight away; from then on, JavaScript running in the browser takes over and handles navigation and interaction client-side. A minimal sketch is shown below.
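This sketch uses Express purely for illustration; the approach doesn’t prescribe a stack, and the content lookup and client bundle are hypothetical placeholders:

```typescript
import express from "express";

const app = express();

// Hypothetical content lookup, standing in for a real data layer.
function getArticleHtml(slug: string): string {
  return `<h1>Article: ${slug}</h1><p>Full content rendered on the server.</p>`;
}

// First load: the server returns complete HTML, readable by users and bots
// alike, plus the client bundle that takes over for later interactions.
app.get("/articles/:slug", (req, res) => {
  res.send(`<!DOCTYPE html>
<html>
  <body>
    <div id="app">${getArticleHtml(req.params.slug)}</div>
    <script src="/client-bundle.js"></script>
  </body>
</html>`);
});

app.listen(3000);
```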
Dynamic rendering, on the other hand, aims to detect requests placed by a bot versus those placed by a user, and serve the page accordingly: bots receive a fully pre-rendered version of the page, while users get the normal client-side rendered experience. A sketch follows.
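Again assuming Express for illustration, a minimal version of this detection could look like the following; getPrerenderedHtml is a hypothetical stand-in for a real prerender service or snapshot cache:

```typescript
import express from "express";
import path from "path";

const app = express();

// Common crawler user-agent substrings; extend the list as needed.
const BOT_PATTERN = /googlebot|bingbot|yandex|baiduspider/i;

// Hypothetical helper standing in for a real prerender service (e.g. headless
// Chrome rendering pages ahead of time) or a cache of static snapshots.
async function getPrerenderedHtml(pagePath: string): Promise<string> {
  return `<html><body><h1>Pre-rendered content for ${pagePath}</h1></body></html>`;
}

app.get("*", async (req, res) => {
  const userAgent = req.headers["user-agent"] ?? "";

  if (BOT_PATTERN.test(userAgent)) {
    // Bots receive fully rendered HTML: content is indexable immediately.
    res.send(await getPrerenderedHtml(req.path));
  } else {
    // Regular users receive the normal client-side rendered app shell.
    res.sendFile(path.join(__dirname, "public", "index.html"));
  }
});

app.listen(3000);
```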
Combining the two solutions can also provide great benefits to both users and bots.
In short, the SEO issues raised by client-side rendering can be successfully tackled in different ways using hybrid rendering and dynamic rendering.