August 1st, 2024

How Google handles JavaScript throughout the indexing process

A study of Googlebot's interaction with JavaScript-heavy sites found that it successfully renders all HTML pages, debunking common myths about its JavaScript handling and highlighting its modern rendering capabilities alongside SEO optimization insights.

A common misconception holds that JavaScript-heavy sites are handled poorly by search engines. This research aimed to clarify that misconception by analyzing how Googlebot interacts with JavaScript-heavy sites. The study involved over 100,000 fetches from Googlebot, focusing on the rendering success of various pages on nextjs.org, which employs a mix of rendering techniques.

The findings revealed that Googlebot successfully rendered all HTML pages, including those with complex JavaScript interactions, and indexed content loaded asynchronously. The research also debunked several myths about Google's handling of JavaScript, including the beliefs that Google cannot render JavaScript content and that it treats JavaScript pages differently. The analysis showed that Googlebot's rendering capabilities have evolved significantly: it now uses an up-to-date version of Chrome, which allows it to process modern JavaScript features effectively.

The study also examined the impact of rendering queues and timing on SEO, finding that while some pages experienced delays, many were rendered quickly, challenging the notion of a long rendering queue. Additionally, the research indicated that JavaScript-heavy sites do not inherently suffer from slower page discovery. Overall, the study provides valuable insights into optimizing web applications for search engines, emphasizing the importance of understanding Google's current rendering capabilities and debunking outdated beliefs within the SEO community.
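For a sense of what that "mix of rendering techniques" looks like in practice, here is a minimal sketch, not taken from the study, contrasting the two extremes in a Next.js pages-router app. The file names and the /api/post endpoint are hypothetical; the point is only that the client-side-rendered page has no article content in its initial HTML, while the server-side-rendered page does.

```tsx
// pages/post-csr.tsx -- client-side rendering (CSR): the article content only
// exists after the browser (or Googlebot's renderer) runs this JavaScript.
import { useEffect, useState } from "react";

type Post = { title: string; body: string };

export default function PostCsr() {
  const [post, setPost] = useState<Post | null>(null);

  useEffect(() => {
    // Hypothetical API route; data is fetched asynchronously in the browser.
    fetch("/api/post")
      .then((res) => res.json())
      .then(setPost);
  }, []);

  if (!post) return <p>Loading…</p>;
  return (
    <article>
      <h1>{post.title}</h1>
      <p>{post.body}</p>
    </article>
  );
}
```

```tsx
// pages/post-ssr.tsx -- server-side rendering (SSR): the same content is
// already present in the HTML response, before any client-side JS runs.
import type { GetServerSideProps } from "next";

type Post = { title: string; body: string };

export const getServerSideProps: GetServerSideProps<{ post: Post }> = async () => {
  // Hypothetical data source; in practice this would come from a CMS or database.
  const post: Post = { title: "Hello", body: "Rendered on the server." };
  return { props: { post } };
};

export default function PostSsr({ post }: { post: Post }) {
  return (
    <article>
      <h1>{post.title}</h1>
      <p>{post.body}</p>
    </article>
  );
}
```

The study's finding is that Googlebot eventually renders and indexes both kinds of page; the open questions raised in the comments below are about how quickly that happens, at what scale, and how it affects ranking.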

Related

Waves of Writing for Google

The article explores the evolution of writing for Google, highlighting the shift from keyword stuffing to user-focused content and AI's impact on writing jobs. Writers are advised to adapt, focus on personal branding, and embrace technology to stay relevant.

Google has been lying about their search results [video]

A leak from Google's GitHub shows the search algorithm tracks user clicks and time on pages, raising concerns about search result accuracy, treatment of smaller websites, and SEO strategies.

The Cost of JavaScript

JavaScript significantly affects website performance due to download, execution, and parsing costs. Optimizing with strategies like code-splitting, minification, and caching is crucial for faster loading and interactivity, especially on mobile devices. Various techniques enhance JavaScript delivery and page responsiveness.

Google Now Defaults to Not Indexing Your Content

Google has changed its indexing to prioritize unique, authoritative, and recognizable content. This selective approach may exclude smaller players, making visibility harder. Content creators face challenges adapting to Google's exclusive indexing, affecting search results.

'Google says I'm a dead physicist': is the biggest search engine broken?

Google faces scrutiny over search result accuracy and reliability, with concerns about incorrect information and cluttered interface. Despite dominance in the search market, criticisms persist regarding data privacy and search quality.

AI: What people are saying
The comments reflect a range of opinions on Googlebot's handling of JavaScript-heavy sites and SEO practices.
  • Many commenters share personal experiences with JavaScript and SEO, noting challenges with indexing and crawl budgets, especially at scale.
  • There is skepticism about the article's conclusions, with some arguing that while Google can render JS, it may not rank those pages effectively.
  • Concerns are raised about the impact of bloated JavaScript on SEO and crawl efficiency, advocating for server-side rendering or prerendered HTML.
  • Some commenters express a desire for Google to be more transparent about its SEO policies to reduce confusion and improve practices.
  • There is a call for better practices in web development, emphasizing the need for accessibility and efficient content delivery.
21 comments
By @palmfacehn - 9 months
The rich snippet inspection tool will give you an idea of how Googlebot renders JS.

Although they will happily crawl and render JS-heavy content, I strongly suspect bloat negatively impacts the "crawl budget", though in 2024 this part of the metric probably matters much less than overall request latency. If Googlebot can process several orders of magnitude more sanely built pages with the same memory requirement as a single React page, it isn't unreasonable to assume they would economize.

Another consideration is that, "properly" used, a JS-heavy page would most likely be an application of some kind living on a single URL, whereas purely informative pages, such as blog articles or tables of data, would exist across a larger number of URLs. Of course there are always exceptions.

Overall, bloated pages are a bad practice. If you can produce your content as classic "prerendered" HTML and use JS only for interactive content, both bots and users will appreciate you (a minimal sketch of that split follows this comment).

HN has already debated the merits of React and other frameworks. Let's not rehash this classic.
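As a rough illustration of that split (not palmfacehn's code; the page, its content, and the CopyLinkButton widget are hypothetical), a statically prerendered Next.js page can ship all of its content in the initial HTML and reserve JavaScript for a single interactive widget:

```tsx
// pages/article.tsx -- the article body is baked into static HTML at build
// time; the only client-side JavaScript that matters is the copy-link button.
import { useState } from "react";
import type { GetStaticProps } from "next";

type Props = { title: string; html: string };

export const getStaticProps: GetStaticProps<Props> = async () => {
  // Illustrative placeholder; real content would come from markdown, a CMS, etc.
  return {
    props: {
      title: "Classic prerendered article",
      html: "<p>All of the article text ships in the initial HTML response.</p>",
    },
  };
};

// Small interactive island: bots can ignore it, users get the behaviour.
function CopyLinkButton() {
  const [copied, setCopied] = useState(false);
  return (
    <button
      onClick={() => {
        navigator.clipboard.writeText(window.location.href);
        setCopied(true);
      }}
    >
      {copied ? "Link copied" : "Copy link"}
    </button>
  );
}

export default function Article({ title, html }: Props) {
  return (
    <article>
      <h1>{title}</h1>
      {/* Prerendered content: readable without running any client-side JS. */}
      <div dangerouslySetInnerHTML={{ __html: html }} />
      <CopyLinkButton />
    </article>
  );
}
```

Whether this meaningfully preserves crawl budget is the commenter's conjecture rather than something the study measured.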

By @dlevine - 9 months
I work for a company that enables businesses to drop eCommerce into their websites. When I started, this was done via a script that embedded an iFrame. This wasn't great for SEO, and some competitors started popping up with SEO-optimized products.

Since our core technology is a React app, I realized that we could just mount the React app directly on any path at the customer's domain. I won't get into the exact implementation, but it worked, and our customers' product pages started being indexed just fine. We even ranked competitively with the upstarts who used server-side rendering. We had a prototype in a few months, and then a few months after that we had the version that scaled to 100s of customers.

We then decided to build a new version of our product on Remix (an SSR framework similar to Next.js). It required us to basically start over from scratch, since most of our technologies weren't compatible with Remix. Two years later, we still aren't quite done. When all is said and done, I'm really curious to see how this new product performs in SEO compared to the existing one.

By @jxi - 9 months
I actually worked on this part of the Google Search infrastructure a long time ago. It's just JSC with a bunch of customizations and heuristics tuned for performance to run at a gigantic scale. There's a lot of heuristics to penalize bad sites, and I spent a ton of time debugging engine crashes on ridiculous sites.
By @orenlindsey - 9 months
I really think it would be cool if Google started being more open about their SEO policies. Projects like this use 100,000 sites to try to discover what Google does, when Google could just come right out and say it, and it would save everyone a lot of time and energy.

The same outcome is gonna happen either way, Google will say what their policy is, or people will spend time and bandwidth figuring out their policy. Either way, Google's policy becomes public.

Google could even come out and publish stuff about how to have good SEO, and end all those scammy SEO help sites. Even better, they could actively try to promote good things like less JS when possible and less ads and junk. It would help their brand image and make things better for end users. Win-win.

By @encoderer - 9 months
I did experiments like this in 2018 when I worked at Zillow. This tracks with our findings then, with a big caveat: it gets weird at scale. If you have a very large number of pages (hundreds of thousands or millions) Google doesn’t just give you limitless crawl and indexing. We had js content waiting days after scraping to make it to the index.

Also, competition. In a highly competitive SEO environment like US real estate, we were constantly competing with 3 or 4 other well-funded and motivated companies. A couple of times we tried going dynamic-first with a page and we lost rankings. Maybe it's because FCP was later? I don't know, because we ripped it all out and did it server-side. We did use Next.js when rebuilding Trulia, but it's self-hosted and only uses SSR.

By @dheera - 9 months
I actually think intentionally downranking sites that require JavaScript to render static content is not a bad idea. Requiring JavaScript also impedes accessibility-related plugins trying to extract the content and present it to the user in whatever way is compatible with their needs.

Please only use JavaScript for dynamic stuff.

By @rvnx - 9 months
Strange article; it seems to imply that Google has no problem indexing JS-rendered pages, and then the final conclusion is "Client-Side Rendering (CSR), support: Poor / Problematic / Slow".
By @ea016 - 9 months
A really great article. However, they tested on nextjs.org only, so it's still possible Google doesn't waste rendering resources on smaller domains.
By @globalise83 - 9 months
Would be interested to know how well Google copes with web components, especially those using Shadow DOM to encapsulate styles. Anyone have an insight there?
By @orliesaurus - 9 months
If Google handles front-end JS so well, and the world is basically a customer-centric SEO game to make money, why do we even bother using server-side components in Next.js?
By @bbarnett - 9 months
I kinda wish Google would not index JS rendered stuff. The world would be so much better.
By @EcommerceFlow - 9 months
They tested Google's ability to index and render JS, but not how well those sites ranked. I know as an SEO those results would look completely different. When you're creating content to monetize, the thought process is "why risk it?" with JS.
By @TZubiri - 9 months
It's not a coincidence Google developed Chrome. They needed to understand what the fuck they were looking at, so they were developing a JS + DOM parser anyways.
By @azangru - 9 months
Here's Google search team talking about this in a podcast: https://search-off-the-record.libsyn.com/rendering-javascrip...
By @llmblockchain - 9 months
This blog post is akin to Philip Morris telling us smoking isn't bad. Go on, try it.
By @toastal - 9 months
Just a reminder: there are alternative search engines, & most of those don't render JavaScript & should still be a target
By @sylware - 9 months
The new bots are headless Blink: headless Blink with mouse and keyboard, driven by an AI (I don't think Google has click farms with real humans the way hackers do).
By @kawakamimoeki - 9 months
I have noticed that more and more blog posts from yesteryear are not appearing in Google's search results lately. Is there an imbalance between content ratings and website ratings?
By @DataDaemon - 9 months
This is a great self-promotion article, but everyone knows Googlebot is busy; give it immediate content generated on the server, or don't bother Googlebot at all.