Google Clarifies That Googlebot Crawls and Indexes the First 15 MB of HTML Content Per Page
Google is notorious for rolling out updates that can create a lot of worry for marketers. These updates arrive suddenly and affect each website differently. For some sites an update is a reward, while others have to work for a long time to recover their rankings and traffic.
So, when news of the 15 MB Googlebot limit arrived, there was indeed some panic. Many were confused and wondered whether it was a new Google update.
However, the truth is something else. Recently, the search engine added a line to its documentation stating that Googlebot crawls the first 15 MB of a raw HTML file and stops crawling after that.
Naturally, some SEOs began to panic, thinking that the limit was too low. In fact, the limit applies only to the HTML source code of a URL; it does not include videos, images, and other resources that are downloaded separately. For a raw HTML file, 15 MB is a very generous limit.
To calm things down, Google’s Gary Illyes published a blog post titled Googlebot And The 15 MB Thing. In it, he explained that the threshold is not new. It has been around for years; Google is simply stating it explicitly now.
Moreover, there is nothing to panic about, because very few pages on the World Wide Web are bigger than that. If an HTML file does exceed 15 MB, you can move some inline scripts and CSS to external files to reduce its size.
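To see how much of a page's HTML could be moved out this way, you can tally the bytes that sit inside inline script and style blocks. The following is a minimal sketch using Python's standard html.parser module; the class name InlineAssetCounter and the function inline_asset_bytes are hypothetical names chosen for illustration, and the counts are an approximation rather than anything Googlebot itself reports.

```python
from html.parser import HTMLParser


class InlineAssetCounter(HTMLParser):
    """Tally the UTF-8 byte size of inline <script> and <style> bodies."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.inline_bytes = {"script": 0, "style": 0}
        self._current = None  # tag whose inline body we are currently inside

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self._current = "style"
        elif tag == "script" and "src" not in dict(attrs):
            # A script with a src attribute is already an external file.
            self._current = "script"

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.inline_bytes[self._current] += len(data.encode("utf-8"))


def inline_asset_bytes(html_text):
    """Return a dict with the inline script and style byte counts."""
    counter = InlineAssetCounter()
    counter.feed(html_text)
    counter.close()
    return counter.inline_bytes


if __name__ == "__main__":
    sample = (
        "<html><head><style>body{color:red}</style>"
        '<script src="a.js"></script><script>var x=1;</script>'
        "</head><body>Hi</body></html>"
    )
    print(inline_asset_bytes(sample))
```

If the inline totals make up a large share of an oversized page, those blocks are the natural candidates for external files.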
Thus, it is now clear that only the first 15 MB of a raw HTML file gets forwarded for indexing. To find the size of your page, follow these steps:
- Load the page and then launch Developer tools.
- Switch to the Network tab.
- Reload the page.
- You should start seeing the requests your browser makes to load the page.
- The top request is the one to look at; its Size column shows the byte size of the page.
- Alternatively, you can use cURL from a command line.
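The size check described in the steps above can also be scripted. Below is a minimal sketch in Python that compares the byte size of a page's raw HTML with the 15 MB threshold. It assumes that "15 MB" means 15 * 1024 * 1024 bytes, which the article does not specify; the names GOOGLEBOT_HTML_LIMIT, html_size_report, and fetch_raw_html are hypothetical and used only for illustration.

```python
import urllib.request

# Assumption: 15 MB is read as 15 * 1024 * 1024 bytes.
GOOGLEBOT_HTML_LIMIT = 15 * 1024 * 1024


def html_size_report(raw_html, limit=GOOGLEBOT_HTML_LIMIT):
    """Summarize how the size of raw_html (bytes) compares with the limit."""
    size = len(raw_html)
    return {
        "size_bytes": size,
        "percent_of_limit": round(100 * size / limit, 2),
        # Bytes that would fall after the cut-off point, if any.
        "bytes_past_limit": max(0, size - limit),
    }


def fetch_raw_html(url, timeout=10):
    """Download the response body for url (makes a network request)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Offline demonstration with a small in-memory document.
    sample = b"<html><body>" + b"x" * 1000 + b"</body></html>"
    print(html_size_report(sample))
```

To check a live page, pass the result of fetch_raw_html to html_size_report. The figure may differ slightly from what the browser's Size column shows, since the browser can report the compressed transfer size.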
Google’s John Mueller also weighed in on this topic and cleared up the confusion with a series of tweets. He listed a number of novels along with the size in KB of their HTML files, showing how many of them it would take to fill the 15 MB limit of a single page.
Honestly, having that much content on a single page is next to impossible. Here are some of John Mueller's tweets:
The Strange Case of Dr. Jekyll and Mr. Hyde by Robert Louis Stevenson, and on top (or bottom?) of all that:
War and Peace by graf Leo Tolstoy. Now, add the content you want to rank for. pic.twitter.com/2dP6otIV9I
— John 🧀 … 🧀 (@JohnMu) June 28, 2022
You can check the size of any page on the internet by going there, and looking at the developer tools in your browser. Or you can use a cool tool like https://t.co/CLRJkz732J which gives you the full size in a nice UI.
— John 🧀 … 🧀 (@JohnMu) June 28, 2022
In conclusion, there is no need to worry about Google’s 15 MB limit, just as there was none before the documentation update.