Technical SEO For Your Crawl Budget
When optimising big websites (over 100,000 URLs), technical SEO becomes the most important component of your search strategy. Ignoring the ‘under-the-hood’ workings of a website, and how they affect search engines’ ability to discover, index and rank pages, will result in high ‘technical debt’ that takes a lot of time and effort to pay off.
Getting a forensic SEO audit done on your website will bring to light and prioritise any issues that should be fixed and mitigate any risks related to black-hat SEO. It’s a worthwhile investment.
One action item that will doubtless come up in an audit is that of crawl budget.
Understanding Crawl Budget
Simply put, crawl budget is the number of times a search engine spider crawls your website in a given time frame. For example, Googlebot typically crawls one of our client’s sites 117,240 times a month, so that site’s monthly crawl budget is 117,240 pages. That’s a decent budget, but it needs to be managed correctly.
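One way to sanity-check a figure like this alongside Search Console is to count Googlebot requests in your own server logs. The sketch below is a minimal illustration, assuming access logs in the common Apache/Nginx "combined" format. The sample log lines, IP addresses and the function name googlebot_hits_per_month are made up for the example. It matches on the user-agent string only, which can be spoofed, so treat the result as an estimate rather than an exact crawl budget.

```python
import re
from collections import Counter

# Matches the timestamp of a combined-log-format line,
# e.g. [12/Mar/2018:06:25:24 +0000]
TIMESTAMP = re.compile(r"\[(\d{2})/(\w{3})/(\d{4}):")


def googlebot_hits_per_month(log_lines):
    """Count requests that mention Googlebot, grouped by (year, month)."""
    hits = Counter()
    for line in log_lines:
        # Assumption: a plain substring check on the user agent is good enough here.
        if "Googlebot" not in line:
            continue
        match = TIMESTAMP.search(line)
        if match:
            _day, month, year = match.groups()
            hits[(year, month)] += 1
    return hits


# Hypothetical sample data, not taken from a real site.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_log = [
    f'66.249.66.1 - - [12/Mar/2018:06:25:24 +0000] "GET /products/1 HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.7 - - [12/Mar/2018:06:26:01 +0000] "GET /products/1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    f'66.249.66.1 - - [15/Mar/2018:09:10:00 +0000] "GET /products/2 HTTP/1.1" 200 4096 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [02/Apr/2018:11:02:10 +0000] "GET /search?q=shoes HTTP/1.1" 200 2048 "-" "{GOOGLEBOT_UA}"',
]

hits = googlebot_hits_per_month(sample_log)
print(dict(hits))  # {('2018', 'Mar'): 2, ('2018', 'Apr'): 1}
```

Run against a full month of real logs, the count for that month gives a rough equivalent of the monthly figure quoted above.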
Each website’s crawl budget is different based on two factors:
1. Crawl Rate Limit
“Googlebot is designed to be a good citizen of the web. Crawling is its main priority, while making sure it doesn’t degrade the experience of users visiting a site. Google calls this the ‘crawl rate limit’ which limits the maximum fetching rate for a given site.” – Google
The crawl rate can increase (good) or decrease (bad) based on a couple of factors:
• Site speed: if Googlebot can download (or crawl) a site quickly the crawl rate limit increases, which is good. If a site is slow or responds with server errors the crawl rate limit decreases, which is bad.
• Limit set in Search Console: website owners can reduce Googlebot’s crawling of their site. Note that setting higher limits doesn’t automatically increase crawling.
2. Crawl Demand
“Even if the crawl rate limit isn’t reached, if there’s no demand from indexing, there will be low activity from Googlebot.” – Google
The three factors that play a significant role in determining crawl demand are:
• Popularity: URLs that are more popular on the Internet tend to be crawled more often to keep them fresher in Google’s index
• Staleness: Google attempts to prevent URLs from becoming stale in the index
• Site-wide events like site migrations may trigger an increase in crawl demand to re-index the content under the new URLs.
3 Tips for Optimising Crawl Budget
I use Screaming Frog SEO Spider to simulate how Googlebot crawls a website and Google’s Search Console to help identify issues affecting crawl budget.
1. Investigate Page Load Speed
The first thing to do is check the arch nemesis of crawl rate limit: page load speed!
The above screenshot is from Google’s Search Console (formerly Webmaster Tools) under Crawl Stats -> Time spent downloading a page. The sad face is pointing to an increase in the time it took Googlebot to crawl the site, which is waaaaay over the site average and is eating up crawl budget.
We need to find out which pages could be at fault here. We can check this within Google Analytics under Behavior -> Site Speed -> Page Timings:
The page with an 83.50% slower page load time could be one of the culprits, so we need to look at ways to reduce the load time for that page and pages like it. This will result in a more efficient crawl.
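If you export page timings to work on them outside Google Analytics, the same "slower than site average" comparison is easy to reproduce. The following is a rough sketch under stated assumptions: the URLs and load times are invented, the function name pages_slower_than_average and the 50% threshold are arbitrary choices for illustration, and the site average is a simple unweighted mean, which may not match how Google Analytics calculates its own average.

```python
def pages_slower_than_average(load_times, threshold_pct=50.0):
    """Return (url, pct_slower) pairs for pages at least threshold_pct
    slower than the unweighted site average, slowest first."""
    site_average = sum(load_times.values()) / len(load_times)
    slow_pages = []
    for url, seconds in load_times.items():
        # Positive values mean the page is slower than the site average.
        pct_diff = (seconds - site_average) / site_average * 100
        if pct_diff >= threshold_pct:
            slow_pages.append((url, round(pct_diff, 2)))
    return sorted(slow_pages, key=lambda item: item[1], reverse=True)


# Hypothetical average page load times in seconds.
load_times = {
    "/": 2.0,
    "/category/shoes": 3.0,
    "/search": 7.0,
}

culprits = pages_slower_than_average(load_times)
print(culprits)  # [('/search', 75.0)]
```

Here the site average is 4.0 seconds, so /search at 7.0 seconds is 75% slower and is the only page flagged for attention.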
2. Manage URL Parameters
URL parameters are a common problem on ecommerce and listing sites, which generate lots of dynamic URLs that load the same content (filters, search results, etc.). By default, Googlebot treats these URLs as separate pages, which increases the risk of on-site duplication and can waste crawl budget. Disallowing URL parameters and search results from being crawled through the robots.txt file is recommended.
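As an illustration of what such rules might look like, here is a small sketch that holds a robots.txt file as a string and checks a few URLs against it with Python's standard urllib.robotparser module. The /search and /filter paths and the example.com URLs are hypothetical, so substitute the paths your own site uses. Googlebot also understands * wildcards in rules, which this standard-library parser does not evaluate, so the sketch sticks to simple prefix rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block internal search results and filter pages.
# These are prefix matches, so "/search" also blocks "/search?q=shoes".
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /filter
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the rules above allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://example.com/products/red-shoes"))  # True
print(is_crawlable("https://example.com/search?q=shoes"))      # False
print(is_crawlable("https://example.com/filter?colour=red"))   # False
```

Testing rules this way before deploying them helps avoid accidentally blocking pages you do want crawled.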
Some websites have URL parameters that do not influence the content of the pages. In this case, make sure you let Googlebot know by adding these parameters in your Google Search Console account, under Crawl -> URL Parameters. This is an advanced feature within Search Console, and an SEO audit will tell you if this step is necessary for your site.
3. Fix HTTP and Redirect Errors
If your website has fewer than 100,000 URLs, you really don’t need to worry about crawl budget at all, and doing so would be a waste of your time and resources. However, if you have a big site or a site that auto-generates pages based on URL parameters (like listing sites or ecommerce), then auditing and prioritising your site’s crawl budget is a must.