Too many (il)legitimate bots on my website (and high load)

Adjusting the crawl rate for Google and other bots won't affect your ranking, only how long it takes them to index your entire website. If you have 1,000 pages on your website, a search engine could potentially index your entire site in a few minutes.

However, this can cause high system resource usage, with all of those pages being loaded in a short period of time.

Reducing crawling (general)

A crawl-delay of 30 seconds would still allow crawlers to index your entire 1,000-page website in just 8.3 hours.
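
The math works out as: 1,000 pages × 30 seconds = 30,000 seconds ≈ 8.3 hours.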

While some bots won't honor it, many others will, so it's probably a good idea to put one of the following at the top of your robots.txt:

User-agent: * 
Crawl-delay: 10

or

User-agent: * 
Crawl-delay: 30

In general, this also helps a lot with less friendly bots!

Googlebot

Googlebot is one of the few exceptions that does not honor this directive...

So for Googlebot, we highly recommend going to Google Webmaster Tools -> Site Settings -> Limit Google's maximum crawl rate -> 30 seconds.

UPDATE: This method no longer works; Google claims to have improved its crawling algorithm. More information about filing a crawl-rate reduction request, if still needed, can be found here: https://developers.google.com/search/docs/crawling-indexing/reduce-crawl-rate

Amazonbot

Lately we have been seeing extremely excessive crawling from Amazonbot.

Unfortunately, Amazonbot does not support the crawl-delay directive in robots.txt either, but since this bot seems to have no real use for your site (see: https://developer.amazon.com/support/amazonbot), we recommend blocking it completely:

User-agent: Amazonbot 
Disallow: / 
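
Putting it all together, a combined robots.txt covering both recommendations above could look like this (adjust the Crawl-delay value to whatever your server can comfortably handle):

User-agent: Amazonbot
Disallow: /

User-agent: *
Crawl-delay: 30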

