by luckylion
1913 days ago
I usually recommend setting only Google/Bing/Yandex/Baidu etc. to Allow and everything else to Disallow. Yes, the bad bots don't give a fuck, but even the non-malicious bots (ahrefs, moz, some university's search engine, etc.) don't bring any value to me as a site owner; they take up bandwidth and resources and fill up logs. If you can remove them with three lines in your robots.txt, that's less noise. Universities especially, in my opinion, often behave badly and are uncooperative when you point out that their throttling does not work and they're hammering your server. Giving them a "Go Away, You Are Not Wanted Here" in a robots.txt works for most, and the rest just get blocked.
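For illustration, here is a rough sketch of what such an allowlist robots.txt could look like, checked with Python's standard `urllib.robotparser`. The user-agent tokens (`Googlebot`, `bingbot`, `YandexBot`, `Baiduspider`) and the helper name `is_allowed` are assumptions for the example, not something from the comment above; check each crawler's own documentation for the token it actually honors.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist: a handful of named crawlers may fetch everything,
# every other well-behaved bot is asked to stay away.
ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: bingbot
User-agent: YandexBot
User-agent: Baiduspider
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str = "/") -> bool:
    """Return True if ROBOTS_TXT permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for bot in ("Googlebot", "bingbot", "AhrefsBot", "SomeUniversityCrawler"):
        print(bot, "allowed" if is_allowed(bot) else "disallowed")
```

This only affects crawlers that choose to read and obey robots.txt; anything that ignores it still has to be blocked some other way.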
Why can't you just rate-limit IPs that are "too active" for your server to handle?