Hacker News
by lionelholt 701 days ago
The decision to block bots is not always about protecting intellectual property. A practical consideration I haven't seen mentioned is that some of these AI bots are stupidly aggressive with their requests, even ignoring robots.txt. I had to activate Cloudflare WAF and block a variety of bots to prevent my web app servers from crashing. At least they're reasonable enough to identify themselves!
2 comments

Those aggressive bot....

Crawls each and every date link on my Synology DSM-hosted calendar without throttling.

Yeah, we had a bunch of them crawling our git repositories very aggressively, repeating the crawl within a few days, and so on. 403 to the lot of them, regardless of the bot's purpose.
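
For anyone who wants the "403 to the lot of them" approach without a WAF in front, here is a rough sketch as a WSGI middleware. It only works because these bots identify themselves in the User-Agent header. The token list, `is_blocked`, and `block_bots` are my own illustrative names and assumptions, not anything the posters above described; fill in whatever agents show up in your own logs.

```python
# Illustrative blocklist (an assumption): substrings of the User-Agent
# strings that self-identifying crawlers send. Adjust to your own logs.
BLOCKED_AGENT_TOKENS = ("GPTBot", "CCBot", "Amazonbot")


def is_blocked(user_agent):
    """Return True if the User-Agent contains any blocked token (case-insensitive)."""
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in BLOCKED_AGENT_TOKENS)


def block_bots(app):
    """Wrap a WSGI app so blocked crawlers get a 403 before reaching it."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT")):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everyone else is passed through to the real application untouched.
        return app(environ, start_response)
    return middleware
```

Wrap your existing app with `block_bots(app)` and serve it as usual. This does nothing against crawlers that spoof a browser User-Agent, and the request still reaches your server, so it may not save you if the sheer request volume is the problem.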
What's the deal with (Amazon) bots crawling private GitLab instances aggressively?