|
|
|
|
|
by mrweasel
424 days ago
|
|
> I am kind of surprised how many sites seem to want/need this. The AI scrapers are not only poorly written, they also go out of their way to do cache busting. So far I've seen a few solutions, CloudFlare, require a login, Anubis, or just insane amounts of infrastructure. Some site have reported 60% of their traffic coming from bots not, smaller sites is probably much higher. |
|
My guess is that these tools tend to be targeted at mid-sized sites — the sorts of places that are large enough to have useful content, but small enough that there probably won't be any significant repercussions, and where the ops team is small enough (or plain nonexistent) that there's not going to be much in the way of blocks. That's why a site like SourceHut gets hit quite badly, but smaller blogs stay largely out of the way.
But that's just a working theory without much evidence trying to justify why I'm hearing so many people talking about struggling with AI bot traffic and not seeing it myself.