Hacker News
by dpark 52 days ago
What the hell. Kernel.org is mining in my browser to “make sure I’m not a bot”?

It's there to prevent scrapers from grabbing every single commit and feeding it to an AI as training data. (Instead of just, you know, cloning the repo and using the clone to feed the AI.)

Is this really the first time you've encountered the Anubis anti-scraper system? It's been everywhere the past few years, because so many of those scraper bots are incredibly lazily programmed. Many discussions here on HN have included people commenting on scraper bots hitting every single commit page, diff page, etc. on their self-hosted forges, burning up lots of CPU time and bandwidth to serve them what they could have just gotten by cloning the repo if the bot's programming were slightly more intelligent.

This is the first time I have encountered this, yes.

LLM training has really been a drain on the general internet :(

It's been showing up on just about any code forge that isn't GitHub, so odds are good your browser has encountered it but you didn't notice the Anubis screen until now (it often goes by so quickly you'll miss it if you blink, though that depends on the site).
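
For anyone wondering what the "mining" in the browser amounts to, here is a rough sketch of a generic hash-based proof-of-work check of the kind discussed above. It is not Anubis's actual code: the challenge string, the difficulty value, and the function names `solve_challenge` and `verify` are all made up for illustration, and the real system runs its solver in the browser rather than in Python. The idea is only that the visitor has to burn some CPU finding a nonce, while the server can check the answer with a single hash.

```python
import hashlib
from itertools import count


def solve_challenge(challenge: str, difficulty: int) -> tuple[int, str]:
    """Client side: brute-force the smallest nonce whose SHA-256 digest of
    challenge + nonce starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check the submitted nonce."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    # Hypothetical challenge and difficulty, chosen only for the demo.
    nonce, digest = solve_challenge("example-challenge", 4)
    print(nonce, digest)
    print(verify("example-challenge", nonce, 4))
```

Each extra zero hex digit multiplies the expected work by 16, so the cost is barely noticeable for one person loading one page but adds up quickly for a bot requesting every commit and diff page.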