by rmunn
59 days ago
Preventing scrapers from grabbing every single commit and feeding it to an AI as training data. (Instead of just, you know, cloning the repo and feeding the AI from the clone.) Is this really the first time you've encountered the Anubis anti-scraper system? It's been everywhere the past few years, because so many of those scraper bots are incredibly lazily programmed. Many discussions here on HN have included people commenting on scraper bots hitting every single commit page, diff page, etc. on their self-hosted forges, burning up lots of CPU time and bandwidth to serve them what they could have gotten by cloning the repo, if the bots had been programmed slightly more intelligently.

LLM training has really been a drain on the general internet :(