by superkuh
689 days ago
I've noticed Anthropic bots in my logs for more than a year now and I welcome them. I'd love for their LLM to be better at what I'm interested in. I run my website off my home connection on a desktop computer and I've never had a problem. I'm not saying my dozens of run-ins with the Anthropic bots (there have been 3 variations I've seen so far) are totally representative, but they've been respecting my robots.txt. They even respect extended robots.txt features like:
  User-agent: *
  Disallow: /library/*.pdf$
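For anyone unfamiliar with the extended syntax: `*` matches any run of characters and `$` anchors the rule to the end of the URL path (both are described in RFC 9309). Here is a rough Python sketch of how a crawler might evaluate a rule like the one above. The function names are made up for illustration, and this is not how any particular bot actually implements it.

```python
import re

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex (illustrative helper)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' in place of '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # A trailing '$' in the rule means the path must end here.
    return re.compile(body + (r"\Z" if anchored else ""))

def is_disallowed(path: str, disallow_pattern: str) -> bool:
    """Return True if the path matches the Disallow pattern (illustrative helper)."""
    # Rules are matched from the start of the path, so use match(), not search().
    return rule_to_regex(disallow_pattern).match(path) is not None

print(is_disallowed("/library/paper.pdf", "/library/*.pdf$"))             # True
print(is_disallowed("/library/paper.pdf?download=1", "/library/*.pdf$"))  # False
print(is_disallowed("/library/index.html", "/library/*.pdf$"))            # False
```

As far as I know, Python's built-in `urllib.robotparser` does plain prefix matching and does not interpret `*` or `$` inside paths, so something like the above is needed if you want to test such rules yourself.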
I make my websites for other people to see. They are not secrets I hoard whose value goes away when copied. The more copies and derivations the better.
I guess ideas like Creative Commons and sharing go away when the smell of money enters the water. Better lock all your text behind paywalls so the evil corporations won't get it. Just be aware: for every incorporated entity you block, you're blocking just as many humans with false positives, if not more. This anti-"scraping" hysteria is mostly profit motivated.
|
That seems overly reductive.
First, it sounds like you're insinuating that the people claiming the bots are causing actual disruption to their operations are lying. If that's your intent, some amount of evidence for that would be welcome.
Second, lots of people don't want their content to be used to train these models for reasons that have nothing whatsoever to do with money. Trying to avoid contributing to the training of these models is not the equivalent of rejecting the idea of the free exchange of information.