by superkuh 695 days ago
>That seems overly reductive.

I qualified my statement but you've chosen to ignore that. I've been paying close attention to the Anthropic bots for a (relatively) long time, and this Mastodon group's problems come as a surprise to me based on that lived experience. I don't doubt the truth of their claims. I looked at https://cdn.fosstodon.org/media_attachments/files/112/877/47... and I see the bandwidth used. But like I said,

>I'm not saying my dozens of run-ins with the anthropic bots (there have been 3 variations I've seen so far) are totally representative,

My take here is that their one limited experience also isn't representative, and others are projecting it onto the entire project due to a shifting cultural perception that "scraping" is something weird and bad to be stopped. But it's not. If it were me I'd be checking my webserver config and logs to be sure robots.txt is actually being violated. And I'd check the per-user-agent bandwidth limits I set in nginx to make sure they actually match the bot's user-agent. That'd solve it. I'm sure the Mastodon software has better solutions, even if they haven't solved their own DDoS-generating problem since 2017 (ref: https://github.com/mastodon/mastodon/issues/4486).
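
For what it's worth, here's a rough sketch of how I'd check the "is robots.txt actually being violated" part, using Python's standard `urllib.robotparser`. The bot name `ExampleBot`, the rules, and the `find_violations` helper are all made up for illustration; you'd feed it your real robots.txt and (bot token, path) pairs pulled from your own access logs.

```python
from urllib.robotparser import RobotFileParser


def find_violations(robots_txt, requests):
    """Return the (bot_token, path) pairs that robots.txt disallows.

    robots_txt: contents of the site's robots.txt as a string.
    requests:   iterable of (bot_token, path) pairs taken from access logs.
                bot_token is assumed to be the short product token
                (e.g. "ExampleBot"), not the full User-Agent header.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        (bot, path)
        for bot, path in requests
        if not parser.can_fetch(bot, path)
    ]


if __name__ == "__main__":
    # Hypothetical rules and log entries, for illustration only.
    rules = "User-agent: ExampleBot\nDisallow: /private/\n"
    logged = [
        ("ExampleBot", "/private/data"),
        ("ExampleBot", "/public/page"),
        ("OtherBot", "/private/data"),
    ]
    for bot, path in find_violations(rules, logged):
        print(f"{bot} fetched disallowed path {path}")
```

If that comes back empty, the bot is staying inside the rules as written and the problem is in the rules or the rate limits, not in the crawler ignoring them.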

1 comment

> their one limited experience

Their report is just the latest in a string from different entities. It's not just one limited experience.

Ah, thanks. I could only read the main/single Mastodon post properly, since Mastodon v4 is JavaScript-only and I was reading the meta content field in the HTML source. That field does not show replies or linked posts.