by captainmuon
387 days ago
As somebody who does some scraping / crawling for legitimate uses, I'm really unhappy with this development. I understand people have valid reasons why they don't want their content scraped. Maybe they want to sell it - I can understand that, although I don't like it. Maybe they are opposed to it for fundamental reasons. I for one would like my content to be spread maximally. I want my arguments to be incorporated into AIs, so I can reach more people. But of course that is just me, and only for certain kinds of content I'd write; others have different goals.

It gets annoying when you have the right to scrape something - either because the owner of the data gave you the OK or because it is openly licensed - but then the webmaster can't be bothered to relax the rate limiter for you, and nobody can give you a nice API. Now people are putting their Open Educational Resources, their open source software, even their freaking essays about openness that they want the world to read behind Anubis. It makes me shake my head.

I understand perfectly that it is annoying when badly written bots hammer your site. But then maybe HTTP and those bots are the problem. Maybe we should make it easier for site owners to push their content somewhere where we can scrape it more easily?