Hacker News
by amelius 1119 days ago
Can't they pull the data from archive.org?
3 comments

Archive.org was knocked offline the other day due to some AI startup scraping it to death. It’s not a good thing.
Source? They don’t rate limit.
True - and their lack of rate limiting ended up letting someone overwhelm their servers, knocking them offline.
They put out a blog post afterwards asking people not to scrape. A simple Google search would be much faster than asking for sources.
Archive.org is a non-profit without the capacity to serve that many requests. An excellent resource for people to use carefully, but not a treasure trove for bots to scrape down to the last bit.
Would be cool if they introduced some reasonably priced access for mass scrapers. It should make some nice income in addition to donations, and be a valuable service to the community.
That would be worse.