Hacker News
by wlonkly 82 days ago
ArchiveTeam (which is not the Internet Archive) crawls websites aggressively because they care a lot and because the website in question is about to go away.

Heck, I'd say as caring goes, ArchiveTeam cares more than the owners of the website, because in the ideal shutdown, the owners provide the data instead of forcing people to scrape it if they want to retain it after the site shuts down.

1 comment

They also crawl aggressively when the site is not in danger. They crawled my MediaWiki because someone else entered the site into their bot, and it overloaded the PHP process. I know that archiving is important, but please, not like this.
“Their bot” is software anyone can run.
So it's... their bot