Hacker News
by dcposch 3550 days ago
The Archive is awesome, but the author's sensationalist description of what they do isn't really accurate.

For the most part, archive.org is not rushing in to save stuff that's about to be deleted.

Instead they are crawling the web 24/7, patiently maintaining a historical record.

Check out http://oldweb.today

It is amazing

3 comments

What the Archive Team saves does get uploaded to the Internet Archive, but they aren't officially part of it. I think Maciej's description of the Archive Team is accurate - they are the archivists of last resort. When a commercial service is about to disappear forever, they're the ones that spring into action and rescue as much data as possible. If companies and the people that comprise them cared enough about their users' data, there would be no need for the Archive Team.

> For the most part, archive.org is not rushing in to save stuff that's about to be deleted.

Parent mentioned both the Internet Archive and Archive Team. You're right about the Internet Archive, but "rushing in to save stuff that's about to be deleted" is a pretty apt description of most of Archive Team's activity.

> is not rushing in to save stuff that's about to be deleted

I took the burning building metaphor as a somewhat fanciful, but otherwise accurate, description of the natural state of the web: a series of loosely linked HTML pages that could disappear at any minute (and often do) as soon as the hosting expires, the author stops maintaining them, the company reorganizes, etc. As a simple exercise, go browse a popular blog from 2008 or so and count the broken links.
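If you'd rather script the link-counting exercise than click through by hand, here is a rough Python sketch using only the standard library. Everything in it is my own assumption, not something from the article: the helper names (`extract_links`, `is_alive`, `count_broken`) are made up, and "broken" is defined loosely as "the request fails or returns an HTTP error". Some servers reject HEAD requests or block scripts, so treat the numbers as a ballpark.

```python
import http.client
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects unique absolute http(s) URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        url = urljoin(self.base_url, href)
        # Skip mailto:, javascript:, etc., and skip duplicates.
        if url.startswith(("http://", "https://")) and url not in self.links:
            self.links.append(url)


def extract_links(html, base_url):
    """Return the http(s) links found in an HTML string, in page order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def is_alive(url, timeout=10):
    """Assumption: a link is alive if a HEAD request succeeds with status < 400."""
    try:
        req = Request(url, method="HEAD", headers={"User-Agent": "link-rot-sketch"})
        with urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # DNS failures, timeouts, HTTP 4xx/5xx and malformed URLs all count as broken.
        return False


def count_broken(html, base_url, check=is_alive):
    """Return (broken, total) for the links in an HTML page."""
    links = extract_links(html, base_url)
    broken = [url for url in links if not check(url)]
    return len(broken), len(links)
```

To try it, fetch a post's HTML yourself (for example with `urlopen(page_url).read().decode("utf-8", "replace")`) and call `count_broken(html, page_url)`. The `check` parameter is there so you can swap in a different liveness test, such as a GET request or a Wayback Machine lookup.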