by Asparagirl 3552 days ago
> I’ve saluted the efforts of Archive Team and the Internet Archive, but their activity is like having a museum curator that rides around in a fire truck, looking for burning buildings to pull antiques from. It's heroic, it's admirable, but it’s no way to run a culture.

...but in the meantime, here's an obligatory and shameless plug for donating to the Internet Archive[1] (tax-deductible in the US), or better yet making a recurring monthly donation so they can more accurately forecast revenue for the year, or better still getting your employer to make a nice big donation to this crucial bit of Internet memorybanks.

And as for Archive Team, we're always looking for a few good geeks.[2] Run an instance of the Warrior on spare cloud servers, or help patch and ship code at GitHub.[3]

[1] http://archive.org/donate/

[2] http://archiveteam.org/index.php?title=Main_Page

[3] https://github.com/ArchiveTeam/ArchiveBot

2 comments

The Archive is awesome, but the author's sensationalist description of what they do isn't really accurate.

For the most part, archive.org is not rushing in to save stuff that's about to be deleted.

Instead they are crawling the web 24/7, patiently maintaining a historical record.

Check out http://oldweb.today

It is amazing.

What the Archive Team saves does get uploaded to the Internet Archive, but they aren't officially part of it. I think Maciej's description of the Archive Team is accurate - they are the archivists of last resort. When a commercial service is about to disappear forever, they're the ones that spring into action and rescue as much data as possible. If companies and the people that comprise them cared enough about their users' data, there would be no need for the Archive Team.
> For the most part, archive.org is not rushing in to save stuff that's about to be deleted.

Parent mentioned both the Internet Archive and Archive Team. You're right about the Internet Archive, but "rushing in to save stuff that's about to be deleted" is a pretty apt description of most of Archive Team's activity.

> is not rushing in to save stuff that's about to be deleted

I took the burning building metaphor as a somewhat fanciful, but otherwise accurate, description of the natural state of the web -- a series of loosely linked HTML pages that could disappear at any minute (and often do) as soon as the hosting expires, the author stops maintaining them, or the company reorganizes, etc. As a simple exercise, go browse a popular blog from 2008 or so and count the broken links.
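
If you'd rather not click through by hand, here is a rough sketch of that exercise in Python, standard library only. Everything in it is my own assumption rather than anything Archive Team or the Internet Archive ships: the function names are made up, and "broken" is taken to mean "a HEAD request fails or returns a status of 400 or above". That undercounts rot, since a parked domain still answers 200, and it can overcount where a server rejects HEAD.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import urllib.error
import urllib.request


class LinkExtractor(HTMLParser):
    """Collects the absolute http(s) target of every <a href=...> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            absolute = urljoin(self.base_url, href)
            # Skip mailto:, javascript:, and other non-web schemes.
            if absolute.startswith(("http://", "https://")):
                self.links.append(absolute)


def extract_links(html, base_url):
    """Return the unique links in the page, in first-seen order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return list(dict.fromkeys(parser.links))


def is_alive(url, timeout=10):
    """Assumption: 'alive' means a HEAD request succeeds with status < 400."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-rot-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors, DNS failures, timeouts, and malformed URLs all count as dead.
        return False


def count_broken(html, base_url, check=is_alive):
    """Return (broken, total) for the links found in the page."""
    links = extract_links(html, base_url)
    broken = [url for url in links if not check(url)]
    return len(broken), len(links)
```

To try it, fetch a post's HTML yourself and call `count_broken(html, post_url)`. The `check` parameter is there so you can swap in a gentler or stricter liveness test, or a fake one for testing without touching the network.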
I've got a VPS sitting around doing very little – what's the easiest way to get started?
Run a Warrior! Many flavors available: VirtualBox, Dockerfile, AMI (for Amazon EC2), you name it.

http://archiveteam.org/index.php?title=ArchiveTeam_Warrior

It would be deeply unethical for me to point out that you could also run the Warrior on free server space that your company might not notice, kind of like the karmic inverse of a bitcoin miner. Deeply unethical. So I won't mention it.