Keep in mind that the website author has the ability to delete content from archive.org (and I've seen a pretty significant website for a community I'm a part of do so when it shut down, ostensibly for "GDPR compliance" reasons).
All it takes is a robots.txt file with the right entries in it.
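For anyone curious what "the right entries" might look like, here's a rough sketch using Python's standard `urllib.robotparser`. The robots.txt contents and the example.com URL are made up for illustration, and `ia_archiver` is the user agent historically associated with the Wayback Machine; whether archive.org still honors it the same way today is an assumption on my part.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out one crawler entirely.
# "ia_archiver" is assumed here to be the archive crawler's user agent.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/forum/thread/1"  # placeholder URL
    print(is_blocked(ROBOTS_TXT, "ia_archiver", page))
    print(is_blocked(ROBOTS_TXT, "SomeOtherBot", page))
```

Two lines are enough to exclude the named crawler from the whole site while leaving every other bot unaffected.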
We need something like archive.org, but which reflects the wishes of historians and preservationists, not paranoid website owners.
Do we need that? I feel like the right to be forgotten needs to be protected too. I’m already uncomfortable with the extent to which everything I write can live forever.
> It's on my to-do list to automatically mirror all HN submissions as well
I recently emailed that suggestion to dang as well - both Web Archive and Archive Today (aka archive.is/ws/ph/wh/whatever), as the former is more likely to stick around and the latter is better than 12ft at bypassing paywalls. It's on their to-do list as well.
I'm a huge side project procrastinator, but if this comes after HN performance improvements on dang's to-do list I feel like I might beat them to it ;)