by Khelavaster 1087 days ago
The archive needs to be backed up ASAP before it disappears!
2 comments

Came here to say this. My guess is that Amazon paid them to go away. If my guess is accurate (I could certainly be wrong), then Amazon could have them add a robots.txt banning archive.org. If they do that, access to the archive will be removed. Mirror it now if you want the content.
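
For anyone wondering what such a ban might look like, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt rule, the `ia_archiver` user-agent name, and the example.com URL are all assumptions for illustration; nothing here is taken from the actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks only the Internet Archive's crawler.
# "ia_archiver" is assumed to be the user-agent name the rule would target.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    # example.com is a placeholder, not the site under discussion.
    for agent in ("ia_archiver", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "http://example.com/post/1"))
```

With a rule like this, only the named crawler is shut out and everyone else can still fetch the pages.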

One nice way to do so (handy for any site that you think may vanish from the Wayback Machine): https://github.com/hartator/wayback-machine-downloader
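
If you would rather not install a tool, a minimal Python sketch of the same idea is below. It is not how the linked downloader works internally; it just builds a query for the Wayback Machine's CDX index and turns the JSON response into raw-snapshot URLs you could fetch. The helper names and example.com are hypothetical, and the query parameters are one reasonable choice among several.

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

def cdx_query_url(site):
    """Build a CDX query listing archived captures under `site`."""
    params = {
        "url": site + "/*",           # everything under the site
        "output": "json",             # JSON array of rows, header row first
        "fl": "timestamp,original",   # only the fields needed below
        "filter": "statuscode:200",   # skip redirects and errors
        "collapse": "urlkey",         # keep one capture per unique URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)

def snapshot_urls(cdx_json_text):
    """Turn a CDX JSON response into raw-snapshot URLs."""
    rows = json.loads(cdx_json_text)
    # rows[0] is the header; the "id_" suffix asks for the page as captured,
    # without the Wayback toolbar or rewritten links.
    return [
        f"https://web.archive.org/web/{timestamp}id_/{original}"
        for timestamp, original in rows[1:]
    ]

if __name__ == "__main__":
    # Placeholder site; fetch this URL and pass the body to snapshot_urls().
    print(cdx_query_url("example.com"))
```

Fetching is left out on purpose: download the query URL with whatever client you like, pass the body to `snapshot_urls`, and save each result, ideally with a pause between requests.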

I got it. A little over 900 posts. What do we think... Host on GitHub to ensure GPT gets its training data?
That’s a good start. GitLab or GitHub, either way; drop a link here if you can.