Hacker News
by andai 815 days ago
Does anyone know how an entire website can be restored from the Wayback Machine? A beloved website of mine had its database deleted. Everything's on the Internet Archive, but I think I'd have to:

(1) scrape it manually (they don't seem to let you download an entire site?),

(2) write some Python magic to fix the CSS URLs etc. so the site can be reuploaded (and maybe add .html to the URLs? Or just make every page a folder with an index.html...)

It seems like a fairly common use case, but I could barely find any functional scrapers, let alone anything designed to restore the original content in a useful form.
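For what it's worth, here is a rough sketch of how steps (1) and (2) might look in Python, not a finished restorer. It assumes the Wayback Machine's CDX endpoint for listing captures and the `id_` URL modifier for fetching pages without the archive's link rewriting. The helper names (`cdx_query_url`, `list_captures`, `raw_capture_url`, `local_path`, `strip_wayback_prefix`) are made up for illustration, and `example.com` stands in for the lost site.

```python
import json
import re
from pathlib import PurePosixPath
from urllib.parse import urlencode, urlsplit
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

# Matches the archive prefix that the Wayback Machine puts in front of links,
# e.g. "https://web.archive.org/web/20200101000000cs_/" or "/web/20200101000000/".
WAYBACK_PREFIX = re.compile(
    r"(?:https?://web\.archive\.org)?/web/\d+(?:[a-z]+_)?/(?=https?://)"
)


def cdx_query_url(domain: str) -> str:
    """Build a CDX query listing one successful capture per URL under `domain`."""
    params = {
        "url": f"{domain}/*",        # everything under the domain
        "output": "json",
        "fl": "timestamp,original",  # only the fields needed here
        "filter": "statuscode:200",  # skip redirects and errors
        "collapse": "urlkey",        # one row per distinct URL
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def list_captures(domain: str) -> list[tuple[str, str]]:
    """Fetch (timestamp, original_url) pairs. Performs a network request."""
    with urlopen(cdx_query_url(domain)) as resp:
        rows = json.load(resp)
    return [tuple(row) for row in rows[1:]]  # first row is the column header


def raw_capture_url(timestamp: str, original: str) -> str:
    """URL of the capture as originally served (assumes the `id_` modifier)."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"


def local_path(original_url: str) -> str:
    """Map a page URL to a file path, using folder/index.html for extensionless pages.

    Query strings are ignored, so URLs that differ only by query will collide.
    """
    path = urlsplit(original_url).path
    if path == "" or path.endswith("/"):
        return path.lstrip("/") + "index.html"
    if "." in PurePosixPath(path).name:
        return path.lstrip("/")      # already has an extension (css, js, png...)
    return path.lstrip("/") + "/index.html"


def strip_wayback_prefix(html: str) -> str:
    """Remove archive prefixes from links in pages saved with rewriting on."""
    return WAYBACK_PREFIX.sub("", html)
```

A restore loop would then call `list_captures("example.com")`, download each `raw_capture_url(ts, url)`, and write the body to `local_path(url)`. Please throttle the requests, and expect edge cases (query strings, non-HTML assets, pages captured only as redirects) that this sketch does not handle.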

2 comments

I bet ArchiveTeam could help you out with this. They were quite helpful when I wanted to make sure a site was preserved, and they could at least point you in the right direction. https://wiki.archiveteam.org/