Hacker News
by btrettel 2321 days ago
Academics should make their databases easily archivable then.

One thing I don't like about COS preprint sites is how much they rely on JavaScript. I'm not sure the Wayback Machine can archive them correctly. It might take a specialized bot.

The Wayback Machine doesn't work right at all on one of my own COS preprints: https://web.archive.org/web/20200211103826/https://engrxiv.o...

Maybe I should contact ArchiveTeam...

1 comment

I'm the director of engrXiv. It would be great to have a reliable archive of the server contents separate from COS. It would be pretty easy to scrape all of the files. There is a regularly updated CSV of all of the contents of engrXiv here: https://osf.io/ns9yr/

The files can be downloaded directly by appending '/download' to the engrXiv URL for each preprint.
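For anyone who wants to try this, here is a rough sketch of the URL-building step in Python. It assumes the CSV has a column holding each preprint's URL; the column name `url` and the `example.org` addresses are placeholders I made up, so check the real CSV header first. It only builds the list of download URLs and does not fetch anything.

```python
import csv
import io
from urllib.parse import urlsplit, urlunsplit


def download_url(preprint_url: str) -> str:
    """Append '/download' to a preprint URL, as described above."""
    parts = urlsplit(preprint_url)
    # Avoid a double slash if the URL already ends with '/'.
    path = parts.path.rstrip("/") + "/download"
    # Drop any query string or fragment; only the path is extended.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def download_urls_from_csv(csv_text: str, url_column: str = "url") -> list[str]:
    """Return one download URL per CSV row that has a non-empty URL.

    `url_column` is a hypothetical column name, not taken from the real CSV.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    urls = []
    for row in reader:
        value = (row.get(url_column) or "").strip()
        if value:
            urls.append(download_url(value))
    return urls


# Made-up sample data standing in for the real CSV contents.
sample = "title,url\nPaper A,https://example.org/abc12/\nPaper B,\n"
print(download_urls_from_csv(sample))
```

A real scraper would then fetch each URL with a polite delay between requests and save the response to disk; I left that out since it depends on how the server handles redirects and rate limits.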