by japaw 3806 days ago
Are there any large, publicly available Usenet archives that one can download?

I need one for an information retrieval research project. I am aware of gmane.org, but I do not think they allow bulk download.


Please use the bittorrent download option if possible. It reduces load on archive.org.
If there are any seeds, sure. Torrents are most useful for new, big, popular items.
IA torrent files use the archive itself as a web seed if they have to, so a torrent still works when there are no seeds. And if there's a spike in interest (like right now, apparently), downloading via the torrent takes load off the servers whenever other peers are available.

Edit: this is all just to be polite, since the archive is not worried about using a ton of bandwidth.
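To make the web seed idea concrete, here is a minimal sketch that builds a magnet link carrying an HTTP fallback. It assumes the commonly used `ws` (web seed) magnet parameter. The info hash, item name, and URL in the usage note are placeholders, not real archive.org values, and `build_magnet` is a hypothetical helper, not anything the archive provides.

```python
from typing import Iterable
from urllib.parse import urlencode


def build_magnet(info_hash: str, name: str, web_seeds: Iterable[str]) -> str:
    """Build a magnet URI whose `ws` parameters point at HTTP web seeds.

    A client that finds no peers can fall back to fetching from the
    web seed URLs, so the link still works with zero seeders.
    """
    # dn = display name, ws = web seed (one parameter per URL).
    params = [("dn", name)] + [("ws", url) for url in web_seeds]
    # xt is written by hand so the colons in "urn:btih:" stay literal.
    return "magnet:?xt=urn:btih:" + info_hash + "&" + urlencode(params)
```

For example, `build_magnet("0123456789abcdef0123456789abcdef01234567", "usenet-archive", ["https://example.org/download/usenet-archive/"])` yields a link with one `ws` entry; the URL is percent-encoded by `urlencode`.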

Is the code that runs the IA frontend open source? I'd be interested in contributing to it, so that when requests for a particular object are creating excessive load, the server responds with a status code that says so and returns a magnet link for the object as the alternate URI.

EDIT: It appears that an HTTP 303 (See Other) status code accomplishes this.

503, we return the standard 503 code. Remember that most of our users don't know what BitTorrent is, and would prefer that the archive Just Worked.
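For anyone curious what the two behaviours in this subthread look like side by side, here is a rough sketch: a 303 pointing at a magnet link when load is elevated, and a plain 503 when the server is saturated. The function name, the load scale, and both thresholds are made up for illustration. This is not how archive.org works, and whether a given client follows a `Location` header with a `magnet:` scheme is not guaranteed.

```python
from http import HTTPStatus
from typing import Dict, Tuple


def choose_response(
    load: float,
    magnet_uri: str,
    redirect_at: float = 0.8,   # hypothetical threshold
    refuse_at: float = 0.95,    # hypothetical threshold
) -> Tuple[HTTPStatus, Dict[str, str]]:
    """Pick a status and headers for a download request.

    `load` is an assumed 0.0-1.0 measure of how busy the server is.
    """
    if load >= refuse_at:
        # Saturated: the standard 503, with a hint for when to retry.
        return HTTPStatus.SERVICE_UNAVAILABLE, {"Retry-After": "120"}
    if load >= redirect_at:
        # Busy: the 303 proposal, pointing the client at the torrent.
        return HTTPStatus.SEE_OTHER, {"Location": magnet_uri}
    # Normal: serve the file directly over HTTP.
    return HTTPStatus.OK, {}
```

The trade-off raised in the reply still applies: the 303 branch only helps users whose client understands magnet links, while the 503 branch is understood by everything.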