HN Mirror

by GigabyteCoin 4582 days ago

Can anyone give me a quick rundown on how exactly one gains access to all of this data?

I have heard about this project numerous times, and am always dissuaded by the lack of download links/torrents/information on their homepage.

Perhaps I just don't know what I'm looking at?

1 comment

Did you try this?

I haven't tried that one, but I've poked at others in the Amazon Common Datasets collection:

If you're already familiar with using Amazon's virtual servers, it's pretty straightforward.

I also note that the Common Crawl project publishes code here: