by froo 4580 days ago
s3cmd ls s3://aws-publicdatasets/common-crawl/crawl-data/CC-MAIN-2013-20/segments/
That should get you about 90% of the way there.
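If you only want the segment paths so you can feed them into another command, something like the filter below might help. It is just a sketch: `segment_uris` is a name made up here, and it assumes s3cmd prints each prefix as a line starting with `DIR` followed by the URI.

```shell
# Hypothetical helper (not part of s3cmd): keep only the directory entries
# from `s3cmd ls` output and print their URIs, one per line.
# Assumption: directory lines look like "DIR   s3://bucket/prefix/".
segment_uris() {
  awk '$1 == "DIR" { print $2 }'
}

# Usage (needs network access and a configured s3cmd):
# s3cmd ls s3://aws-publicdatasets/common-crawl/crawl-data/CC-MAIN-2013-20/segments/ | segment_uris
```

Each printed URI can then be passed to another `s3cmd ls` to see what is inside that segment.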