by froo 4580 days ago

  s3cmd ls s3://aws-publicdatasets/common-crawl/crawl-data/CC-MAIN-2013-20/segments/
That should get you about 90% of the way there.
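
For the remaining bit, a rough sketch of turning that listing into a plain list of segment URIs you can loop over. This assumes `s3cmd ls` marks sub-prefixes with a leading `DIR` column followed by the URI; check the output of your s3cmd version before relying on it. `CRAWL_PREFIX` and `segment_uris` are names made up for this example.

```shell
#!/bin/sh
# Prefix copied from the command above.
CRAWL_PREFIX="s3://aws-publicdatasets/common-crawl/crawl-data/CC-MAIN-2013-20/segments/"

# Hypothetical helper: read `s3cmd ls` output on stdin and print only the
# URIs of lines whose first column is "DIR" (assumed listing format).
segment_uris() {
    awk '$1 == "DIR" { print $2 }'
}

# Usage (needs s3cmd installed and configured with credentials):
#   s3cmd ls "$CRAWL_PREFIX" | segment_uris
```

Piping the result into a `while read` loop would then let you list or fetch one segment at a time instead of pulling the whole crawl.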