by res0nat0r 4695 days ago
The data is freely available: http://aws.amazon.com/datasets/41740

You just need to comply with the Common Crawl Terms of Use: http://commoncrawl.org/about/terms-of-use/