Hacker News
by res0nat0r 4695 days ago
The data is freely available:
http://aws.amazon.com/datasets/41740
and you just need to comply with the Common Crawl TOU:
http://commoncrawl.org/about/terms-of-use/