by speedgoose 1409 days ago
You can also use the Common Crawl dataset.

https://commoncrawl.org/