Hacker News
by chrisacky 1868 days ago
Is there a more recent Common Crawl data set? 2019 is a long time ago.

The reason I ask is that I'm trying to get all subdomains of a certain domain. So I want a reverse host lookup: the unique hostnames under a certain domain.

1 comment

There are more recent versions of the dataset. We used the February/March snapshot from this year, and the April snapshot just came out (https://commoncrawl.org/2021/04/april-2021-crawl-archive-now...).
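
For the subdomain question above, here is a rough sketch of the "unique hostnames under a domain" step. It assumes you have already pulled a list of URLs out of whichever snapshot you use (how you fetch them is not shown). The function name `unique_hostnames` and the sample URLs are made up for illustration and are not part of any Common Crawl tooling.

```python
from urllib.parse import urlsplit


def unique_hostnames(urls, domain):
    """Return the sorted unique hostnames equal to `domain` or nested under it."""
    domain = domain.lower().rstrip(".")
    suffix = "." + domain  # leading dot so "notexample.com" does not match
    hosts = set()
    for url in urls:
        try:
            host = urlsplit(url).hostname  # lowercased; None if no host part
        except ValueError:
            continue  # malformed URL (e.g. a broken IPv6 literal)
        if host is None:
            continue
        host = host.rstrip(".")
        if host == domain or host.endswith(suffix):
            hosts.add(host)
    return sorted(hosts)


# Hypothetical sample input standing in for URLs taken from a crawl.
sample_urls = [
    "https://www.example.com/a",
    "http://blog.example.com/post",
    "https://WWW.example.com/b",
    "https://notexample.com/",
    "https://example.com.evil.test/",
    "https://example.com/",
]

if __name__ == "__main__":
    for host in unique_hostnames(sample_urls, "example.com"):
        print(host)
```

Matching on the suffix with a leading dot is deliberate: a plain `endswith("example.com")` check would also pick up unrelated hosts such as notexample.com.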