by LisaG 4854 days ago
I hope some of you who use or play around with the Common Crawl data will try the JSON files from the URL Search and then share your code.

If you didn't see the details in the blog post, Common Crawl is giving out $100 in AWS credit to the first five people who share code that incorporates a JSON file from the URL Search.

1 comment

Is it possible to get a list of web hosts, i.e. all the domains and subdomains, with the rest of each URL stripped off?
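
Not an answer about what the URL Search itself offers, but if you already have the URLs as plain strings, stripping them down to hosts is easy to do on your side. A rough sketch in Python, assuming the URLs are already in a list (how you pull them out of the JSON file is not covered here, and `unique_hosts` and the sample URLs are made-up names for illustration):

```python
from urllib.parse import urlsplit


def unique_hosts(urls):
    """Return the sorted, de-duplicated hostnames found in an iterable of URL strings."""
    hosts = set()
    for url in urls:
        try:
            # .hostname drops the scheme, port, userinfo, path and query,
            # and lowercases the host. It is None when no host is present.
            host = urlsplit(url.strip()).hostname
        except ValueError:
            # Malformed URLs (e.g. a broken IPv6 literal) are skipped.
            continue
        if host:
            hosts.add(host)
    return sorted(hosts)


# Hypothetical sample input, not taken from the Common Crawl data.
sample = [
    "http://www.example.com/a/b?x=1",
    "https://blog.example.com:8080/post",
    "http://www.example.com/other",
]
print(unique_hosts(sample))
```

This keeps subdomains as separate entries (`www.example.com` and `blog.example.com` both show up). Collapsing them to registered domains would need a public suffix list, which this sketch does not attempt.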