Hacker News
by LunaSea 1612 days ago
Common Crawl is missing far too many URLs to be useful in a real-world scenario.

But can't you add to their index?
No. You can add pages to the Wayback Machine at web.archive.org via its "Save Page Now" interface... Common Crawl is attempting to be a sample of the web, and doesn't take URL suggestions.
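
For anyone who wants to script the Wayback Machine side of this, here is a rough sketch in Python. It assumes the "Save Page Now" interface can be reached by requesting `https://web.archive.org/save/` followed by the target URL; that URL pattern, the helper names `build_save_url` and `request_capture`, and the User-Agent string are all assumptions for illustration, not something stated in this thread. There is no equivalent for Common Crawl, since it doesn't accept suggestions.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumed URL pattern for "Save Page Now"; check the Internet Archive's
# own documentation before relying on it.
SAVE_ENDPOINT = "https://web.archive.org/save/"


def build_save_url(page_url: str) -> str:
    """Build the (assumed) Save Page Now request URL for a page."""
    # Leave URL-structural characters alone and escape the rest (e.g. spaces).
    return SAVE_ENDPOINT + quote(page_url, safe=":/?&=%")


def request_capture(page_url: str, timeout: float = 60.0) -> int:
    """Ask the Wayback Machine to capture page_url; returns the HTTP status.

    Hypothetical helper: this performs a real network request, may be
    rate-limited, and raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    req = Request(
        build_save_url(page_url),
        headers={"User-Agent": "save-page-sketch/0.1"},  # illustrative value
    )
    with urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Calling `request_capture("https://example.com/")` would submit that page for capture under the assumptions above. Keeping the URL construction separate from the network call makes it easy to check the request without hitting the service.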