by wumpus 1612 days ago
web.archive.org has a CDX index, similar to Common Crawl's.

Since I use both of these archives together, I wrote this code to iron out the differences between them:

https://github.com/cocrawler/cdx_toolkit
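
For anyone wondering what kind of differences need ironing out, here is a minimal sketch. It is not taken from cdx_toolkit. It assumes both services are queried with output=json and return their default fields: the Wayback CDX server then sends one JSON array of arrays with a header row first, while the Common Crawl index sends one JSON object per line, with slightly different field names. The function names and the rename table are made up for illustration.

```python
import json

# Assumed rename table: Wayback-style field names -> Common Crawl-style names.
WAYBACK_TO_CC = {"original": "url", "mimetype": "mime", "statuscode": "status"}


def parse_wayback(text):
    """Parse a Wayback CDX body (output=json): a JSON array of arrays.

    The first row is treated as the header; each later row is one capture.
    """
    if not text.strip():
        return []
    rows = json.loads(text)
    if not rows:
        return []
    header = [WAYBACK_TO_CC.get(name, name) for name in rows[0]]
    return [dict(zip(header, row)) for row in rows[1:]]


def parse_common_crawl(text):
    """Parse a Common Crawl index body (output=json): one JSON object per line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hand-written sample bodies, not real captures.
wayback_body = json.dumps([
    ["urlkey", "timestamp", "original", "mimetype", "statuscode"],
    ["org,example)/", "20200101000000", "http://example.org/", "text/html", "200"],
])
cc_body = json.dumps({
    "urlkey": "org,example)/",
    "timestamp": "20200101000000",
    "url": "http://example.org/",
    "mime": "text/html",
    "status": "200",
}) + "\n"

# Both parsers yield a list of dicts keyed by the same field names.
for record in parse_wayback(wayback_body) + parse_common_crawl(cc_body):
    print(record["timestamp"], record["status"], record["url"])
```

Once both sources produce the same record shape, the rest of a pipeline does not need to care which archive a capture came from.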

1 comment

Hey! I was using your tool a couple months ago. It was super helpful for my project.
Thanks! I rarely hear from users, great to hear from you!