Hacker News
by hyko 2182 days ago
Is Common Crawl usable in this context? http://commoncrawl.org/

Kinda. The crawler is there, but the search engine is still missing. Think Google, but open source, and maybe self-hostable[1].

[1] I am pretty sure "hostable" is not a word.

The only public search engine I'm aware of that uses Common Crawl is ChatNoir.

https://groups.google.com/d/msg/common-crawl/3o2dOHpeRxo/H2O...