HN Mirror

by cdnsteve 3565 days ago

I think it would be valuable to have an open dataset of a raw crawl index. It could be distributed via academic torrents or through a partnership with a hosting provider.

The real innovation won't be in crawling but in working on the index: filtering it, organizing it, trying out sort algorithms, and learning from the results.

If this were available and gained popularity, I could see competition in search again.