by kirillkh 3159 days ago
If I understand correctly, this forgoes Lucene entirely. I would really like something that can be integrated into Lucene/Solr, because of all the infrastructure already built around it.

> works faster than grep

I didn't quite get the connection to grep.


Suppose you have gigabytes of text. Annoy will find matching articles faster and more precisely than grepping for keywords.
Lucene is faster and better than grep too. Annoy may be better than Lucene's "more like this" query, which finds documents in an index that are similar to a given set of documents. But how would it help with keyword search, which is what is being asked about?
I know inverted index search is fast; it is the basic search engine algorithm. But there is a difference in the quality of the top-ranked results. With word vectors you can make sure the topic of the whole document is what you want. Many documents mix topics, and some keywords show up in the wrong place by mistake, for example because web scraping is imperfect and can capture extra text.
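
To make the difference concrete, here is a rough sketch in plain Python. Everything in it is made up for illustration: the 2-d `WORD_VECTORS`, the three `DOCS`, and the helper names are assumptions, and it does not use Annoy's API. The brute-force cosine loop only stands in for the nearest-neighbor lookup. It shows a keyword lookup in a tiny inverted index treating a stray "recipe" as a plain hit, while averaged word vectors rank the document that is actually about cooking first.

```python
import math
from collections import defaultdict

# Hypothetical toy word vectors: axis 0 ~ "cooking", axis 1 ~ "programming".
# Real vectors would come from a trained model and have hundreds of dimensions.
WORD_VECTORS = {
    "recipe": (1.0, 0.0),
    "bake": (1.0, 0.0),
    "flour": (0.9, 0.0),
    "oven": (0.9, 0.1),
    "python": (0.0, 1.0),
    "compiler": (0.0, 1.0),
    "code": (0.1, 0.9),
    "bug": (0.1, 0.9),
}

# Made-up documents. "scraped" is a programming page that picked up
# the word "recipe" from unrelated page text.
DOCS = {
    "cooking": "bake the flour recipe in the oven",
    "programming": "python code compiler bug",
    "scraped": "python code compiler bug code recipe",
}


def tokenize(text):
    return text.lower().split()


def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def doc_vector(text):
    """Average the vectors of all known words in the text."""
    vecs = [WORD_VECTORS[t] for t in tokenize(text) if t in WORD_VECTORS]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(v[i] for v in vecs) / len(vecs) for i in range(2))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_vector(query, docs):
    """Brute-force ranking by cosine similarity, most similar first."""
    q = doc_vector(query)
    scored = [(cosine(q, doc_vector(text)), doc_id) for doc_id, text in docs.items()]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Keyword search: both documents containing "recipe" are simply hits.
keyword_hits = sorted(build_inverted_index(DOCS)["recipe"])

# Vector search: the whole-document topic decides the order.
vector_ranking = rank_by_vector("recipe", DOCS)

print(keyword_hits)
print(vector_ranking)
```

With real data the brute-force loop is what gets too slow, and that is the part an approximate nearest-neighbor index such as Annoy is meant to replace.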