Hacker News new | ask | show | jobs
by visarga 3152 days ago
Suppose you have gigabytes of text, Annoy will find matching articles faster and more precise than grepping with keywords.
1 comments

Lucene is faster and better than grep too. Annoy may be better than Lucene's "more like this" query which is for finding similar documents in an index to a given set of documents. But how would it be helpful for keyword search which is what is being asked about?
I know, inverted index search is fast, it is the basic search engine algorithm, but there is a difference in quality of top ranked results. With word vectors you can ensure the topic of the whole document is what you want. Many documents mix topics and some keywords appear by mistake in the wrong place, for example, because scraping web text is imperfect and might capture extra text.