by marcuslager 3261 days ago
Thanks for the feedback. What I think you should realise, and hope you already do, is that Lucene is nowhere near maximum performance for full-text search, and neither is its relevance. And implementing new scoring routines is a drag in Lucene.

Google is also nowhere near maximum relevance. I like word2vec. That model fits into my world view. I'm going to implement it and then take it further. Hopefully while being funded. If not then it shall be my contribution to the open source space and nothing more.

1 comments

If you want to do word vector similarity search, try the "annoy" library from Spotify. It's much, much faster than Gensim. https://github.com/spotify/annoy
It appears to do the vector similarity part but not the vector creation part, so it is not a Gensim replacement. Am I missing something?
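
To make the split concrete, here is a minimal sketch of the "similarity part" alone, in plain Python. It assumes the vectors already exist (for example, exported from a trained word2vec model). The toy vectors and the helper names `cosine` and `nearest` are made up for illustration and are not part of annoy or Gensim. This is an exact brute-force scan; a library like annoy would answer the same kind of query approximately from a prebuilt index.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, vectors, k=2):
    # Rank every stored word by similarity to the query vector (exact, O(n) scan).
    ranked = sorted(vectors, key=lambda w: cosine(query, vectors[w]), reverse=True)
    return ranked[:k]

# Hypothetical toy vectors; real ones would come from the vector creation step.
vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.1, 0.2, 0.9],
}

print(nearest(vectors["king"], vectors, k=2))
```

Nothing here trains anything, which is the point: the search step only consumes vectors, so something else still has to produce them.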