

	by hiker512 3030 days ago
	Can you recommend an implementation or paper for handling OOV words via character level embeddings?

2 comments

patelajay285 3030 days ago

We just open-sourced an easy-to-use library called Magnitude that handles out-of-vocabulary words and uses Annoy indexing for fast most_similar queries for word2vec, GloVe, and fastText:

https://github.com/plasticityai/magnitude


Sinidir 3026 days ago

Thank you very much. This might help with my master's thesis :)


visarga 3030 days ago

Great work!


fnl 3029 days ago

I agree, and I can also subscribe 100% to your initial comments on the actual benefits and limitations of word vectors.


wenc 3030 days ago

Just an FYI, fastText's default implementation handles OOV words via word n-grams and character n-grams (see the switches -minn, -maxn, and -wordNgrams).

https://fasttext.cc/docs/en/options.html
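
To make the character n-gram idea concrete, here is a minimal sketch, not fastText's actual code. It assumes the word is wrapped in `<` and `>` boundary markers, that n-gram lengths run from `minn` to `maxn` (3 and 6 here, mirroring the switches above), and that each n-gram is hashed into a fixed-size table of vectors. The names `char_ngrams`, `oov_vector`, and `table` are hypothetical, `zlib.crc32` is only a stand-in for whatever hash a real implementation uses, and the table is filled with random values where a trained model would hold learned ones.

```python
import random
import zlib


def char_ngrams(word, minn=3, maxn=6):
    """Return all character n-grams of '<word>' with minn <= n <= maxn."""
    token = f"<{word}>"  # boundary markers distinguish prefixes and suffixes
    return [token[i:i + n]
            for n in range(minn, maxn + 1)
            for i in range(len(token) - n + 1)]


def oov_vector(word, table, minn=3, maxn=6):
    """Average the table rows that the word's n-grams hash to."""
    grams = char_ngrams(word, minn, maxn)
    dim = len(table[0])
    if not grams:
        return [0.0] * dim
    total = [0.0] * dim
    for gram in grams:
        # crc32 is an assumed stand-in hash; it maps each n-gram to a bucket.
        row = table[zlib.crc32(gram.encode("utf-8")) % len(table)]
        for j, value in enumerate(row):
            total[j] += value
    return [x / len(grams) for x in total]


# Hypothetical bucket table: 1000 buckets of 8-dimensional random vectors.
random.seed(0)
table = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(1000)]

# A word never seen in training still gets a vector from its n-grams.
vector = oov_vector("hiking", table)
```

Because related forms such as "hiking" and "hiker" share n-grams like `<hi` and `hik`, their averaged vectors share components, which is what lets an unseen word land near its relatives.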
