by Radim 4641 days ago
For people interested in a cleaned-up, commented and de-obfuscated word2vec, I recently ported the original C code to Python [1].

My HN submission of this endeavour received no love, but I think it's worthwhile nevertheless as the Python code is not only more concise, readable and extendable, but the training's actually faster too [2].

[1] https://github.com/piskvorky/gensim/blob/develop/gensim/mode...

[2] http://radimrehurek.com/2013/09/word2vec-in-python-part-two-...

3 comments

Your submission receives no love, but my one-afternoon hack does... oh the humanity... lol

That is some amazing work, thanks!

It's sad you didn't get the love on your submission; your changes are very neat, and having word2vec inside gensim feels like a really awesome feature.
Well done!

Mikolov said that he hoped word2vec would "significantly advance the state of the art" of NLP, but really the state of the art can only advance when people can understand and manipulate the code. You're making that possible. Thank you.