Hacker News new | ask | show | jobs
by mrdrozdov 3745 days ago
Although the visualization is similar to what you might see from a word2vec demo, they haven't run word2vec here. There are many ways to generate word vectors, word2vec is one, but the method used here was a Recurrent Neural Network (RNN). More specifically, the type of RNN was a Long Short Term Memory Network (LSTM). Since word vectors can have very high dimensionality (in this case, the dimension was 50), this makes them difficult to visualize. The t-sne algorithm reduces dimensionality to the point where you can visualize the initial vectors and still compare different data points to some useful extent.