by kaesve
1127 days ago
How is relatedness measured? With some embedding space? I often disagree with the scores, the worst being "punch" and "bowl" at only 12% related. The concept is very fun, though. I might try to make my own version, since it seems like a good side project and a way to explore different word embedding spaces. A visualization of the embedding space could be fun too.
Note that the relatedness of words depends on the training set. Many of these word2vec-based games use vectors that were trained on Google News[2], so if "Unrelated Words" uses the same data, you should be looking for word pairs that are more common in news but perhaps less common in general text.
Semantle[3] is another game based on word vectors. I like "Unrelated Words" better: Semantle requires guessing one fixed target word, which is often very different from its nearest neighbor, whereas this game asks for a set of words, and that flexibility makes it feel less frustrating.
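For anyone wanting to build their own version: word2vec-based games typically score relatedness as the cosine similarity between two word vectors. I don't know that "Unrelated Words" does exactly this, so treat the following as a rough sketch. The three-dimensional vectors are made up for illustration (real Google News vectors have 300 dimensions), and `relatedness_percent` is a hypothetical helper, not something taken from the game.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction, so call it unrelated
    return dot / (norm_u * norm_v)


# Made-up toy vectors, NOT real word2vec output.
toy_vectors = {
    "punch": [0.9, 0.1, 0.3],
    "kick": [0.8, 0.2, 0.1],
    "bowl": [0.2, 0.8, 0.4],
}


def relatedness_percent(word_a, word_b, vectors=toy_vectors):
    """Hypothetical scoring: cosine similarity clamped at 0, as a percentage."""
    similarity = cosine_similarity(vectors[word_a], vectors[word_b])
    return round(100 * max(0.0, similarity))


if __name__ == "__main__":
    print(relatedness_percent("punch", "kick"))
    print(relatedness_percent("punch", "bowl"))
```

With real data you would swap `toy_vectors` for pretrained vectors, for example the Google News model from [2] loaded with gensim's `KeyedVectors.load_word2vec_format(path, binary=True)`. A word like "punch" gets a single vector no matter which sense is meant, which may be part of why "punch" and "bowl" score so low.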
[1] https://en.wikipedia.org/wiki/Word_embedding
[2] https://code.google.com/archive/p/word2vec/
[3] https://news.ycombinator.com/item?id=31588388