by mlepath
515 days ago
In ML everything is a tradeoff. The article strongly suggests using dot product similarity and it's a great metric in some situations, but dot product similarity has some issues too:
- not normalized (unlike cosine similarity)
- heavily favors large vectors
- unbounded output
- ... Basically, do not carelessly use any similarity metric.
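
To make the "favors large vectors" point concrete, here is a minimal NumPy sketch. The vectors and the helper names (`dot_similarity`, `cosine_similarity`) are made up for illustration and are not taken from the article.

```python
import numpy as np

def dot_similarity(a, b):
    # Raw dot product: unbounded, and it scales with the norm of each vector.
    return float(np.dot(a, b))

def cosine_similarity(a, b):
    # Dot product divided by both norms: bounded to [-1, 1].
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 2-D embeddings, chosen only to show the effect.
query = np.array([1.0, 0.0])
aligned_small = np.array([1.0, 0.0])     # same direction as query, unit norm
off_axis_large = np.array([10.0, 10.0])  # 45 degrees off, much larger norm

dot_small = dot_similarity(query, aligned_small)
dot_large = dot_similarity(query, off_axis_large)
cos_small = cosine_similarity(query, aligned_small)
cos_large = cosine_similarity(query, off_axis_large)

print("dot:   ", dot_small, dot_large)
print("cosine:", cos_small, cos_large)
```

With these toy vectors the dot product ranks the large off-axis vector first, while cosine ranks the perfectly aligned one first. Which ranking you want depends on whether vector norm carries meaning in your embeddings.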
(The catch is that during training, logistic regression is done on the word and context vectors, but the two are highly similar. People would even sum the context and word vectors, or train with the word and context vectors being the same vectors, without much loss.)
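
For anyone who hasn't seen that setup, here is a rough sketch of what "logistic regression on the word and context vectors" can look like for a single pair. It assumes a word2vec-style objective in which the logit is the dot product of the two vectors. The random vectors, the dimension, and the names `pair_loss` and `combined` are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pair_loss(word_vec, context_vec, label):
    # Logistic (binary cross-entropy) loss for one (word, context) pair.
    # Assumption: the logit is the dot product of the two vectors.
    p = sigmoid(np.dot(word_vec, context_vec))
    return float(-(label * np.log(p) + (1 - label) * np.log(1 - p)))

# Random stand-ins for one row of each embedding table.
rng = np.random.default_rng(0)
word_vec = rng.normal(size=8)
context_vec = rng.normal(size=8)

loss_observed = pair_loss(word_vec, context_vec, 1)  # pair seen in the corpus
loss_negative = pair_loss(word_vec, context_vec, 0)  # sampled negative pair

# One way to read "sum the context vectors and word vectors":
# after training, use the elementwise sum as the final embedding.
combined = word_vec + context_vec
```

The only thing the loss sees is the dot product between a word vector and a context vector, so a dot product between two word vectors is not what training directly optimized.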