|
|
|
|
|
by kleebeesh
3159 days ago
|
|
Nice write-up! I also found this one useful in terms of high-level implementation: http://adventuresinmachinelearning.com/word2vec-keras-tutori... Initially it was not obvious to me that the dot-product was even part of the model. In hindsight it's intuitive: a pair of high similarity vectors will have a high dot-product, which yields a high sigmoid activation. This also motivates the use of cosine similarity, which is just a normalized dot-product. Likely obvious to some but this eluded me the first few days I studied this model. |
|