| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by niemandhier 647 days ago

In high dimensional Spaces the distances between nearest and farthest points from query points with respect to normal metrics become almost equal.

Cosine similarity still works though, since it only look at how aligned vectors are.

The thing that people tend to overlook is, that there is no need for embeddings to be a vector space endowed with an inner product.

Words don’t have this structure, we define it on the image of the mapping from words to n-tuples and the embeddings we use coevolved in such a way that we assume the cosine similarity to be meaningful.