Hacker News new | ask | show | jobs
by Der_Einzige 830 days ago
Not obeying the triangle inequality simply means that it’s fast to compute.
1 comments

One does not normally consider trigonometric functions fast to compute. Would you mind elaborating?
Consider the geometric definition of the dot product of two vectors,

a.b = |a||b|cos theta.

This means you get cos of the angle between the two vectors by just dividing the dot product by the product of their magnitudes. You don't actually take cos of the angle to get cosine similarity (for one because you don't know the angle) you just use "cos theta" (calculated as above) as a proxy for how narrow the angle is and therefore how close the two embeddings are.

The paper in TFA shows that if you construct an embedding space such that that angle isn't meaningfully measuring similarity then a low angle doesn't mean two things are very similar. I have a similar paper measuring bears and woods but I haven't got around to typesetting it for publication yet.

You don't need to call cosine to compute cosine similarity - most of the time, a normalized dot product is sufficient.

Needless to say, dot products are directly supported in hardware via the FMA unit.

Yes, this would be fast assuming the vectors are pre-normalized, obviating the need to repeatedly calculate two square roots.
Even if the vectors are NOT normalized, you don't always need to normalize them.

Add in nd indices and the costs tend to be very small.