by tyler 5649 days ago
It sounds like you're conflating two techniques here. The first (as others have mentioned) is cosine similarity, which measures the cosine of the angle between two vectors. However, the bit about 0s and 1s sounds like you're talking about locality-sensitive hashing (http://en.wikipedia.org/wiki/Locality_sensitive_hashing). LSH is often used to estimate cosine similarity, since computing it exactly for every pair of vectors in a large collection can be quite expensive. I know Google and others are using it for this.
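To make the distinction concrete, here's a rough sketch of one common LSH scheme for cosine similarity, random hyperplane hashing. I'm assuming that's the variant behind the "0s and 1s"; the function names and parameters below are mine, purely for illustration. Each vector gets one bit per random hyperplane (which side it falls on), and the fraction of differing bits between two signatures approximates angle / pi.

```python
import math
import random


def cosine_similarity(a, b):
    """Exact cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def random_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals with Gaussian components.

    The seed is fixed only so the sketch is reproducible.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """One bit per hyperplane: 1 if vec is on its non-negative side, else 0."""
    return [
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    ]


def estimate_cosine(sig_a, sig_b):
    """Estimate cosine similarity from two bit signatures.

    For random hyperplanes, P(bits differ) = angle / pi, so the observed
    fraction of differing bits gives an estimate of the angle.
    """
    differing = sum(1 for x, y in zip(sig_a, sig_b) if x != y)
    angle = math.pi * differing / len(sig_a)
    return math.cos(angle)


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]
    b = [3.0, 1.0, 0.5]
    planes = random_hyperplanes(dim=3, n_bits=2048, seed=42)
    print("exact:   ", cosine_similarity(a, b))
    print("estimate:", estimate_cosine(signature(a, planes), signature(b, planes)))
```

The payoff is that signatures are computed once per vector, and comparing two of them is just counting mismatched bits. More bits give a tighter estimate at the cost of longer signatures.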