Hacker News
by persnickety 953 days ago
Hash functions are an analogy that falls apart very quickly.

You can recover the original word from the embedding, but not from the hash.

A hash function will return very distant vectors for very similar inputs. An embedding will return similar ones.

2 comments

Pardon the pedantry, but this reflects the casual/conversational use of “hash function”, not the more general definition. To be a hash function, it just has to map a set to another set of fixed-size values (usually some finite set of the natural numbers).

Returning unrelated (distant) hashes for similar inputs is a possible property of a hash function, and oftentimes a desirable one (especially for cryptography), but there are in fact use cases where one wants similar inputs to map to similar (or the same) hash. https://en.m.wikipedia.org/wiki/Locality-sensitive_hashing
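
To make the contrast concrete, here is a rough sketch of a sign-of-projection hash, one simple flavor of locality-sensitive hashing. The four hyperplanes, the 2-D inputs, and the helper names (sign_hash, hamming, sha256_bits) are all made up for illustration; real schemes usually draw many random hyperplanes in a much higher dimension. SHA-256 from Python's hashlib is included only as the cryptographic counterpart.

```python
import hashlib

# Illustrative fixed hyperplanes (an assumption for this sketch);
# real LSH schemes typically draw these at random.
HYPERPLANES = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]


def sign_hash(vec):
    """Map a 2-D vector to a 4-bit string: one bit per hyperplane side."""
    bits = []
    for plane in HYPERPLANES:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


def hamming(a, b):
    """Count positions where two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def sha256_bits(text):
    """SHA-256 digest as a 256-character bit string, for comparison."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return "".join(f"{byte:08b}" for byte in digest)


near_a, near_b, far = (1.0, 0.9), (1.0, 1.1), (-1.0, -0.8)

# Nearby vectors share most bits; the distant one shares none here.
print(sign_hash(near_a), sign_hash(near_b), sign_hash(far))
print(hamming(sign_hash(near_a), sign_hash(near_b)))
print(hamming(sign_hash(near_a), sign_hash(far)))

# A cryptographic hash makes no such promise for near-identical strings.
print(hamming(sha256_bits("embedding"), sha256_bits("embeddings")))
```

Both functions map inputs to fixed-size outputs, so both are hash functions in the general sense. Only the first one tries to keep nearby inputs close.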

Not for perceptual hashes!

Not all hashes are cryptographic hashes.