Hacker News new | ask | show | jobs
by asimpletune 2238 days ago
Looks interesting! What is meant by a “universal family”?
2 comments

A universal set of hash functions is a set of hash functions such that randomly choosing any hash function from the set guarantees an upper bound on the number of collisions regardless of which keys from the universe are input to it (which are also random).

Basically it makes it more difficult for an adversary to exploit collisions from your hash function.

I don't get how this could be used. I tried to imagine, and ended up with something wrong. This is what I imagined:

You have a list of hash functions, and choose one at random, then hash a password. Later a hacker gets these hashed passwords, and has an extra hard time? But this wouldn't work for checking passwords because you wouldn't know what hash.

What is a real use case?

I may be wrong, but after doing a bit of research, here's one example:

Alice is storing keys in a hash table. Since this is a hash-table, the hash (H) that Alice will choose must be fast. However, real-world hash-tables will use a relatively small number of bits from the output of H, because even if you have a table sized to 4 billion, that's only 32 bits.

Let's say that Alice does this by taking the lowest N bits of the output of H (this works in practice regardless of which bits Alice uses) where 2^N is the size of the table. N may change as elements are added

Eve wants to mess with Alice by sending a bunch of keys that all have the same bottom M bits, where M is the largest expected value for N. Since the hash H is very fast, this is very computationally cheap to brute-force, particularly if you have access to very parallel hardware like a GPU.

Now consider that instead of using hash H, Alice uses hash-family U. Whenever a hash table is created (or rehashed,) Alice selects a random hash from U. Eve can no longer easily generate keys that will collide in the hash table.

From what I can tell, for password hashing, this is not appreciably better than salting, if the size of the set of possible salts and the size of the set U are the same.

This allows hash tables to have expected insertion and search times of O(1) as well which is as far as I’m aware the major motivation for it. Also the set of functions can be huge and the hash functions in the universe is infinite following the fact that there are infinitely many primes which would mean that brute force quickly becomes impractical for cracking the hash.