| Hash tables are very important data structures in the computer science world. They give you amortized O(1) access to arbitrary elements in a key/value store (often called a map or dict). To implement a hash table, one needs to map a key onto an index in the table - this is done with a hash function: hash(string) ---> number. Here's the problem. If an attacker knows the hash function, she can produce many strings that all give the same number in return. This used not to be a problem, but in the web world it is. It is possible to flood a server (typically one written in Python, Ruby or Perl) with requests crafted so that, for example, all headers end up with precisely the same hash value: hash(any_given_header_in_request) ---> fixed value. This results in hash table collisions, which degrade lookups and inserts from O(1) to O(n) each, and is generally bad. Normal hash functions can't solve this. This problem of maliciously creating hash collisions is called "hash flooding". Siphash is an attempt to solve the problem. It is more than a hash function - it's a cryptographic PRF (pseudorandom function), and that gives you more guarantees than a dumb hash function. Most importantly, it takes two values: a "string to hash" and a "crypto key": siphash(string, crypto_key) --> number. The idea is to generate this "crypto_key" randomly on each program execution, to make sure the attacker can't predict it. Cryptographically speaking, ordinary hash functions may be reversible. There is nothing guaranteeing that they are not. But Siphash is a PRF, and in crypto-speak this means that, without the key, its output can't be told apart from random, so an attacker can't work backwards from hash values to colliding inputs. If you can produce an efficient algorithm to break Siphash - i.e. predict hash values or find colliding inputs without knowing the crypto key - you can write a good paper and be famous. |
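To make the attack concrete, here is a minimal Python sketch, not anyone's production code. Two things in it are assumptions of mine: `weak_hash` is the classic 31-multiplier string hash, used only because its collisions are easy to construct by hand, and `keyed_hash` uses keyed BLAKE2b from the standard library as a stand-in for Siphash, since both take a (string, secret key) pair. The function names are made up for illustration.

```python
import hashlib
import itertools
import os


def weak_hash(s: str) -> int:
    """Unkeyed 31-multiplier string hash; anyone can compute it offline."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def keyed_hash(s: str, key: bytes) -> int:
    """Keyed 64-bit hash. BLAKE2b is a stand-in for Siphash here (assumption)."""
    digest = hashlib.blake2b(s.encode("utf-8"), key=key, digest_size=8).digest()
    return int.from_bytes(digest, "little")


def colliding_strings(blocks: int) -> list:
    """'Aa' and 'BB' share a weak_hash value, so every concatenation
    of `blocks` such pieces collides with every other one."""
    return ["".join(p) for p in itertools.product(("Aa", "BB"), repeat=blocks)]


attack = colliding_strings(8)   # 2**8 = 256 crafted keys
key = os.urandom(16)            # fresh secret on each program execution

weak_values = {weak_hash(s) for s in attack}
keyed_values = {keyed_hash(s, key) for s in attack}

print(len(attack), "inputs ->", len(weak_values), "distinct weak hash value(s)")
print(len(attack), "inputs ->", len(keyed_values), "distinct keyed hash values")
```

With the unkeyed hash, all 256 crafted strings land on one value, so they would pile up in a single bucket. With a random key the attacker cannot precompute such a set, because she does not know which function the server is actually using.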
I may be misunderstanding you, but isn't the point of a (good) cryptographic hash function that you cannot produce multiple plaintexts that give the same value, despite knowing the hash function?