by hvidgaard 2951 days ago
I never said anything about SHA256. I talked about one-way functions.

The threat model is an adversary who gets unlimited access to the values stored for this purpose and knows the function used to compute them. He wants to check whether a given email is in the set. One-way functions are provably the best way to answer the yes/no question of whether this email is in the set with no false answers. I have not said anything about using computationally expensive one-way functions, because it does not matter if the function takes 10 seconds to compute: he already knows which emails he wants to check.
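To make that threat model concrete, here is a minimal sketch of the membership check the adversary can run. SHA-256 is only an assumed stand-in for "some one-way function" (the comment above does not name one), and the names `digest`, `stored` and `adversary_check`, the sample addresses, and the lowercase normalization are all hypothetical.

```python
import hashlib


def digest(email: str) -> str:
    # Assumption: SHA-256 stands in for the unspecified one-way function.
    # Lowercasing/stripping is an assumed normalization, not from the thread.
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


# What the system stores: digests only, never the plaintext emails.
stored = {digest(e) for e in ["alice@example.com", "bob@example.com"]}


def adversary_check(candidate: str, leaked: set) -> bool:
    # The adversary has the leaked digests and knows the function, so
    # checking a known candidate costs one function evaluation.
    return digest(candidate) in leaked


print(adversary_check("alice@example.com", stored))
print(adversary_check("carol@example.com", stored))
```

Making `digest` slower only multiplies the cost per candidate; it does not stop a check against a short list of emails the adversary already has in mind.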


You did say hash function.

Would you mind jotting down that proof?

A proper cryptographic hash function is a one-way function, if one-way functions exist.

But I'd frame your question the other way around, then. You do not want to store the emails in a form that leaks any data. For that we need a compressing function. My (unwritten) assumption was that if the adversary compromises the system to get the data, they'll get any secrets too. This means that an HMAC is no better than a cryptographic hash function.
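A rough sketch of that assumption, using Python's standard `hmac` module with SHA-256 as an assumed underlying hash. The key value, the helper names `keyed_digest` and `adversary_check`, and the sample addresses are hypothetical.

```python
import hashlib
import hmac


def keyed_digest(key: bytes, email: str) -> str:
    # HMAC-SHA256 of the email under a server-side secret key.
    return hmac.new(key, email.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical server secret; in this threat model it leaks with the data.
server_key = b"hypothetical-server-secret"
stored = {keyed_digest(server_key, e) for e in ["alice@example.com", "bob@example.com"]}


def adversary_check(candidate: str, leaked: set, stolen_key: bytes) -> bool:
    # With the key in hand, the check is the same as for a plain hash.
    return keyed_digest(stolen_key, candidate) in leaked


# Adversary who got both the digests and the key:
print(adversary_check("alice@example.com", stored, server_key))
# Adversary who got the digests but only guesses at the key:
print(adversary_check("alice@example.com", stored, b"wrong-guess"))
```

The HMAC only helps in the second case, where the key is kept somewhere the attacker cannot reach; under the stated assumption that the secrets leak too, it reduces to the plain-hash check.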

I know that it is quite possible to create a system where this would be significantly harder than just a DB dump, but that is both significantly more difficult and more expensive. I'll admit that the formulation "provably the best we can do" should've had a big fat asterisk with the disclaimer about the threat model.

So, if an attacker has the data set and the secrets, and wants to determine whether a particular input is a member of this set, can you do better than a cryptographic hash function?

Can you write down in pseudocode exactly what you're suggesting? As in db.write(sha256(email)) or whatever.