|
|
|
|
|
by ComputerGuru
2307 days ago
|
|
Honestly they’re being super mathematical about it but it’s really nothing fancy at all. SHA is designed to be used like this, CloudFlare hasn’t done anything remotely ingenious (and I wouldn’t mind except they go out of their way to talk about how much better their fancy new algorithm is compared to other multi-set intersection theories). E.g. 512-bit SHA-2 and SHA-3 may be truncated at 128 or 256 bits if that’s all the entropy you need (and you don’t need to be compatible with the formal SHA2/SHA3-512/256 spec). Here, CloudFlare is truncating to an intentionally low entropy of just 20 bits, not to reduce the security but rather to intentionally increase the collisions. It’s ultimately just a glorified hash table and as any CS student can tell you, the bucket size is just a function of the hash size (v1 of the api: 128-bit hash, v2 of the api: 20-bit hash). (I don’t like pretension.) |
|
I bet most people who haven't thought about it beforehand or heard the solution, would fail an interview question as "So, we have a database of 100s Million passwords and we want the user to check if his password is inside; without the user actually sending his password.. or his hash; without us outright revealing all our passwords/hashes to the user."
Their solution is a pretty good tradeoff imho.