|
|
|
|
|
by natch
3269 days ago
|
|
I see, I misunderstood your first comment. I took it as you posting a paper about the technique I described, and when you said "This is pretty standard practice now." I thought you meant "this, what natch just said in his comment," as opposed to "this, what the paper said." So they're still doing multiple hash functions as standard practice, you're saying? Wow. Yes their methods are rather different I see. Indeed using multiple salts is slower than two hashes. What I do now instead of that (the salts method was easier to explain) is just one hash, currently SHA-1 but may move to BLAKE2 soon, and chop the result of that up into 20-bit pieces that I use as hash values, modulo the bit array size. How do you think this would compare to what they do? |
|
Same here (and if you're interested in trying out some new hash functions for performance-related reasons, I'll recommend fast non-cryptographic options like xxHash and MurmurHash). You can still use your chopped-up hashes to generate additional values if needed, depending on the desired error rate.
Do watch out for modulo bias when mapping the chopped-up hashes to the array, though. For max hash value >>> array size the bias is going to be vanishingly small, but if you're clipping the hashes to the nearest power of two greater than the array size, things could get ugly.