Hacker News new | ask | show | jobs
by natch 3267 days ago
Standard practice? Someone should tell all the fresh CS grads who keep blogging about their multiple hash algorithms. I've been doing this as standard practice since the 1990s. I'm glad if Computer Science academics are catching up to the working practitioner.
1 comments

> catching up

Not quite. The n-hashing methods described in that paper are rather different from yours. They're significantly faster, for one thing, which matters if you're building e.g. high-performance networking products.

An equally important contribution of the paper though is providing a framework for proving the reliability of such methods. There are areas of software development where "in practice this has worked fine for me" isn't good enough.

I see, I misunderstood your first comment. I took it as you posting a paper about the technique I described, and when you said "This is pretty standard practice now." I thought you meant "this, what natch just said in his comment," as opposed to "this, what the paper said."

So they're still doing multiple hash functions as standard practice, you're saying? Wow.

Yes their methods are rather different I see.

Indeed using multiple salts is slower than two hashes.

What I do now instead of that (the salts method was easier to explain) is just one hash, currently SHA-1 but may move to BLAKE2 soon, and chop the result of that up into 20-bit pieces that I use as hash values, modulo the bit array size. How do you think this would compare to what they do?

> What I do now

Same here (and if you're interested in trying out some new hash functions for performance-related reasons, I'll recommend fast non-cryptographic options like xxHash and MurmurHash). You can still use your chopped-up hashes to generate additional values if needed, depending on the desired error rate.

Do watch out for modulo bias when mapping the chopped-up hashes to the array, though. For max hash value >>> array size the bias is going to be vanishingly small, but if you're clipping the hashes to the nearest power of two greater than the array size, things could get ugly.