| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by FreakLegion 3266 days ago

> catching up

Not quite. The n-hashing methods described in that paper are rather different from yours. They're significantly faster, for one thing, which matters if you're building e.g. high-performance networking products.

An equally important contribution of the paper though is providing a framework for proving the reliability of such methods. There are areas of software development where "in practice this has worked fine for me" isn't good enough.

1 comments

natch 3266 days ago

I see, I misunderstood your first comment. I took it as you posting a paper about the technique I described, and when you said "This is pretty standard practice now." I thought you meant "this, what natch just said in his comment," as opposed to "this, what the paper said."

So they're still doing multiple hash functions as standard practice, you're saying? Wow.

Yes their methods are rather different I see.

Indeed using multiple salts is slower than two hashes.

What I do now instead of that (the salts method was easier to explain) is just one hash, currently SHA-1 but may move to BLAKE2 soon, and chop the result of that up into 20-bit pieces that I use as hash values, modulo the bit array size. How do you think this would compare to what they do?

FreakLegion 3265 days ago

> What I do now

Same here (and if you're interested in trying out some new hash functions for performance-related reasons, I'll recommend fast non-cryptographic options like xxHash and MurmurHash). You can still use your chopped-up hashes to generate additional values if needed, depending on the desired error rate.

Do watch out for modulo bias when mapping the chopped-up hashes to the array, though. For max hash value >>> array size the bias is going to be vanishingly small, but if you're clipping the hashes to the nearest power of two greater than the array size, things could get ugly.