by niftich 2219 days ago
This is a hash designed for non-cryptographic use, like in hash tables or Bloom filters. You can tell by its small output size, which greatly reduces the cost of a brute-force collision search. It's also in the linked readme.
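To make the output-size point concrete, here is a minimal sketch of a birthday-style collision search. It is not the hash under discussion: it assumes SHA-256 truncated to `bits` bits as a stand-in for any well-distributed hash with a small output, and the names `truncated_hash` and `find_collision` are made up for illustration.

```python
import hashlib
from itertools import count


def truncated_hash(data: bytes, bits: int) -> int:
    """Stand-in for a small-output hash: the top `bits` bits of SHA-256."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int):
    """Hash the messages b"0", b"1", ... until two of them share an output.

    Returns (first_message, second_message, number_of_messages_hashed).
    """
    seen = {}  # hash value -> first message that produced it
    for i in count():
        msg = str(i).encode()
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


if __name__ == "__main__":
    for bits in (16, 20, 24):
        a, b, tries = find_collision(bits)
        print(f"{bits}-bit output: {a!r} and {b!r} collide after {tries} hashes")
```

For an n-bit output the search is expected to finish after roughly 2^(n/2) hashes, so a 32-bit hash falls after tens of thousands of tries and a 64-bit hash after a few billion, while a 256-bit output is far out of reach for this kind of generic search.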

Hash function families with a similar target use case include: cityhash, falkhash, farmhash, FNV, meowhash, metrohash, murmur, t1ha, wyhash, xxh.

The SMHasher suite tests hash functions for speed, distribution, bias, and collisions. This function ranks well in those tests.

3 comments

> This is a hash designed for non-cryptographic use

The Readme specifically says "you can modify it to yield 128-bits or more if you want a cryptographically secure hash."

Which is a problematic statement, because it is not designed for cryptography even if you extended the output to 256 bits. It's not the output length that makes it cryptographically secure. (Rather it's the difference between "you won't find collisions by accident" and "you won't find collisions even if you try really hard using very sophisticated math", but there are more requirements.)

This is similar to the issues with a different hash by the same GitHub user posted two weeks ago:

https://news.ycombinator.com/item?id=23103521

Making fast non-cryptographic hash functions is a fun challenge and I appreciate the projects, but please, please do not make any claims about cryptographic properties!

you need others to not make crypto claims, but you're so sure making your crypto claim that it's not cryptographic.

it's always easy to front as an expert spectator and believe others' work is no good. why don't you back it up with some effort?

I believe it is a crypto hash. so..a fun challenge?

how about this for a fun challenge, why don't you post a cryptanalysis and I'll link it from the README.

Thanks for the beamsplitter link!

maybe it is designed as a crypto hash

All of those mentioned have hash flooding issues due to invertibility or lack of diffusion.

I think SipHash is the better choice for non-cryptographic use cases (e.g. hash tables) https://131002.net/siphash/

SipHash was specifically designed to be resistant to hash flooding attacks.
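For anyone curious what that looks like in practice, below is a rough pure-Python sketch of SipHash-2-4 written from my reading of the paper. It is meant for illustration only (it is slow, and the function names `siphash24` and `_sipround` are my own); use a vetted implementation for anything real. The point is the secret 128-bit key: an attacker who does not know it cannot precompute inputs that land in the same bucket.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def _rotl(x: int, b: int) -> int:
    """Rotate a 64-bit value left by b bits."""
    return ((x << b) | (x >> (64 - b))) & MASK64


def _sipround(v0, v1, v2, v3):
    """One SipRound: add, rotate, xor over the four state words."""
    v0 = (v0 + v1) & MASK64
    v1 = _rotl(v1, 13) ^ v0
    v0 = _rotl(v0, 32)
    v2 = (v2 + v3) & MASK64
    v3 = _rotl(v3, 16) ^ v2
    v0 = (v0 + v3) & MASK64
    v3 = _rotl(v3, 21) ^ v0
    v2 = (v2 + v1) & MASK64
    v1 = _rotl(v1, 17) ^ v2
    v2 = _rotl(v2, 32)
    return v0, v1, v2, v3


def siphash24(key: bytes, data: bytes) -> int:
    """SipHash-2-4 of `data` under a 16-byte secret `key`, as a 64-bit int."""
    if len(key) != 16:
        raise ValueError("key must be exactly 16 bytes")
    k0, k1 = struct.unpack("<QQ", key)
    v0 = k0 ^ 0x736F6D6570736575
    v1 = k1 ^ 0x646F72616E646F6D
    v2 = k0 ^ 0x6C7967656E657261
    v3 = k1 ^ 0x7465646279746573

    # Compression: absorb each full 8-byte little-endian word with 2 rounds.
    full = len(data) - (len(data) % 8)
    for off in range(0, full, 8):
        m = int.from_bytes(data[off:off + 8], "little")
        v3 ^= m
        v0, v1, v2, v3 = _sipround(*_sipround(v0, v1, v2, v3))
        v0 ^= m

    # Last word: leftover bytes plus the message length (mod 256) in the top byte.
    last = ((len(data) & 0xFF) << 56) | int.from_bytes(data[full:], "little")
    v3 ^= last
    v0, v1, v2, v3 = _sipround(*_sipround(v0, v1, v2, v3))
    v0 ^= last

    # Finalization: 4 more rounds, then fold the state into one word.
    v2 ^= 0xFF
    for _ in range(4):
        v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
    return v0 ^ v1 ^ v2 ^ v3


if __name__ == "__main__":
    import os
    key = os.urandom(16)  # a hash table would pick this once at startup
    print(hex(siphash24(key, b"hello")))
```

A hash table would generate the key randomly per process (or per table) and keep it secret, so the same string hashes to different buckets on different runs.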

FWIW, the SMHasher test suite takes the view [1] that defense against hash flooding attacks is a concern for the hash table's collision resolution method, which is a fair point. Nonetheless, SipHash was subsequently adopted by several programming languages' standard libraries for use in hash tables. SipHash is also notable for its clear and concise specification, including security claims, preliminary cryptanalysis, and a discussion on hash flooding [2].

[1] https://github.com/rurban/smhasher#security [2] https://eprint.iacr.org/2012/351.pdf

SipHash is considered acceptable even for cryptographic use cases (MACs).

I am using Bob Jenkins's one_at_a_time hash for that purpose.

It is really simple and works with unaligned data.

But it does not do well in benchmarks. I wonder if I should use another one.
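For reference, this is roughly what one_at_a_time looks like, sketched here in Python rather than the original C (the explicit 32-bit masking stands in for C's unsigned overflow). It reads the input one byte at a time, which is why alignment never matters and also a likely reason it trails hashes that consume 4 or 8 bytes per step.

```python
MASK32 = 0xFFFFFFFF


def one_at_a_time(key: bytes) -> int:
    """Bob Jenkins's one_at_a_time hash, returning a 32-bit value."""
    h = 0
    for byte in key:
        # Mix in one input byte per iteration.
        h = (h + byte) & MASK32
        h = (h + (h << 10)) & MASK32
        h ^= h >> 6
    # Final avalanche so the last bytes affect all output bits.
    h = (h + (h << 3)) & MASK32
    h ^= h >> 11
    h = (h + (h << 15)) & MASK32
    return h


if __name__ == "__main__":
    print(hex(one_at_a_time(b"a")))  # 0xca2e9442
```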