|
|
|
|
|
by jcelerier
1203 days ago
|
|
Hmm, I know that at least some hash map implementations look for hash functions with an is_avalanching trait and otherwise add an explicit combination step; I'd assume that std::unordered_map implementations do the explicit combination all the time as std::hash was pretty much introduced just to work with such containers |
|
Turns out for this structure hash quality just doesn't matter very much. The identity function is all you need for mediocre results, and a really great hash produces only slightly improved results with this data structure, yet running the hash costs performance so why bother ?
But a modern hash map structure actually relies on that quality for reasonable performance.
https://martin.ankerl.com/2022/08/27/hashmap-bench-01/#bench...
Ask that table about std::unordered_map - notice the fancier hashes make it worse (bigger numbers are worse, relative to 100 is the best) but not that much worse. It's always poor, it just varies as to whether it's very poor.
Now, ask it about absl::flat_hash_map - for Abseil's own hashes the results range from very good to mediocre, and likewise other modern high quality hashes, but for std::hash it's so awful most benchmarks don't finish, shown as - instead of a number.
You can try to detect this, but this is a fundamentally a QoI problem. Why even provide this rubbish hash function in the standard library? This is a lesson the C++ standard library refuses to learn, if you can't do it well then don't become an attractive nuisance by doing it badly.
People are actually less unhappy with Microsoft's stdlib which gives you a poor FNV hash instead of the identity function. Like, OK, this isn't a great hash, but it does basically work (under non-adversarial conditions, FNV has big problems for tailored input) something like Swiss Tables will be substantially slower than it ought to be, but not crazy benchmark failed due to timeout slower. Nobody is cheering, but it doesn't cause open weeping.