|
|
|
|
|
by tialaramex
163 days ago
|
|
Yes, HashMap will by default be randomly seeded in Rust, but also the code you linked intelligently reserves capacity. If h0 is empty, it reserves enough space for all of h1, and if it isn't then it reserves enough extra space for half of h1, which turns out to be a good compromise. Note that the worst case is we ate a single unneeded growth, while the best case is that we avoided N - 1 grows where N may be quite large. |
|
I tried extend() anyway. It didn't work well. Based on your description, extend() implements a variation of preallocation (i.e. Solution II). However, because it doesn't always reserve enough space to hold the merged hash table, clustering still happens depending on N. I have updated the rust implementation (with the help of LLM as I am not a good rust programmer). You can try it yourself with "ht-merge-rust 1 -e -n14m" or point out if I made mistakes.
> HashMap will by default be randomly seeded in Rust
Yes, so it is with Abseil. The default rust hash functions, siphash in the standard library and foldhash in hashbrown, are ~3X as slow in comparison to simple hash functions on pure insertion load. When performance matters, we will use faster hash functions at least for small keys and will need a solution from my post.
> In a new enough C++ in theory you might find the same functionality supported, but Quality of Implementation tends to be pretty frightful.
This is not necessary. The rust libraries are a port of Abseil, a C++ library. Boost is as fast as Rust. Languages/libraries should learn from each other, not fight each other.