by bluuewhale
172 days ago
Author here! I wanted to share a follow-up to this post.
https://bluuewhale.github.io/posts/further-optimizing-my-jav...
This time I went back with a profiler and optimized the actual hot path. A huge chunk of time was going to Objects.equals() because of profile pollution / missed devirtualization. After fixing that, the next bottleneck was ARM/NEON “movemask” pain (VectorMask.toLong()), so I tried SWAR… and it ended up faster (even on x86, which I did not expect).
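For anyone who hasn't seen SWAR ("SIMD within a register") before, here is a rough sketch of the general idea: pack 8 control bytes into one long and find the lanes equal to a target byte with plain integer ops, so no vector mask has to be moved into a scalar register. This is not the code from the post; the class and method names (SwarMatch, matchByte, firstLane) are made up for illustration, and the 8-byte group width is an assumption.

```java
// Hypothetical illustration of SWAR byte matching; not taken from the linked post.
final class SwarMatch {
    private static final long LSB  = 0x0101010101010101L; // 0x01 in every byte lane
    private static final long LOW7 = 0x7F7F7F7F7F7F7F7FL; // low 7 bits of every lane

    // Returns a word with 0x80 set in each byte lane of `group` that equals `b`,
    // and 0x00 in every other lane. This variant is exact (no false positives).
    static long matchByte(long group, byte b) {
        // Broadcast b to all 8 lanes and XOR: matching lanes become 0x00.
        long x = group ^ (LSB * (b & 0xFFL));
        // Per lane: high bit of t is set iff the low 7 bits of x are non-zero.
        // The sum is at most 0xFE per lane, so nothing carries across lanes.
        long t = (x & LOW7) + LOW7;
        // OR in x (covers a set high bit) and LOW7 (fills the low bits), then
        // invert: only lanes where x was 0x00 keep their high bit.
        return ~(t | x | LOW7);
    }

    // Index (0 = least significant byte) of the first matching lane.
    // Only meaningful when mask != 0.
    static int firstLane(long mask) {
        return Long.numberOfTrailingZeros(mask) >>> 3;
    }

    // Clears the lowest matching lane so a caller can loop over all matches.
    static long clearFirst(long mask) {
        return mask & (mask - 1);
    }
}
```

A probe loop would typically call matchByte once per group, then walk the result with firstLane / clearFirst and compare the full key at each candidate slot.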
|
You can see the rest of that file, the adjacent `raw_hashtable.h` for the rest of the SwissTable-like implementation, and `hashing.h` for the hash function.
FWIW, it consistently outperforms SwissTable in some respects, but it uses a weaker, faster hash function that is good enough for the hash table and not suitable for other use cases.