| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by titzer 1416 days ago

> and it consistently performed a lookup at 1 nanosecond.

TBH I'm skeptical that you are measuring what you think you are measuring. There are a lot of micro-benchmarking pitfalls, like dead code elimination, loop-invariant code motion, unrolling, and other issues. Unless you actually looked at the machine code coming out of the compiler, you're measuring something you don't understand. E.g. 1 nanosecond is roughly 3-6 instructions. That 100% means the hash lookup has been inlined into the benchmarking loop.

Are your hashtables mostly empty? Really small? Lots of easy hits (or easy misses)? Because the slow cases (actually looking up) are going to be hairier and may not be inlined.

Did you benchmark against Java's HashMap? Because it is also very, very, very fast for simple cases.

1 comments

_gabe_ 1416 days ago

It looks like caching definitely skewed the results a bit. You can take a look at the linked code yourself. Worst case was still only around 80 nanoseconds which is definitely slower, but still orders of magnitude faster than "sub 5 micros".

Don't take my word for it though, you can take a look at the Robin Hood benchmarks[0]. Robin Hood unordered map is a competitive hash map that's performed much better than the STL for me in many cases. They average a 4 nanosecond lookup speed for a hash map with 2000 elements and an integer key.

> Did you benchmark against Java's HashMap?

I benchmarked against C#, which has a runtime that performs similar if not better than the JVM. The C# code was a ~~few microseconds~~ around 130 nanoseconds. Which is still very fast, but up to 100x slower. (And yes, this was after warming up the code. I used benchmark dot net[1] here.). This is a really easy benchmark to set up. If you doubt me you can write a couple of benchmarks in under an hour and compare yourself.

[0]: https://martin.ankerl.com/2019/04/01/hashmap-benchmarks-04-0...

[1]: https://benchmarkdotnet.org/articles/overview.html

link

titzer 1416 days ago

Did I read that right? The C# version is computing SHA256 hashes?

link

_gabe_ 1416 days ago

That's a link to the benchmark framework that I used. The C# benchmark are in a separate gist that I didn't feel like digging up. This is the C# benchmarks[0]. All the interfaces and indirection is the result of me adapting this from a separate comment. But the benchmark is just testing `HashSet.TryGetValue`.

Edit: I just re-ran the benchmarks because I didn't have the results pasted in the snippet (which I've now done so I don't keep getting this wrong haha). The C# HashSet takes around 130 nanoseconds, not microseconds. So it's not orders of magnitude slower, but it is still more than a 2x slow down and up to a 100x slow down in the case of an integer key.

[0]: https://gist.github.com/ambrosiogabe/ba6bd0fa80588c2fd2ca26d...

link