| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by hackerlight 1416 days ago
	Java is pretty fast. Second most popular language in HFT. Can get it to a few tens of micros. Not as fast as C++ at sub 5 micros. So good enough for many latency sensitive apps.

2 comments

_gabe_ 1416 days ago

> Not as fast as C++ at sub 5 micros.

Try sub 5 nanos. I was curious awhile ago at how fast C++ hash set lookup was compared to C#, and it consistently performed a lookup at 1 nanosecond. I tested with up to 6GB of data and then stopped because it was taking longer to generate random data then it was to run the benchmark 10,000 times.

C++ benchmarks here[0]. It's a bit more complicated then just a pure lookup since I was pulling some code out of a larger app, but the benchmark is only measuring the lookup speed. I did the C# benchmarks with BenchmarkDotNet or something like that, I can never remember the exact name.

[0]: https://gist.github.com/ambrosiogabe/66a6e2fdc77e6a600e570f4...

link

titzer 1416 days ago

> and it consistently performed a lookup at 1 nanosecond.

TBH I'm skeptical that you are measuring what you think you are measuring. There are a lot of micro-benchmarking pitfalls, like dead code elimination, loop-invariant code motion, unrolling, and other issues. Unless you actually looked at the machine code coming out of the compiler, you're measuring something you don't understand. E.g. 1 nanosecond is roughly 3-6 instructions. That 100% means the hash lookup has been inlined into the benchmarking loop.

Are your hashtables mostly empty? Really small? Lots of easy hits (or easy misses)? Because the slow cases (actually looking up) are going to be hairier and may not be inlined.

Did you benchmark against Java's HashMap? Because it is also very, very, very fast for simple cases.

link

_gabe_ 1416 days ago

It looks like caching definitely skewed the results a bit. You can take a look at the linked code yourself. Worst case was still only around 80 nanoseconds which is definitely slower, but still orders of magnitude faster than "sub 5 micros".

Don't take my word for it though, you can take a look at the Robin Hood benchmarks[0]. Robin Hood unordered map is a competitive hash map that's performed much better than the STL for me in many cases. They average a 4 nanosecond lookup speed for a hash map with 2000 elements and an integer key.

> Did you benchmark against Java's HashMap?

I benchmarked against C#, which has a runtime that performs similar if not better than the JVM. The C# code was a ~~few microseconds~~ around 130 nanoseconds. Which is still very fast, but up to 100x slower. (And yes, this was after warming up the code. I used benchmark dot net[1] here.). This is a really easy benchmark to set up. If you doubt me you can write a couple of benchmarks in under an hour and compare yourself.

[0]: https://martin.ankerl.com/2019/04/01/hashmap-benchmarks-04-0...

[1]: https://benchmarkdotnet.org/articles/overview.html

link

titzer 1416 days ago

Did I read that right? The C# version is computing SHA256 hashes?

link

_gabe_ 1416 days ago

That's a link to the benchmark framework that I used. The C# benchmark are in a separate gist that I didn't feel like digging up. This is the C# benchmarks[0]. All the interfaces and indirection is the result of me adapting this from a separate comment. But the benchmark is just testing `HashSet.TryGetValue`.

Edit: I just re-ran the benchmarks because I didn't have the results pasted in the snippet (which I've now done so I don't keep getting this wrong haha). The C# HashSet takes around 130 nanoseconds, not microseconds. So it's not orders of magnitude slower, but it is still more than a 2x slow down and up to a 100x slow down in the case of an integer key.

[0]: https://gist.github.com/ambrosiogabe/ba6bd0fa80588c2fd2ca26d...

link

hackerlight 1416 days ago

I'm talking about an end to end HFT system in C++.

link

bitcharmer 1416 days ago

All our market data feed handlers are sub 5 micro in the 99th. All in Java.

link