by nostrademons 3397 days ago
HotSpot has had great speeds for numeric computation at least since 2005. I was doing financial software in Java in my first job out of college; our CTO was an ex-Sun architect who literally wrote the book on Java, and the speeds we got on numerical computations were basically equivalent to C.

The part where Java really falls down is in memory use & management, which you can see on the binary-tree & mandelbrot benchmarks, where it's roughly 4x slower than C. There are inherent penalties to pointer chasing that you can't get around. While HotSpot is often (amazingly) smart enough to inline & stack-allocate small private structs, typical Java coding style relies on complex object graphs. In C++ or Rust these would all have well-defined object ownership and be contained within a single block of memory, so access is just "add a constant to this pointer, and load". In Java, you often need to trace a graph of pointers 4-5 levels deep, each of which may cause a cache miss.
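To make the layout point concrete, here is a minimal sketch of the two styles in Java. The names (`LayoutSketch`, `Point`, `sumObjects`, `sumFlat`) are made up for illustration, and this only shows the difference in access pattern; it is not a benchmark and makes no claim about what HotSpot will do with either loop.

```java
import java.util.ArrayList;
import java.util.List;

public class LayoutSketch {
    // Typical Java style: every Point is its own heap object, reached through a reference.
    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    // Walks a list of references; each element access follows a pointer
    // to wherever that Point happens to live on the heap.
    static double sumObjects(List<Point> points) {
        double total = 0.0;
        for (Point p : points) {
            total += p.x + p.y;
        }
        return total;
    }

    // Flat layout (hypothetical alternative): coordinates interleaved in one
    // primitive array, [x0, y0, x1, y1, ...], so access is index arithmetic
    // over a single contiguous block.
    static double sumFlat(double[] xy) {
        double total = 0.0;
        for (int i = 0; i < xy.length; i += 2) {
            total += xy[i] + xy[i + 1];
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000;
        List<Point> points = new ArrayList<>(n);
        double[] xy = new double[2 * n];
        for (int i = 0; i < n; i++) {
            points.add(new Point(i, 2 * i));
            xy[2 * i] = i;
            xy[2 * i + 1] = 2 * i;
        }
        // Both layouts hold the same data and produce the same sum.
        System.out.println(sumObjects(points) == sumFlat(xy));
    }
}
```

The flat version is roughly what a C++ or Rust struct array gives you by default; in Java you have to give up the object abstraction to get it.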

Rule of thumb while I was at Google was to figure on real-world Java being about 2-3x slower than real-world C++.


> The part where Java really falls down is in memory use & management, which you can see on the binary-tree & mandelbrot benchmarks, where it's roughly 4x slower than C.

binary-tree is not useful for comparing GCed and non-GCed languages. For non-GCed languages, you are allowed to use a memory pool of your choice (the C version uses the Apache Portable Runtime library), for GCed languages you are required to use the standard GC with the default settings (no adjustment of GC parameters permitted). This is apples and oranges.

For mandelbrot, the C version uses handcoded SIMD intrinsics, i.e. it's not even portable to non-x86 processors.

> For non-GCed languages, you are allowed to use a memory pool of your choice (the C version uses the Apache Portable Runtime library), for GCed languages you are required to use the standard GC with the default settings (no adjustment of GC parameters permitted). This is apples and oranges.

Doesn't that match how a library would be used in the real world? A C library can create its own memory pool, but a GCed one has to live with however its host is configured.

If I were to run a performance-critical application, I'd definitely tune the GC accordingly. It's why the JVM offers several garbage collectors in the first place, for example.

Also, GCed languages aren't prevented from using memory pools, but often they are not part of their common libraries, because there's less need for them.
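As a rough illustration, a pool in a GCed language can be as small as the sketch below. `PoolSketch` and `Pool` are hypothetical names, not a standard library class, and this is a simple free-list object pool rather than an arena allocator like the one the C benchmark entry gets from the Apache Portable Runtime.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

public class PoolSketch {
    // Minimal single-threaded object pool: reuses released instances
    // instead of allocating a fresh one on every acquire.
    static final class Pool<T> {
        private final ArrayDeque<T> free = new ArrayDeque<>();
        private final Supplier<T> factory;
        private int created = 0;

        Pool(Supplier<T> factory) { this.factory = factory; }

        // Hands out a previously released instance if one is available,
        // otherwise allocates a new one through the factory.
        T acquire() {
            T item = free.pollFirst();
            if (item == null) {
                item = factory.get();
                created++;
            }
            return item;
        }

        // Returns an instance to the pool; the caller must not use it afterwards.
        void release(T item) { free.addFirst(item); }

        // Number of instances actually allocated so far.
        int created() { return created; }
    }

    public static void main(String[] args) {
        Pool<byte[]> pool = new Pool<>(() -> new byte[4096]);
        for (int i = 0; i < 1_000; i++) {
            byte[] buf = pool.acquire();
            buf[0] = (byte) i;
            pool.release(buf);
        }
        // Only one buffer is ever allocated, so the GC sees almost no garbage.
        System.out.println(pool.created()); // prints 1
    }
}
```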

> If I were to run a performance-critical application, I'd definitely tune the GC accordingly.

But you have to tune it for the performance of the whole application (AFAIK); you can't tune it for an individual algorithm like you can with C. It's a one-size-fits-all approach.

1. That goes towards the other point that I made [1] about how microbenchmarks have only limited relevance for the performance of large applications (the performance of memory pools can also change as a result; as an extreme case, multiple large memory pools can lead to swapping).

2. Many GCs allow you to tune performance for individual computations. For example, Erlang allows you to basically start a new lightweight process with a heap large enough so that collection isn't needed and to throw it away at the end; OCaml's GC parameters can be changed while the program is running.

[1] https://news.ycombinator.com/item?id=13747876