HN Mirror


by AlotOfReading 8 days ago

    In general, the tradeoff is between optimisations that help large programs vs optimisations that help small programs.

Do you have concrete examples of large scale Java programs that are significantly more performant than comparable programs in native languages like C++? My understanding was that this dynamic hadn't fundamentally changed much since the 2010s, when Java was able to occasionally edge out a win in 1-2 benchmarks and would lose handily in others. My experience is that large scale Java programs remain a bit of a bear even after significant optimization effort (e.g. Bazel).

There are of course plenty of optimizations the JVM does that aren't possible AOT, but that doesn't imply an automatic win at large scales, as Rust demonstrates.


pron 8 days ago

> Do you have concrete examples of large scale Java programs that are significantly more performant than comparable programs in native languages like C++?

Yes. I was working in a place that made large sensor-fusion applications, air-traffic control applications, and logistical planning, each in the 2-8MLOC range. Over time, we ported all of them from C++ to Java because C++'s performance overheads were too annoying to work around.

Of course, in principle it's always possible to match and perhaps even exceed Java's performance in a low-level language, but in practice it becomes ever more difficult as the program grows (and the cost remains with maintenance forever). The reason is that as programs grow, patterns become less regular (e.g. the variance in object lifetimes grows), the need for concurrency grows (and with it the need for sharing objects among threads and for lock-free data structures), and more general constructs are used (e.g. more dynamic dispatch). Improvements in modern allocators, as well as LTO and PGO, have helped, but not enough to match the extent of optimisations you can do once you're free of the design constraints of low-level control and the focus on the worst case.

Java's thesis (not initially, but from very early on) was to rely on optimisations that can't be effectively employed by low-level languages because of their constraints, such as efficient memory management that benefits from being able to move most pointers in a program, and highly aggressive speculative optimisations (that are nondeterministic and can fail, resulting in deoptimisation). These optimisations tend to be global, and so they don't restrict program structure much, keeping maintenance costs lower, but they do help the average case at the cost of harming the worst case, which is a tradeoff that programs written in low-level languages don't want, and of course, it doesn't give the low-level control that's the entire point of low-level languages. Proving that thesis took a while, and longer in some aspects than others (moving collectors that don't pause were first released to a wide audience three years ago).
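As a rough illustration of the kind of speculation described above, here is a minimal Java sketch. The class and method names (`SpeculationSketch`, `Shape`, `Square`, `Rect`, `totalArea`) are hypothetical and exist only for this example. The call `s.area()` is a dynamically dispatched call; while only `Square` instances reach it, a JIT such as HotSpot's may speculate that the receiver is always a `Square` and inline the call, and it may have to deoptimise once a `Rect` shows up. Whether and when that happens depends on the JVM, its flags, and the workload, so the comments describe what may happen, not what is guaranteed; the program's results are the same either way.

```java
import java.util.ArrayList;
import java.util.List;

public class SpeculationSketch {
    // Hypothetical interface with two implementations, used only for illustration.
    interface Shape { long area(); }

    static final class Square implements Shape {
        final long side;
        Square(long side) { this.side = side; }
        @Override public long area() { return side * side; }
    }

    static final class Rect implements Shape {
        final long w, h;
        Rect(long w, long h) { this.w = w; this.h = h; }
        @Override public long area() { return w * h; }
    }

    // s.area() is an interface call. A speculating JIT may treat this call site
    // as monomorphic while it has only ever seen one receiver class.
    static long totalArea(List<Shape> shapes) {
        long sum = 0;
        for (Shape s : shapes) {
            sum += s.area();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Shape> shapes = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            shapes.add(new Square(2));
        }

        // Phase 1: only Square reaches the call site, so a JIT may inline Square.area().
        long result = 0;
        for (int round = 0; round < 1_000; round++) {
            result = totalArea(shapes);
        }
        System.out.println(result); // 40000

        // Phase 2: a second receiver class appears. If the JIT speculated on Square,
        // it may need to deoptimise and recompile; the result stays correct regardless.
        shapes.add(new Rect(2, 3));
        System.out.println(totalArea(shapes)); // 40006
    }
}
```

An AOT-compiled language can approximate the first phase with PGO or manual devirtualisation, but it generally has to keep a correct fallback path compiled in, since it cannot discard and regenerate code when the guess turns out to be wrong.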

Of course, the differences aren't huge because the hot paths are typically small enough that they can be improved without adding too much cost (and hot paths require some manual optimisation in all languages), but gaining some performance as a side effect of significantly lowering costs is nice.

> There are of course plenty of optimizations the JVM does that aren't possible AOT, but that doesn't imply an automatic win at large scales, as Rust demonstrates.

I don't know what it is that Rust demonstrates given how few large scale projects have chosen it, but I've seen nothing to indicate that it doesn't suffer from the same performance issues as C++ compared to Java. In fact, someone I know who works at one of the world's largest tech companies told me that his team lead really wanted to do something in Rust, so they ported a small-to-medium service from Java to Rust. The result was such a huge performance drop that it wouldn't meet their minimum requirements. They were then forced to spend an additional 6 to 12 months carefully hand-optimising their Rust code until it matched Java's performance, and the result is that all future maintenance will be more expensive. This is the exact same pattern I've seen with C++.

It's interesting that 20 years ago the people who said Java can't beat C++ on performance were experienced low-level programmers who had little or no experience with Java (and they were also right on several axes at the time). Today the people who say that are those with little experience with low-level languages (and who are under the impression that low-level languages are universally fast), but they will eventually learn about their fundamental performance issues just as we did decades ago.

I think that Rust in particular has made people without much experience in low-level programming (among whom Rust has made many more inroads than among those with a lot of experience in low-level programming) believe a certain story, namely that the problem with low-level languages was memory safety and that that was the reason so many large programs switched to Java despite the performance sacrifices they had to make. Now that Rust fixes that problem, they can have their cake and eat it too! In reality, memory safety was indeed one of the several significant problems with low-level languages that Java sought to fix, but another was the performance issues low-level languages suffer from as programs get large (making good performance ever more costly). What you trade away isn't performance (in large programs there might even be a performance gain) but low-level control, as that is what low-level languages are about. That was what they offered back then, and it's still what they offer now. Rust was first designed twenty years ago, back when things still looked a certain way (which is why, IMO, it repeated most of C++'s design mistakes), but these days I think that a better, more modern design of low-level languages is more focused on control, leaving large programs to high-level languages. Lack of memory safety has, without a doubt, been one of the things that made low-level languages less palatable to "ordinary" applications, but it was far from the only one.

Anyway, I'm sure the debate over which is faster, C++ (/Rust/Zig) or Java, will continue. Frankly, given the nature of modern hardware, compiler, and runtime optimisations (where the question of the cost of some individual operation is all but meaningless and our ability to extrapolate from the performance of one program to another is close to nil), it largely comes down to empirical questions such as which program patterns are more or less common in the field and in which domains, as there are code and workload patterns that could give an advantage to either one.