by jandrewrogers 8 days ago
I’ve done performance-engineering for decades in Java, C++, and C for both data analytics and supercomputing/HPC. Java performs significantly worse than C++ in all cases without exception. This is the result you should expect from first principles; something has gone horribly wrong with your software optimization if Java is faster than C++ or even Rust.

There are good reasons to use Java in environments that care about performance. Absolute performance can be traded for other concerns while still being good. It is why I did so much performance-engineering work in the language.

Most performance is architectural in nature. Extremely granular control of scheduling is a prerequisite. System languages provide that control if you want it, Java does not.

When you design software in Java, you accept that some software architectures are not available to you. If you care about performance, you would not port a software architecture optimized around the limitations of Java to a systems language.


> I’ve done performance-engineering for decades in Java, C++, and C for both data analytics and supercomputing/HPC. Java performs significantly worse than C++ in all cases without exception.

I've done similar work (not supercomputing/HPC, but yes for soft and hard realtime software, including safety-critical software) and I couldn't disagree more. Of course, we didn't get to write every program in both Java and C++, but the main question was how much effort it took to achieve the required performance. Over multiple projects it was clear that hitting the performance targets was, on the whole, significantly easier in Java.

> This is the result you should expect from first principles; something has gone horribly wrong with your software optimization if Java is faster than C++ or even Rust.

Strong disagreement here, but we need to be specific about what we mean when we say performance.

It is undoubtedly true that for every Java program there exists a C++ program with the same performance, and the proof is simple: every Java program is a C++ program with the classes being input. But that C++ program is close to 2MLOC long. The same could also be said about a C++ program vs. an Assembly program, as every C++ program could be written as an Assembly program.

But when I talk about performance, I refer to what I think most programmers care about when it comes to performance. Not how fast can a program hypothetically be given enough effort and expertise, but how fast can my program be in my budget.

Both speculative compiler optimisations and memory management optimisations are simply not an option for low level languages due to their constraints, and they are very powerful global optimisations. Given a lot of expertise and effort (that must continue throughout the software's lifetime, and often increases as it evolves) you can work around these limitations, but Java was designed so that you can benefit from them, which means more performance per unit of effort.

In large programs more general constructs (e.g. dynamic dispatch) and patterns (concurrency, great variance in object lifetime) grow in prevalence, and low level languages require more effort and discipline to work around their shortcomings in these areas. Optimising JITs that allow aggressive speculative optimisations and moving collectors were invented and adopted to address these shortcomings. You could claim that the advanced mechanisms that were developed to address C++'s performance issues have failed to achieve their goal, although it won't be easy and much of it comes down to empirical questions of which patterns arise more or less frequently in software, but given that this is what these mechanisms were at least intended to achieve, you certainly can't claim that they fail to do so "from first principles". Some compilation optimisations need speculation; some memory management optimisations need moving pointers. Not having these optimisations available in a program you can write without a lot of special effort cannot make it faster "from first principles".
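To make the dynamic-dispatch point concrete, here is a minimal sketch. The `Shape`, `Square`, and `DispatchSketch` names are made up for illustration and are not from any real codebase. The source contains an interface call in a hot loop; if only one implementation is ever observed at that call site, a speculating JIT may inline it behind a cheap type check. Whether a given JVM actually does so depends on its heuristics, so treat this as an illustration of the pattern rather than a benchmark.

```java
import java.util.List;

// Hypothetical types for illustration only; none of these names come from the thread.
interface Shape {
    double area();
}

final class Square implements Shape {
    private final double side;

    Square(double side) { this.side = side; }

    @Override
    public double area() { return side * side; }
}

class DispatchSketch {
    // s.area() is a dynamic (interface) dispatch in the source.
    // If the runtime only ever sees Square at this call site, a speculating
    // JIT may inline Square.area() behind a type check (an assumption about
    // typical JIT behaviour, not a guarantee for any particular JVM).
    static double totalArea(List<Shape> shapes) {
        double sum = 0.0;
        for (Shape s : shapes) {
            sum += s.area();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Shape> shapes = List.of(new Square(1.0), new Square(2.0), new Square(3.0));
        System.out.println(totalArea(shapes)); // 1 + 4 + 9
    }
}
```

An ahead-of-time compiler that cannot prove `Square` is the only implementation generally has to keep the indirect call, unless it is given profile data or whole-program information.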

So no, I don't believe at all that something has to go wrong for a Java program to be faster than a C++ program given a certain budget for the program. Indeed, in larger, more complex programs, I believe the very opposite is true. In most situations, if you get the same performance in C++ as you do in Java, then something has gone terribly wrong with your Java program.

As someone who's worked on a pretty famous JVM feature (virtual threads), I can tell you that we and the designers of low-level languages consciously make different performance tradeoffs because we optimise for different programs and people, and have different preferences when it comes to average case vs. worst case, but there is no universal dominance in performance to either one of these approaches over the other.

One obvious example was our decision to remove Unsafe from Java. Some Java developers voiced opposition, citing a program speed competition (the "one-billion-row challenge" [1]) where Unsafe improved the performance of an entry (which was later cloned and tweaked by others) by 25%. But we saw it as further motivation for the decision. Among over a dozen performance experts who submitted entries, only one was able to write a program efficient enough for Unsafe to make a big difference, and the variance in the results even among the top 20 or so entries was larger than Unsafe's improvement. By removing Unsafe, we would harm that one expert's program, but it would allow us to perform more aggressive constant-folding optimisations that would result in much greater performance improvements over the entire ecosystem. Even from a design philosophy perspective alone, this removal of control to the detriment of some programs "for the greater good" of performance over the entire ecosystem is almost unthinkable in low level languages, because control is what they're for. Did that decision make Java a faster or a slower language? That depends on how you look at performance.

[1]: https://github.com/gunnarmorling/1brc

If what you are saying is correct, the performance of Java has to be the best-kept secret in the industry. Because you are the only person I've ever heard making such claims seriously.

But this looks more like an apples-to-oranges comparison. You might be talking more about performance in complex business logic, while others are talking about performance in computation.

I can imagine that Java could be faster than C++ or Rust (for the same effort) when the number of distinct active tasks is large. But in more traditional performance-critical work, such as HPC or video game engines, there are usually only a limited number of distinct combinations of performance-critical tasks that can be active at the same time. Even if the codebase itself is huge, the performance-critical subset is simple, and the performance advantages from increased control over the execution are cheap.

> the performance of Java has to be the best-kept secret in the industry

Is it, though? It's the first language of choice for a large number of, if not most, performance-critical applications.

> Because you are the only person I've ever heard making such claims seriously.

Your sources must be very limited, then, because in serious compiler and runtime design and memory management circles this is quite common. There is a debate, but it is an empirical one over whether the circumstances that favour Java over C++ are more or less common in practice or vice-versa. And again, given that it's the first language of choice in most performance-critical applications (and even if you don't believe it's number one, surely you agree it's in the top two or three) one or two more people probably think its performance is at least competitive with C++.

> But in more traditional performance-critical work, such as HPC or video game engines, there are usually only a limited number of distinct combinations of performance-critical tasks that can be active at the same time

I wouldn't say HPC and video game engines are "traditional performance critical work". Not because they're not performance critical, but because the range of performance critical programs is far larger - think bank card transaction processing; think mobile phone routing, and there are many more examples (also, AAA video game engines are indeed very traditional in their design and tech choices, but their performance-sensitivity these days is not so much around CPU-related optimisations but about scheduling the GPU, and their tech choices are much more constrained by the consoles they need to support than by performance).

It sounds like we are not even talking about the same thing when we talk about performance.

HPC and video game engines are examples of traditional performance-critical work. Performance-critical, because they typically run in a resource-constrained environment. (If they don't, the user is likely to request the system to do more work.) And traditional, because it's more about algorithmic performance than system performance. The kind of performance people cared about long before computers became capable enough to run complex software systems.

I would not consider card transaction processing performance-critical. The total number of transactions is very low relative to the amount of resources available to process them.

As for Java, it stopped being a general-purpose language a long time ago. Most people who care about the performance of the software they write don't consider it, because almost nobody in their field uses it or talks about it. If it's actually a good choice for performance-sensitive applications in those fields, the people who are using it have done a good job keeping it secret.

You're right, because I certainly don't consider resource-constrained programs to be the only performance-sensitive applications. I consider an application performance-sensitive when it has severe performance requirements (either on throughput or latency or both) that aren't easily or sufficiently met with horizontal scaling. This typically involves situations where high volumes of data must flow and be processed on the same machine (my own journey with Java began when we ported a large C++ application that did distributed, soft-realtime sensor fusion, synchronised with atomic clocks, to Java, and it was very much performance-sensitive).

If you are running in a resource-constrained environment, you might have no choice but to have complete control over hardware resources, in which case you may need to use a low-level language, but your optimisation budget is very high. A different and more common case is where the hardware isn't too resource-constrained, but the performance requirements aren't easily met, either. In these situations, the performance challenge isn't necessarily to optimise at all costs, but to find a way to meet the performance requirement while staying within budget. In these areas, Java has already displaced C++, and continues to be the first language of choice.

Of course, the people who write such applications (in any language) don't often talk about their architecture, but here's one example when they do: https://www.infoq.com/presentations/java-robot-swarms/ In this case, as in many others, the performance requirements are strict (and aren't easily met with horizontal scaling), but the constraint under which they must be met isn't the hardware but the budget and speed of development/evolution.

More often, the performance challenge is how to get the best performance per unit of effort (while meeting the performance requirements, of course) rather than how to get the last 1-5% of performance at any cost. Or sometimes I put this question as not "how fast can a program be?" but "how fast can I practically make my program?"

The optimisations Java offers are precisely intended to maximise the latter, because that's exactly where low-level languages suffer performance shortcomings. They could get that performance or perhaps better with a lot more effort (that needs to be continuously spent throughout the software's lifetime), but many performance-sensitive applications don't have or would rather not spend the time, money, or expertise to do that, and are looking for the best performance per unit of effort.

> I wouldn't say HPC and video game engines are "traditional performance critical work". Not because they're not performance critical, but because the range of performance critical programs is far larger - think bank card transaction processing; think mobile phone routing, and there are many more examples (also, AAA video game engines are indeed very traditional in their design and tech choices, but their performance-sensitivity these days is not so much around CPU-related optimisations but about scheduling the GPU, and their tech choices are much more constrained by the consoles they need to support than by performance).

In "business oriented" contexts, the usual culprits are database access and serialization/communication overheads. If you use Rust with serde, you get access to one of the fastest ways to turn JSON documents into struct accessible data on the entire planet. The same implementation effort could be spent on any industry-specific data formats.

I am struggling to think of any scenarios where Rust is supposed to be uniquely unsuited and Java would have an obvious win to make the broad and sweeping statements you've made.

If everything you said is true, people would be building JVM backends for C++/Rust the same way LLVM has been used as a backend and there would be constant discussions about JVM vs clang vs gcc. It just doesn't add up.

> If you use Rust with serde, you get access to one of the fastest ways to turn JSON documents into struct accessible data on the entire planet.

Yeah, I see why you'd think that. Most people who choose Rust are coming from JS, Python, or Ruby, and almost no one has written large systems in Rust yet, and that is indeed the main challenge in the kind of programs normally written in JS, Python, or Ruby. In automation control, the bottleneck isn't the DB; in distributed sensor fusion the bottleneck isn't the DB; in telecom routing the bottleneck isn't the DB (I actually don't know what the bottleneck is in transaction processing, but I'm pretty sure it's not just the DB). These are just some areas where Java is the top choice.

> I am struggling to think of any scenarios where Rust is supposed to be uniquely unsuited and Java would have an obvious win to make the broad and sweeping statements you've made.

In all the same places where Java displaced C++ and continues to do so: large systems. I think few even consider Rust, TBH.

> If everything you said is true, people would be building JVM backends for C++/Rust the same way LLVM has been used as a backend and there would be constant discussions about JVM vs clang vs gcc. It just doesn't add up.

First, Java is far more popular than C++ (let alone Rust), so there would be little point (although there is an LLVM backend for the JVM, though I doubt many people use it). The people who want Java's benefits over C++'s benefits have been using Java for a long time now.

Second, you can't have a JVM backend for C++ and Rust and fully enjoy the performance benefits of Java, because the JVM's optimisations are enabled by the language not having the constraints that low-level languages have. The people who just need the performance choose Java anyway, and the people who choose low-level languages choose them because they need the control the JVM doesn't offer.

Low-level CPU-related optimisation is absolutely still a thing. The GPU is always filled to the brim trying to get as much quality as possible out of a graphics frame, so a lot gets offloaded to the CPU. When I was doing this I was doing a lot of low-level CPU optimisation. GPU optimisation was usually more about transform process topology, but there was plenty of low-level work to do there too.

Games are both high-throughput AND low-latency, and C++ is still king there.

C++ is no doubt king in games (for reasons that aren't necessarily primarily performance [1]), but not only are there plenty of high-throughput low-latency applications in Java, I believe there are more than in C++.

BTW, "low latency" is relative, and in most games the relevant latency is the frame, which is usually between 5 and 15 ms. I worked at a place that did large low-latency software, some soft realtime and some safety-critical hard realtime, where the cutoff between Java and low-level was whether the required latency was under 10us (that's microseconds!). That's about three orders of magnitude below what's in games. We did use specialised versions of Java (and specialised kernels), but these days, on normal OSes and plain Java, the cutoff is usually around 1-3ms (although at that point you often need special kernels anyway).

Something that C++ people often don't know is that there's nothing in Java that makes it any harder to compile and run with optimisations at least as good as those offered by C++, but the opposite isn't the case: there are fundamental problems that make it hard to perform some optimisations in C++. Of course, the tradeoff is predictability. Some aggressive optimisations require speculation, which means a fallback to deoptimised (even interpreted) code and then recompilation. In pure compilation and memory management terms, Java has the advantage, but it aims to make the average case faster than C++ at the expense of the worst case.
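The guard-plus-fallback shape of speculation can be modelled by hand. This is only an analogy written in plain Java, not how HotSpot is implemented, and `Op`, `AddOne`, `Doubler`, and `SpeculationSketch` are hypothetical names. The fast path assumes the only type seen so far; the slow path stands in for deoptimisation and is always correct, just slower.

```java
// Hypothetical types; a hand-written analogy of speculation, not JVM internals.
interface Op {
    long apply(long x);
}

final class AddOne implements Op {
    @Override
    public long apply(long x) { return x + 1; }
}

final class Doubler implements Op {
    @Override
    public long apply(long x) { return x * 2; }
}

class SpeculationSketch {
    // Counts how often the speculation failed (stands in for deoptimisation events).
    static int slowPathHits = 0;

    // Models what a speculating compiler might emit for op.apply(x) after
    // having only ever seen AddOne: a cheap guard, the inlined body, and a
    // generic fallback for when the guess turns out to be wrong.
    static long applySpeculated(Op op, long x) {
        if (op.getClass() == AddOne.class) {
            return x + 1;        // "inlined" fast path, no virtual call
        }
        slowPathHits++;          // speculation failed: take the slow path
        return op.apply(x);      // generic dispatch is always correct
    }

    public static void main(String[] args) {
        long a = applySpeculated(new AddOne(), 41);   // fast path
        long b = applySpeculated(new Doubler(), 21);  // slow path
        System.out.println(a + " " + b + " slow=" + slowPathHits);
    }
}
```

The average case (the guard holds) gets cheaper, while the worst case (the guard fails) pays for the check plus the fallback, which is the predictability tradeoff described above.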

[1]: E.g. AAA games are extremely conservative when it comes to technology choices; more conservative than even the military. AAA games often need to target limited consoles where there are few alternatives to C++ available.

I'm not sure I understand what exactly you're talking about. I personally moved away from Java to Rust, because of the obvious and immediate performance benefits and this is possible because Rust manages to stay safe despite the lack of a garbage collector.

I am not the GP poster. I find pron's points interesting even though I work in gamedev on game engines. If you don't mind, I will try to explain why I find them interesting. Since I have not worked on Rust systems, I will stick to C++.

Note his example elsewhere in this discussion of two projects done at the same time in Java and Rust, and the complaint that the Rust system used too many locks. This can happen in C++ too. But why does it not happen in (my) practice? Because C++ evolved to not use locks in large-scale parallel systems. This has been said in main-stage conference keynotes since at least 2013 [1]. So there is "normal C++" and "C++ that works at large scale", and they are not the same C++ language. The performance scales between them are many orders of magnitude. Imho it does not mean that Java is anywhere near the best of what C++ can do. So here we are talking past each other. pron is correct that Java is not bad, and you are correct that you have no reason to leave Rust.

1. https://sean-parent.stlab.cc/presentations/2013-09-11-cpp-se...
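For readers who haven't seen the "no locks" style, here is a rough sketch of the idea, written in Java only to keep one language for the examples in this thread (the same structure applies in C++). `TaskSumSketch` and `parallelSum` are hypothetical names. Each task works on its own slice and accumulates into a local variable, so there is no shared counter to protect; the partial results are combined once at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class TaskSumSketch {
    // Splits the input into chunks. Each task sums its own chunk into a local
    // variable, so no lock or shared mutable state is needed. The calling
    // thread combines the partial sums at the end.
    static long parallelSum(int[] data, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int chunk = (data.length + tasks - 1) / tasks; // ceiling division
            for (int t = 0; t < tasks; t++) {
                final int from = Math.min(data.length, t * chunk);
                final int to = Math.min(data.length, from + chunk);
                parts.add(pool.submit(() -> {
                    long local = 0;
                    for (int i = from; i < to; i++) {
                        local += data[i];
                    }
                    return local;
                }));
            }
            long total = 0;
            for (Future<Long> f : parts) {
                try {
                    total += f.get();
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data, 4)); // 1 + 2 + ... + 1000
    }
}
```

The contrasting design would be a single shared total guarded by a mutex that every worker contends on, which is the kind of structure the "too many locks" complaint is about.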

> The performance scales between them are many orders of magnitude. Imho it does not mean that Java is anywhere near the best of what C++ can do.

I don't think you're aware of where Java is today. Here's a recent talk about some of the issues we're working on now: https://youtu.be/J4O5h3xpIY8

I said that in the past the people who believed Java can't match or exceed C++'s performance were typically those with a lot of low-level programming experience and little or no experience with Java, while today it's mostly people with little experience with low-level programming, but I think you may be in the first group. To people in that group, the question I pose is: what is it exactly that you think makes Java harder to compile in an optimised way than C++? That's not hard to answer for JS or Python, but you'll find that it is hard to answer for Java. (I don't have a question to ask the people in the second group because they are typically people who don't know much about software performance to begin with, don't have any informed intuition about it, and just say nonsensical things like "runtime overhead").

On the whole, the range of optimisations available to our compiler is larger than to a C++ compiler, and we have a wider selection of memory management optimisations, too (this matters mostly in large programs with a wide variety of object lifetimes).

So if you were to ask me why I would speculate that C++ can't be as well-optimised as Java, I could tell you that it's because it can't inline as aggressively and it can't move pointers (due to its constraints and intended domains).

I think an answer for why Java wouldn't be as optimised as C++ could refer to things like "Java has an interpreter" (true, but that design was chosen to support more aggressive speculative optimisations in the compiler), or "Java has moving-tracing GCs" (true, and that was chosen because they offer an optimisation of memory management in a wide variety of situations). The JVM was designed to address specific performance shortcomings of low-level languages; true, they don't result in a win in all situations, and in some they even lose, but these mechanisms were chosen because they do win in many situations.

In general, when we (the JVM's developers) see something that C++ can do faster, we treat it as a performance bug and solve it. What John (the chief JVM architect) is talking about is related to the last area where Java suffers (arrays-of-structs) to which we'll start delivering the solution very soon.

There are some intentional performance-related tradeoffs that both our team and the C++/gcc/LLVM teams make, but they are about offering better or worse performance under different circumstances, and definitely not universally.

As an example I was personally involved with, the C++ team and we intentionally chose different approaches to coroutines that give better performance in some situations and worse in others, and we both opted to prioritise different situations (i.e. situations where cache misses are more or less likely).

In general, C++ offers better performance than Java in some programs, and the opposite is true in other programs. On average, their performance has come closer over the years, each improving the areas where they were weaker.

As to "the best of what C++ can do", it's hard to define, because, as I said, every Java program can be seen as a C++ program, so technically C++ can always match the performance of a Java program given enough effort and expertise. But when talking about performance, what's practically possible matters much more than what's hypothetically possible, and in those programs where Java wins, achieving the same performance in C++ is just far more costly.

But also, given that both languages can and do come close to the maximal hypothetical hardware performance, they're rarely too far apart (unless we're considering warmup time), and they're both very much "anywhere near" each other almost all the time.

As for my experience: yep, I have no Java experience and a long list of C++ projects.

> what is it exactly that you think makes Java harder to compile in an optimised way than C++?

In games C++ is doing some simulations and data delivery for the GPU. Code that does work on the GPU is not mixed with the rest of the C++ code. So invoking Cuda (or the likes) in the middle of computation is a cheat code that Java does not have. Simulations on the CPU need to be efficiently parallel (think 12 hardware threads for last gen or 4-6 threads for smaller platforms) and most likely specialized for hardware SIMD (think AVX2 for last gen or SSE2 like for smaller platforms). To wrangle multi-GB data efficiently, a lot of compression/decompression and data structures are needed. Does Java still have overhead per class instance? It might force designs with arrays of primitive data types that are more verbose.

Add to that per-platform I/O and everything else. It means that games force people to unlearn everything the language ever taught them about standard I/O, and even more so about being cross-platform. In C++ it means something completely different. In C++ you can't trust the language implementation vendor with anything. From your comment I assume that Java teams rely on the language implementation in lots of ways. In C++ being efficient means do it yourself. How efficient is our memory allocation? The answer can only be per engine/project. There is no 'average' because 'vendor provided' is bottom-of-the-barrel quality. No one is improving the vendor-provided one exactly because no one is expected to use it.

In short, there are many different C++s and they are hard to compare. I can't see them being compared to each other, much less to other programming languages like Java. This might not be the answer you wanted, but that's all I have.

> So invoking Cuda (or the likes) in the middle of computation is a cheat code that Java does not have.

It does (and has since JDK 22). But what we're working on now is JIT-compiling Java code to CUDA (not arbitrary code, but certainly code that's suitable for a kernel): https://openjdk.org/projects/babylon/articles/hat-matmul/hat...

> and most likely specialized for hardware SIMD (think AVX2 for last gen or SSE2 like for smaller platforms)

Yep, we've had good SIMD support for a few years now. (https://javapro.io/2026/04/09/java-vector-api-faster-vector-...)

> Does Java still have overhead per class instance? It might force designs with arrays of primitive data types that are more verbose.

That is the last area where Java is still behind but the work on arrays-of-structs (with no headers) is nearly complete. A first release of that is imminent.
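For anyone unfamiliar with the workaround being discussed, here is a small sketch of the two layouts in today's Java. `Particle` and `LayoutSketch` are hypothetical names, and the sketch says nothing about how the upcoming arrays-of-structs feature will look. An array of objects holds references to separately allocated instances, each with an object header; parallel primitive arrays store the values contiguously with no per-element header, at the cost of more verbose code.

```java
// Hypothetical particle data; names are illustrative only.
final class Particle {
    final float x;
    final float y;

    Particle(float x, float y) {
        this.x = x;
        this.y = y;
    }
}

class LayoutSketch {
    // Array of objects: each element is a reference to a separately
    // allocated Particle, and each Particle carries an object header.
    static float sumXObjects(Particle[] ps) {
        float sum = 0f;
        for (Particle p : ps) {
            sum += p.x;
        }
        return sum;
    }

    // Parallel primitive arrays: the x values sit contiguously in memory
    // with no per-element header, but the "particle" is no longer a type.
    static float sumXArrays(float[] xs) {
        float sum = 0f;
        for (float x : xs) {
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 4;
        Particle[] ps = new Particle[n];
        float[] xs = new float[n];
        float[] ys = new float[n];
        for (int i = 0; i < n; i++) {
            ps[i] = new Particle(i, 2 * i);
            xs[i] = i;
            ys[i] = 2 * i;
        }
        // Both layouts hold the same data, so the sums agree.
        System.out.println(sumXObjects(ps) == sumXArrays(xs));
    }
}
```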

> In C++ being efficient means do it yourself

Right, and that's precisely what I meant about low-level languages being optimised for control and not performance. You could do things at such a low level in Java, but the main problem is not the performance but that it's just less convenient than in C++.

Anyway, aside from some outdated (or soon-to-be-outdated) things, what you pointed out is mostly about lack of convenient direct low-level control rather than general performance, and that is exactly when low-level languages can be a better fit.