by jltsiren 8 days ago
If what you are saying is correct, the performance of Java has to be the best-kept secret in the industry. Because you are the only person I've ever heard making such claims seriously.

But this looks more like an apples-to-oranges comparison. You might be talking more about performance in complex business logic, while others are talking about performance in computation.

I can imagine that Java could be faster than C++ or Rust (for the same effort) when the number of distinct active tasks is large. But in more traditional performance-critical work, such as HPC or video game engines, there are usually only a limited number of distinct combinations of performance-critical tasks that can be active at the same time. Even if the codebase itself is huge, the performance-critical subset is simple, and the performance advantages from increased control over the execution are cheap.


> the performance of Java has to be the best-kept secret in the industry

Is it, though? It's the first language of choice for a large number of, if not most, performance-critical applications.

> Because you are the only person I've ever heard making such claims seriously.

Your sources must be very limited, then, because in serious compiler and runtime design and memory management circles this is quite common. There is a debate, but it is an empirical one over whether the circumstances that favour Java over C++ are more or less common in practice or vice-versa. And again, given that it's the first language of choice in most performance-critical applications (and even if you don't believe it's number one, surely you agree it's in the top two or three) one or two more people probably think its performance is at least competitive with C++.

> But in more traditional performance-critical work, such as HPC or video game engines, there are usually only a limited number of distinct combinations of performance-critical tasks that can be active at the same time

I wouldn't say HPC and video game engines are "traditional performance critical work". Not because they're not performance critical, but because the range of performance critical programs is far larger - think bank card transaction processing; think mobile phone routing, and there are many more examples (also, AAA video game engines are indeed very traditional in their design and tech choices, but their performance-sensitivity these days is not so much around CPU-related optimisations but about scheduling the GPU, and their tech choices are much more constrained by the consoles they need to support than by performance).

It sounds like we are not even talking about the same thing when we talk about performance.

HPC and video game engines are examples of traditional performance-critical work. Performance-critical, because they typically run in a resource-constrained environment. (If they don't, the user is likely to request the system to do more work.) And traditional, because it's more about algorithmic performance than system performance. The kind of performance people cared about long before computers became capable enough to run complex software systems.

I would not consider card transaction processing performance-critical. The total number of transactions is very low relative to the amount of resources available to process them.

As for Java, it stopped being a general-purpose language a long time ago. Most people who care about the performance of the software they write don't consider it, because almost nobody in their field uses it or talks about it. If it's actually a good choice for performance-sensitive applications in those fields, the people who are using it have done a good job keeping it secret.

You're right, because I certainly don't consider resource-constrained programs to be the only performance-sensitive applications. I consider an application performance-sensitive when it has severe performance requirements (either on throughput or latency or both) that aren't easily or sufficiently met with horizontal scaling. This typically involves situations where high volumes of data must flow and be processed on the same machine (my own journey with Java began when we ported a large C++ application that did distributed, soft-realtime sensor fusion, synchronised with atomic clocks, to Java, and it was very much performance-sensitive).

If you are running in a resource-constrained environment, you might have no choice but to have complete control over hardware resources, in which case you may need to use a low-level language, but your optimisation budget is very high. A different and more common case is where the hardware isn't too resource-constrained, but the performance requirements aren't easily met, either. In these situations, the performance challenge isn't necessarily to optimise at all costs, but to find a way to meet the performance requirement while staying within budget. In these areas, Java has already displaced C++, and continues to be the first language of choice.

Of course, the people who write such applications (in any language) don't often talk about their architecture, but here's one example when they do: https://www.infoq.com/presentations/java-robot-swarms/ In this case, as in many others, the performance requirements are strict (and aren't easily met with horizontal scaling), but the constraint under which they must be met isn't the hardware but the budget and speed of development/evolution.

More often, the performance challenge is how to get the best performance per unit of effort (while meeting the performance requirements, of course) rather than how to get the last 1-5% of performance at any cost. Or sometimes I put this question as not "how fast can a program be?" but "how fast can I practically make my program?"

The optimisations Java offers are precisely intended to maximise the latter, because that's exactly where low-level languages suffer performance shortcomings. They could get that performance or perhaps better with a lot more effort (that needs to be continuously spent throughout the software's lifetime), but many performance-sensitive applications don't have or would rather not spend the time, money, or expertise to do that, and are looking for the best performance per unit of effort.

>I wouldn't say HPC and video game engines are "traditional performance critical work". Not because they're not performance critical, but because the range of performance critical programs is far larger - think bank card transaction processing; think mobile phone routing, and there are many more examples (also, AAA video game engines are indeed very traditional in their design and tech choices, but their performance-sensitivity these days is not so much around CPU-related optimisations but about scheduling the GPU, and their tech choices are much more constrained by the consoles they need to support than by performance).

In "business oriented" contexts, the usual culprits are database access and serialization/communication overheads. If you use Rust with serde, you get access to one of the fastest ways to turn JSON documents into struct accessible data on the entire planet. The same implementation effort could be spent on any industry-specific data formats.

I am struggling to think of any scenarios where Rust is supposed to be uniquely unsuited and Java would have an obvious win to make the broad and sweeping statements you've made.

If everything you said is true, people would be building JVM backends for C++/Rust the same way LLVM has been used as a backend and there would be constant discussions about JVM vs clang vs gcc. It just doesn't add up.

> If you use Rust with serde, you get access to one of the fastest ways to turn JSON documents into struct accessible data on the entire planet.

Most people who choose Rust come from JS, Python, or Ruby, and almost no one has written large systems in Rust yet, so I see why you'd think that: those are indeed the main challenges in the kind of programs normally written in JS, Python, or Ruby. In automation control, the bottleneck isn't the DB; in distributed sensor fusion the bottleneck isn't the DB; in telecom routing the bottleneck isn't the DB (I actually don't know what the bottleneck is in transaction processing, but I'm pretty sure it's not just the DB). These are just some areas where Java is the top choice.

> I am struggling to think of any scenarios where Rust is supposed to be uniquely unsuited and Java would have an obvious win to make the broad and sweeping statements you've made.

In all the same places where Java displaced C++ and continues to do so: large systems. I think few even consider Rust, TBH.

> If everything you said is true, people would be building JVM backends for C++/Rust the same way LLVM has been used as a backend and there would be constant discussions about JVM vs clang vs gcc. It just doesn't add up.

First, Java is far more popular than C++ (let alone Rust), so there would be little point (although there is an LLVM backend for the JVM, though I doubt many people use it). The people who want Java's benefits over C++'s benefits have been using Java for a long time now.

Second, you can't have a JVM backend for C++ and Rust and fully enjoy the performance benefits of Java, because the JVM's optimisations are enabled by the language not having the constraints that low-level languages have. The people who just need the performance choose Java anyway, and the people who choose low-level language choose them because they need the control the JVM doesn't offer.

Low level CPU-related optimisation is absolutely still a thing. The GPU is always filled to the brim trying to get as much quality out of a graphics frame so a lot gets offloaded to the CPU. When I was doing this I was doing a lot of low-level CPU optimisation. GPU optimisation was usually more about transform process topology but there was plenty of low-level work to do there too.

Games are both high-throughput AND low-latency, and C++ is still king there.

C++ is no doubt king in games (for reasons that aren't necessarily primarily performance [1]), but not only are there plenty of high-throughput low-latency applications in Java, I believe there are more than in C++.

BTW, "low latency" is relative, and in most games the relevant latency is the frame, which is usually between 5 and 15 ms. I worked at a place that did large low-latency software, some soft realtime and some safety-critical hard realtime, where the cutoff between Java and low-level was whether the required latency was under 10us (that's microseconds!). That's orders of magnitude below what's in games. We did use specialised versions of Java (and specialised kernels), but these days, on normal OSes and plain Java, the cutoff is usually around 1-3ms (although at that point you often need special kernels anyway).

Something that C++ people often don't know is that there's nothing in Java that makes it any harder to compile and run with optimisations at least as good as those offered by C++, but the opposite isn't the case: there are fundamental problems that make it hard to perform some optimisations in C++. Of course, the tradeoff is predictability. Some aggressive optimisations require speculation, which means a fallback to deoptimised (even interpreted) code and then recompilation. In pure compilation and memory management terms, Java has the advantage, but it aims to make the average case faster than C++ at the expense of the worst case.

[1]: E.g. AAA games are extremely conservative when it comes to technology choices; more conservative than even the military. AAA games often need to target limited consoles where there are few alternatives to C++ available.

I'm a Java developer now, amongst other languages. The advantage of Java is that it takes A LOT less time to develop something, so there is the whole bang for buck for sure. I have had a few problems where I would love shared direct memory access and some atomics (because it would be a lot easier). But for the most part developing in Java is a lot quicker.

I don't think game developers are more conservative than any other developers. We do have large C++ codebases and so it's hard to change.

All modern engines have a few scripting languages tacked on too.

Something like Lua usually is the sweet spot: most of the people developing scripts are not developers. We even had a Java interpreter for scripting once, but it lost favor for this reason.

There were exceptions, but I found that developers generally preferred C# over Java anyway. Our assets pipelines are generally in C# already.

Any speculative optimisation we were doing by hand. There is the whole deferring allocations / moving allocations, both of which we were already doing (e.g. copying every frame).

A lot of our C++ code is intrinsics (including memory primitives like _mm_stream_ps and barriers) and you HAVE to have good control over how memory is laid out (e.g. knowing that data is split between cache lines so that you don't get contention). Lots of spin locks too. I just don't see how you can do this kind of low level work in Java.

> A lot of our C++ code is intrinsics (including memory primitives like _mm_stream_ps and barriers)

Java has such intrinsics, too: https://docs.oracle.com/en/java/javase/25/docs/api/java.base.... They may not look like intrinsics that compile to a single machine instruction, but they are (I don't think we offer stream access, simply because there hasn't been demand for it; if there is, we can add it. I actually added a streaming array copy to the JVM because I thought I could use it for something, but the results weren't what I expected, so I took it out).
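To make the "barriers" part concrete, here is a minimal sketch. The link above is truncated, so this doesn't assume which API it points to; it uses `java.lang.invoke.VarHandle` from the standard library as one example of barrier-style primitives in plain Java. `Mailbox` and `MailboxDemo` are hypothetical names made up for illustration. The release store publishes a plain field and the acquire load picks it up, without declaring anything `volatile`.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

/** One-shot mailbox: a plain payload published with release/acquire ordering. */
final class Mailbox {
    private int payload;     // plain field, made visible by the flag below
    private boolean ready;   // only accessed through the READY VarHandle

    private static final VarHandle READY;
    static {
        try {
            READY = MethodHandles.lookup()
                    .findVarHandle(Mailbox.class, "ready", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    void publish(int value) {
        payload = value;
        // Release store: the payload write above cannot be reordered after it.
        READY.setRelease(this, true);
    }

    int awaitValue() {
        // Acquire load: once we see true, the payload write is visible too.
        while (true) {
            boolean isReady = (boolean) READY.getAcquire(this);
            if (isReady) {
                return payload;
            }
            Thread.onSpinWait();
        }
    }
}

final class MailboxDemo {
    /** Publishes 42 from another thread and waits for it on the caller. */
    static int run() {
        Mailbox box = new Mailbox();
        Thread producer = new Thread(() -> box.publish(42));
        producer.start();
        return box.awaitValue();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

`VarHandle` also has static fence methods such as `VarHandle.fullFence()` and `VarHandle.storeStoreFence()` for cases where a standalone barrier is wanted rather than an ordered access.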

BTW, here's a list of our intrinsics:

https://github.com/openjdk/jdk/blob/master/src/hotspot/share...

As you might notice, they include SIMD intrinsics offered through https://docs.oracle.com/en/java/javase/25/docs/api/jdk.incub...

> and you HAVE to have good control over how memory is laid out (e.g. knowing that data is split between cache lines so that you don't get contention)

We have the `@Contended` annotation precisely for that: https://github.com/openjdk/jdk/blob/master/src/java.base/sha... You have to use a flag to tell the JVM to respect this annotation, but the people who write high performance code know this: https://www.baeldung.com/java-false-sharing-contended

> Lots of spin locks too.

We have an intrinsic for spin locks: Thread.onSpinWait() https://docs.oracle.com/en/java/javase/25/docs/api/java.base...()
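For anyone who hasn't seen it used, here is a minimal sketch of a test-and-test-and-set spin lock built on `AtomicBoolean` and `Thread.onSpinWait()`. `SpinLock` and `SpinLockDemo` are hypothetical names for illustration, and this is a toy, not a tuned lock (no backoff, no fairness).

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Minimal test-and-test-and-set spin lock. */
final class SpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    void lock() {
        while (true) {
            // Read first and only attempt the CAS when the lock looks free,
            // which limits cache-line traffic while waiting.
            if (!locked.get() && locked.compareAndSet(false, true)) {
                return;
            }
            // Hint to the runtime/CPU that this thread is busy-waiting.
            Thread.onSpinWait();
        }
    }

    void unlock() {
        locked.set(false);
    }
}

final class SpinLockDemo {
    /** Increments a plain counter from several threads under the lock. */
    static long run(int threads, int incrementsPerThread) {
        SpinLock lock = new SpinLock();
        long[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(run(4, 100_000)); // prints 400000
    }
}
```

The counter is a plain `long`; its updates are safe only because every access happens between `lock()` and `unlock()`, whose atomic operations provide the ordering.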

> I just don't see how you can do this kind of low level work in Java.

There's no reason you should if you're not writing high performance code in Java, but the people who write such code in Java know how to do these things in Java.

To be clear, Java certainly doesn't offer as much precise control as a low-level language, but it does offer everything you need for high performance (except array-of-struct, but that will arrive soon). The reason for that is that there's high demand for these constructs because so much of the world's performance-sensitive software is written in Java. Traditionally, not games (which often have to run on platforms for which we don't offer Java) but manufacturing automation, defence, and trading.

> There is the whole deferring allocations / moving allocations, both of which we were already doing (e.g. copying every frame).

Yes, you can certainly do some memory management optimisations in C++, although with some effort (it's especially hard to use some standard library stuff, but when I write high performance code in C++ I don't use std at all). The low-level language that makes it easier is Zig.

> Any speculative optimisation we were doing by hand.

It's hard to do speculative optimisation by hand, unless you're generating code on the fly. The way speculative optimisations work is that we observe that something has been true so far (e.g. think about a specific branch that's always taken or a dynamic dispatch that only hits a certain target at a certain callsite) but the compiler can't prove that it's necessarily true. So we emit machine code that assumes it's true with special traps that would trigger some fault signal if the assumption is invalidated. If the trap is hit, we capture the signal, deoptimise the subroutine and then recompile it differently (without the assumption).
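The guard-and-fallback shape described above can be modelled by hand as a toy. This is only a sketch in Java source with hypothetical names (`Shape`, `Square`, `Circle`, `SpeculativeCallSite`); a real JIT does this in generated machine code and picks the assumption from runtime profiles, whereas here the assumption ("the receiver is always a `Square`") is hard-coded.

```java
interface Shape {
    double area();
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

final class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

/** Toy model of one call site that speculates on a single receiver type. */
final class SpeculativeCallSite {
    private boolean speculating = true;
    private int deoptimisations = 0;

    double area(Shape s) {
        if (speculating) {
            // Guard: the assumption is that every receiver is a Square.
            if (s instanceof Square) {
                Square sq = (Square) s;
                return sq.side * sq.side; // "inlined" fast path, no virtual call
            }
            // Assumption invalidated: abandon the fast path ("deoptimise").
            speculating = false;
            deoptimisations++;
        }
        return s.area(); // general path: ordinary dynamic dispatch
    }

    int deoptimisations() {
        return deoptimisations;
    }
}

final class SpeculationDemo {
    /** Many Squares, then one Circle that breaks the assumption. */
    static int run() {
        SpeculativeCallSite site = new SpeculativeCallSite();
        for (int i = 0; i < 1000; i++) {
            site.area(new Square(2));
        }
        site.area(new Circle(1)); // guard fails here, exactly once
        site.area(new Square(2)); // now served by the general path
        return site.deoptimisations();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 1
    }
}
```

The part this toy cannot do is what makes the real thing hard to replicate manually: discovering the assumption at run time and regenerating the code when it stops holding.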

In C++, what I do is achieve some of the same optimisation results by hand (typically using templates), but of course, they're not speculative and I need to be careful. There are also code size and I-cache implications, but while we try to keep an eye on the I-cache, Java doesn't always get this balance right, either.