by phs2501 1237 days ago
> I don't know a ton about low level performance, but I'd find it surprising if it's not true. How can interpreted bytecode outperform direct machine instructions?

No production Java/.NET runtime directly interprets bytecode, it gets dynamically compiled into machine code. And as a result it can dynamically _recompile_ it if it discovers runtime profiling patterns that mean a different compilation can run faster, it can dynamically inline functions into the compilation if pertinent, etc.

This does not mean that Java/.NET code _will_ always run faster than native static compilation (a quick gander at the real world shows that optimized production code is usually pretty close either way), but it explains how it _can_ run faster.

3 comments

To take this even further, it's not the code that matters as much but the data access patterns and being able to have structured access patterns in ways that preserve cache locality.

C# actually does really well on this front in that it puts value types front and center, although similar capabilities exist in Java (through either ByteBuffers or sun.misc.Unsafe), if a bit harder to use.
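To make the layout point concrete, here is a minimal Java sketch contrasting an array of object references with the same data packed into one contiguous ByteBuffer. The class and method names (`LayoutSketch`, `Point`, `pack`, `sumPacked`) are made up for illustration, and it only shows the two layouts side by side; it makes no claim about how much faster either one is on a given machine.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LayoutSketch {
    // Object-per-element layout: the array holds references, and each
    // Point lives wherever the allocator happened to place it.
    static final class Point {
        final float x, y;
        Point(float x, float y) { this.x = x; this.y = y; }
    }

    static float sumObjects(Point[] points) {
        float sum = 0f;
        for (Point p : points) {
            sum += p.x + p.y; // one reference to follow per element
        }
        return sum;
    }

    // Packed layout: x and y of element i sit at byte offset i * 8
    // inside a single contiguous buffer.
    static ByteBuffer pack(Point[] points) {
        ByteBuffer buf = ByteBuffer
                .allocateDirect(points.length * 2 * Float.BYTES)
                .order(ByteOrder.nativeOrder());
        for (Point p : points) {
            buf.putFloat(p.x).putFloat(p.y);
        }
        buf.flip(); // limit = bytes written, position = 0
        return buf;
    }

    static float sumPacked(ByteBuffer buf) {
        float sum = 0f;
        for (int off = 0; off < buf.limit(); off += 2 * Float.BYTES) {
            sum += buf.getFloat(off) + buf.getFloat(off + Float.BYTES);
        }
        return sum;
    }

    public static void main(String[] args) {
        Point[] points = new Point[1_000];
        for (int i = 0; i < points.length; i++) {
            points[i] = new Point(i, 1f);
        }
        // Both layouts hold the same values, so the sums agree.
        System.out.println(sumObjects(points) == sumPacked(pack(points)));
    }
}
```

The "harder to use" part shows up right away: the packed version needs manual offset arithmetic that a C# struct array would handle for you.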

I do find it really interesting that there was a whole generation of languages that forgot that memory layout mattered. The typical OO pile of pointers is about as bad as you can possibly imagine.
Their memory layout patterns are no worse than those of the stereotypical C program, which leans on linked lists because C lacks even a proper vector data structure.
> No production Java/.NET runtime directly interprets bytecode, it gets dynamically compiled into machine code. And as a result it can dynamically _recompile_ it if it discovers runtime profiling patterns that mean a different compilation can run faster, it can dynamically inline functions into the compilation if pertinent, etc.

Oh, didn't know that. Interesting, thanks!

> No production Java/.NET runtime directly interprets bytecode, it gets dynamically compiled into machine code.

True, with the caveats that:

1. Compilation occurs while the program runs, using some CPU power

2. Compilation occurs on those parts of the code that pass certain criteria for being a hotspot

3. Because of #2, some CPU power has to be used for profiling continuously as the program runs
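As a rough mental model of caveats 2 and 3, here is a toy Java sketch of counter-based tiering. It is not how any real JIT is implemented: `TieredCall`, the threshold, and the "slow"/"fast" paths are all hypothetical, and a real runtime swaps in compiled machine code rather than a second lambda. It only shows that every cold call pays a small bookkeeping cost until the counter crosses the threshold.

```java
import java.util.function.IntUnaryOperator;

public class TieredCall implements IntUnaryOperator {
    private final IntUnaryOperator slowPath; // stands in for unoptimised code
    private final IntUnaryOperator fastPath; // stands in for optimised code
    private final int threshold;             // hypothetical hotness criterion
    private int invocations;

    TieredCall(IntUnaryOperator slowPath, IntUnaryOperator fastPath, int threshold) {
        this.slowPath = slowPath;
        this.fastPath = fastPath;
        this.threshold = threshold;
    }

    @Override
    public int applyAsInt(int x) {
        if (invocations < threshold) {
            invocations++; // profiling cost paid on every cold call
            return slowPath.applyAsInt(x);
        }
        return fastPath.applyAsInt(x); // "hot": the optimised version runs
    }

    boolean isHot() {
        return invocations >= threshold;
    }

    public static void main(String[] args) {
        TieredCall doubler = new TieredCall(
                x -> { int r = 0; for (int i = 0; i < 2; i++) r += x; return r; },
                x -> x << 1,
                3);
        for (int i = 0; i < 5; i++) {
            System.out.println(doubler.applyAsInt(i) + " hot=" + doubler.isHot());
        }
    }
}
```

Code that never crosses the threshold stays on the slow path forever, which is the flip side of only compiling hotspots.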

It can be faster than AoT compilation, but AoT can apply much more aggressive optimisations, because an AoT compiler can use all available CPUs for as long as it needs. JiT compilers have to balance the processing power spent on compilation against the power left for the actual program.

JiT performs very well on benchmarks because:

1. It's a small piece of code, run serially (thereby leaving one entire other core just for profiling and compilation)

2. That one small piece of code is run tens of thousands of times, triggering the JiT compiler to optimise that "hot spot".

3. The "hot spot" is the only code to run so the entire program is very quickly turned into a native-code program, with the best optimisations that the JiT compiler can perform.
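The benchmark shape described in the list above looks roughly like this in Java: one small method, called serially, many times over. This is only a sketch with made-up names (`WarmupSketch`, `work`, `timeBatches`) and arbitrary batch sizes. The timings depend entirely on the machine and JVM, so no particular numbers are claimed; a serious measurement would use a proper harness.

```java
public class WarmupSketch {
    // Written to a volatile field so the computed results stay observable
    // and the calls cannot simply be discarded.
    static volatile long sink;

    // The one small "hot spot" the benchmark hammers on.
    static long work(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += ((long) i * i) % 7;
        }
        return acc;
    }

    // Runs the same method in consecutive batches and records the
    // wall-clock time of each batch in nanoseconds.
    static long[] timeBatches(int batches, int callsPerBatch) {
        long[] nanos = new long[batches];
        for (int b = 0; b < batches; b++) {
            long total = 0;
            long start = System.nanoTime();
            for (int c = 0; c < callsPerBatch; c++) {
                total += work(1_000);
            }
            nanos[b] = System.nanoTime() - start;
            sink = total;
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeBatches(10, 20_000);
        for (int b = 0; b < nanos.length; b++) {
            System.out.println("batch " + b + ": " + nanos[b] / 1_000 + " us");
        }
    }
}
```

On a typical JIT-compiling JVM you would expect the early batches to be the slow ones, but that is exactly the condition a large, multi-threaded production program does not reproduce.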

In practice (i.e. not benchmarking), the code is large, it doesn't run serially, it uses all cores (especially in performance sensitive applications) and the JiT compiler and continuous monitoring will effectively steal processing power from the program. In benchmarks, the JiT compiler is not using any power that the program would have used.

> And as a result it can dynamically _recompile_ it if it discovers runtime profiling patterns that mean a different compilation can run faster, it can dynamically inline functions into the compilation if pertinent, etc.

There's a lot of "if"s there. They all have to line up perfectly to get well-optimised native code.

> This does not mean that Java/.NET code _will_ always run faster than native static compilation (a quick gander at the real world shows that optimized production code is usually pretty close either way), but it explains how it _can_ run faster.

It can run faster, but that is rare outside of the benchmark environment - it's more likely to run as fast as AoT compiled programs, because the throttling factor in most programs is the data access patterns, not the computation.

In practice, for the usual type of program, you're not likely to notice much of a difference (other than startup time) between AoT programs and JiT programs.

Go’s compiler pretty much just spews out machine code, it barely does any optimizations, so in this case this comparison is meaningless.
.NET has always JIT-compiled all of its code; it never did any kind of interpretation, with the exception of tiny versions of it like .NET Compact Framework.

All Java implementations have flags to JIT without interpretation.

Their major implementations allow for PGO sharing across runs.

Finally, both ecosystems have supported forms of AOT for the last 20 years.