by acimim_ha 820 days ago
I don't really see why:

  public class TestJIT {
    public static void main(String[] args) {
        for (int i = 0; i < 20_000; i++) {
            payload(i, i + 1);
        }
    }

    public static int payload(int a, int b) {
        return a + b;
    }
  }
shouldn't be optimized into a 'no-op'. The end-effect is the same.

Things like that often will be, which is why you generally need to use the JMH harness to do microbenchmarking on the JVM. It uses internal APIs to stop the compiler treating results as dead and eliminating them.
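To make the "treating results as dead" part concrete, here is a hand-rolled sketch of the idea. It is not JMH and not how JMH is implemented; SinkDemo, sink, and run are names made up for this example. The accumulated result is written to a volatile field, so the compiler cannot prove the work is unused. It may still simplify the loop in other ways, which is part of why a real harness is the better choice.

```java
// Hypothetical sketch: keep results "live" so the JIT cannot discard them.
// SinkDemo, sink, and run are illustrative names, not part of JMH.
class SinkDemo {
    // A volatile write is an observable effect the compiler must preserve.
    static volatile int sink;

    static int payload(int a, int b) {
        return a + b;
    }

    // Calls payload `iterations` times and publishes the accumulated result.
    static int run(int iterations) {
        int acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += payload(i, 1);
        }
        sink = acc; // the final result escapes, so it is not dead
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(run(20_000));
    }
}
```

Compare with the loop in the question, where nothing ever reads the return value of payload.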

In this case it doesn't happen because seeing that the entire operation is dead requires the compiler to inline payload into main, and the author says he disabled inlining for that method specifically so it wouldn't happen. Recall that the goal is to see the assembly for a block of code in isolation, not to demo what the JVM can do when given free rein.

Well, inlining is not necessary, good old interprocedural analysis would resolve it too.
Sure, but neither javac (the Java to bytecode compiler) nor HotSpot does that. The former tries to preserve as much as possible, and for the latter interprocedural analysis is too costly at run-time.
Could javac do the analysis and record it in the bytecode for HotSpot to optimize? Or is this kind of hybrid teamwork not done?
It is done, but for this case the problem is partial compilation. For this you'd need methods to be tagged as pure, but that assumption needs to propagate and it could be violated by a library being upgraded.
There are two aspects to this:

- If payload is not inlined, the loop can't be optimized away. The iteration itself may be a desirable side effect (spin-wait/pause), and a stricter compiler can't assume otherwise, unlike GCC or Clang.

- If payload is inlined, it should be a no-op. If it's not, and its result is consumed by an opaque "sink" method, there may be limitations.

On interprocedural analysis: don't forget you can dynamically load code and access payload through reflection too. This limits certain optimizations that are otherwise legal in AOT compilation. .NET has similar restrictions and corresponding differences when publishing binaries with JIT vs AOT. The former gets to enjoy DynamicPGO (HotSpot-style optimizations), the latter gets to enjoy a frozen world (with exact devirtualization, faster reflection, auto-sealing, etc., but overall not as good as DynamicPGO with guarded devirtualization, branch reordering, etc.).
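A minimal sketch of the reflection point, with made-up names (ReflectDemo, callByName): the method is looked up by a string at run time, so nothing in the call site tells a compiler ahead of time that payload is the target.

```java
import java.lang.reflect.Method;

// Hypothetical sketch: payload is reached by name at run time, so a
// compiler cannot assume it has seen every caller up front.
class ReflectDemo {
    public static int payload(int a, int b) {
        return a + b;
    }

    // Looks up a public static (int, int) method by name and invokes it.
    static int callByName(String methodName, int a, int b) {
        try {
            Method m = ReflectDemo.class.getMethod(methodName, int.class, int.class);
            return (Integer) m.invoke(null, a, b);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        // The name could just as well come from args or a config file.
        System.out.println(callByName("payload", 2, 3));
    }
}
```

Any whole-program conclusion about who calls payload has to survive calls like this one, or be guarded so it can be undone.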

Early in the article, the author tells the compiler not to inline calls to payload(). When that isn't inlined, the compiler can't tell if the body of the loop has side effects or not, so it won't be able to eliminate the loop.
Why would you want to remove those calculations?

You would change the behavior of this program

How? Except that it would finish sooner. There is no behavior.
There is a side effect like cpu temperature increase

If I put my leg on my desktop tower then I may feel enjoyably warm or if I put some chocolate on my laptop then it may start melting

Also fans will be louder

None of that is behavior according to the java (or any sane) language specification, so it can be optimized out of existence.
It is written in other, more general "specification" called physics

Computers aren't purely abstract, they exist in the real world and are affected by it, so let's not try to pretend otherwise

So according to that "specification" no optimization is allowed. Since that would almost always change the "heating behavior" of code. Therefore, it is absurd.
Yet the compiler writers care only about the language spec, and you can bet that failing to optimize this as dead code would be considered a compiler bug.

This goes not only for the Java compiler, but for many other languages as well.

What exactly do you expect an optimizing compiler to do?
Not sure if you're being facetious, but FWIW you can't really rely on these. Next year you'll get a much faster CPU and memory and the timing will be all different. Or tomorrow you run it while encoding video, which eats 99% of the CPU, and it's a hundred times slower.
Obviously, there is an xkcd for that: https://xkcd.com/1172/