Hacker News
by jerf 3643 days ago
Summing a list of numbers is easy mode for a JIT. You've got a tight loop with one type, and it can be shown statically, in real time, that the type will never be violated. Unfortunately, unless that's actually your workload, the speed with which a JIT-based system can add numbers is not relevant to how fast it runs in practice. Any JIT that can't tie C on that workload is broken somehow.
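
For concreteness, here is a minimal sketch of the kind of loop being described. The thread does not show the original benchmark, so the function name `sum_floats` and the input size are assumptions made purely for illustration.

```python
def sum_floats(values):
    """Sum a list of floats with an explicit loop.

    Every element has the same type, so the loop body reduces to a
    single float addition, which is the "easy mode" case for a JIT.
    """
    total = 0.0
    for v in values:
        total += v
    return total


if __name__ == "__main__":
    # Hypothetical input: one million floats of a single type.
    data = [float(i) for i in range(1_000_000)]
    print(sum_floats(data))
```

Timing this under CPython and PyPy would mostly measure how well each handles one monomorphic loop, which is the limitation raised above.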

Personally, I think people often go quite overboard with the "benchmarks are useless" idea, but this benchmark really is useless, because it will never produce any differences between JITs and thus can't show whether one is good or bad.


> it will never produce any differences between JITs and thus can't show whether one is good or bad

It can tell you which JITs can't even manage to remove the loop, which is useful to know.

Apparently none of CPython, PyPy, or gcc manages to remove the loop in this case. I actually think it is interesting that this "slow" code in CPython is within [ed: ~10x] of PyPy/JIT/machine code. (C++ probably should do better. I'm not all that familiar with gcc; maybe -O3 isn't enough to make it unroll the loop and/or vectorize it.)

Actually, code like this arguably should be a win for a high-level language with an optimization pass; ideally the whole thing should be translated to a constant at compile time.
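
To illustrate what "translated to a constant" could mean, here is a sketch that assumes the loop sums the integers 0 through n - 1 into a float accumulator. That input is an assumption; the thread does not say what the benchmark actually sums. The names `sum_by_loop` and `sum_closed_form` are hypothetical.

```python
def sum_by_loop(n):
    """What the benchmark is assumed to do: n additions at run time."""
    total = 0.0
    for i in range(n):
        total += i
    return total


def sum_closed_form(n):
    """What an ideal optimizer could emit instead: no loop at all.

    Uses the closed form 0 + 1 + ... + (n - 1) = n * (n - 1) / 2.
    With n known at compile time this is a single constant.
    """
    return float(n * (n - 1) // 2)


if __name__ == "__main__":
    n = 1000
    print(sum_by_loop(n), sum_closed_form(n))
```

The two agree exactly here only because every partial sum is an integer small enough to be represented exactly as a double; that caveat is the crux of the follow-up about the accumulator type.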

Ah, right, I think that's because the accumulator is a double; I missed that. I think it should still be possible, but compilers probably don't bother.
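
One likely reason a double accumulator gets in the way: floating-point addition is not associative, so reordering the additions (which folding or vectorizing the loop would require) can change the result. A small sketch, with the helper names `sum_left_to_right` and `sum_right_to_left` made up for illustration:

```python
def sum_left_to_right(values):
    """Add in source order: ((a + b) + c) + ..."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_right_to_left(values):
    """Add in reverse order: a reordering an optimizer might want."""
    total = 0.0
    for v in reversed(values):
        total += v
    return total


if __name__ == "__main__":
    values = [0.1, 0.2, 0.3]
    # The two orders round differently, so the results are not equal.
    print(sum_left_to_right(values) == sum_right_to_left(values))
```

A compiler that preserves the exact result generally has to keep the additions in order unless told otherwise (for gcc, for example, via `-ffast-math`).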

That I don't necessarily expect from a JIT in real time. I'd expect it from any half-decent optimizing compiler, but I'd expect it to likely be the result of several interacting and too-expensive-for-real-time optimizations.

I mean, if the JIT works that out, great, and if someone wants to show off performance numbers that show one can do that, I'm interested in the information, but I wouldn't in general discard one for failing to notice that optimization.