|
Speaking as a compiler guy, and having a hand in a few successful commercial JITs: The only reason he thinks they aren't slow is because they haven't yet reached the limits of making the JIT faster vs the program faster.
Yes, it's true that the languages are not slow in the sense of being able to take care of most situations through better optimization strategies.
As a compiler author, one can do things like profile types/trace/whatever, and deoptimize if you get it wrong. You can do a lot. You can recognize idioms, use different representations behind people's back, etc. But all those things take time that is not spent running your program. On average, you can do pretty well. But it's still overhead. As you get farther along in your JIT, optimization algorithms get trickier and trickier, your heuristics, more complex.
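To make the "profile types, then deoptimize if you get it wrong" idea concrete, here is a minimal sketch in Python. Everything in it (`SpecializingAdd`, `HOT_THRESHOLD`, `generic_add`) is a hypothetical name made up for illustration, not how PyPy or any real JIT does it. Note that both the profiling and the guard check are work that is not the program's own work, which is the overhead being described.

```python
from collections import Counter


def generic_add(a, b):
    """Slow path: works for any operand types."""
    return a + b


class SpecializingAdd:
    """Toy call site that profiles operand types and specializes for ints."""

    HOT_THRESHOLD = 3  # hypothetical: int/int calls seen before specializing

    def __init__(self):
        self.profile = Counter()  # observed (type, type) pairs
        self.fast_path = None     # specialized code, once "compiled"
        self.deopt_count = 0

    def __call__(self, a, b):
        if self.fast_path is not None:
            # Guard: the specialized code is only valid for int operands.
            if type(a) is int and type(b) is int:
                return self.fast_path(a, b)
            # Guard failed: throw the fast path away and start profiling again.
            self.fast_path = None
            self.profile.clear()
            self.deopt_count += 1
            return generic_add(a, b)

        # Still profiling: record the types, then run the slow path.
        self.profile[(type(a), type(b))] += 1
        if self.profile[(int, int)] >= self.HOT_THRESHOLD:
            self.fast_path = int.__add__  # stand-in for generated code
        return generic_add(a, b)
```

Calling it with ints a few times installs the fast path; the first call with strings trips the guard and falls back to the generic version.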
You will eventually hit the wall, where optimizing some piece of code means spending more time JIT'ing than doing real work.
This happens to every single JIT, of course.
This is why they try to figure out which code to optimize.
But even then, you may find there is too much of it. In that sense the languages are slower, but the cost is the overhead of better JIT algorithms, not slower generated code. In practice, you hope that you can optimize enough code well enough that nobody cares, because the Ruby code takes 8ms and the C code takes 5ms.
For example: almost all of the allocations and copying can be optimized away, but depending on the language, the algorithms to figure out what you can do safely may be O(N^3).
Also, PyPy is still pretty young in its life cycle (in this iteration of PyPy :P) for folks to say that they could make stuff much faster if they only had a few things.
It really needs a very large set of production apps being run by a very large set of folks for quite a while to see where the real bottlenecks still are.
Past a certain point, you run out of optimization algorithm bullets. The way compilers get the last 20% is by tuning the algorithms for 10 years. Of course, I'm not trying to slag on PyPy; I think they've done an amazing job of persevering through multiple rewrites to get somewhere that seems to be quite good now. I'm just a little wary of a fairly young JIT saying that all big performance problems fall into a few categories. |
The issue with benchmarks is surely well known, also by the PyPy authors; I wonder what the biggest application is that they have benchmarked or that runs on PyPy.
Your point about the JIT compiler interrupting program execution is certainly valid, though it doesn't have to work that way. One could do the code generation in a separate background thread and let execution switch over to the compiled code only once it is ready. But, as you have already said, a latency issue certainly remains. This is one of the cases where interpreters usually have a leg up, and there are promising ways of optimizing interpreters.
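A rough sketch of the background-thread idea, in Python. The names (`BackgroundJit`, `interpret_square`, `compile_square`) are hypothetical and the "compiler" is just a stand-in that returns a lambda. In CPython the GIL means this buys no real parallelism, so treat it as an illustration of the structure only: the caller keeps running the slow version and never waits on the compiler.

```python
import threading


def interpret_square(x):
    """Stand-in for the interpreted (slow) version of a function."""
    return x * x


def compile_square():
    """Stand-in for expensive code generation; returns the 'compiled' version."""
    return lambda x: x * x


class BackgroundJit:
    """Runs the interpreted version until a background compile finishes."""

    def __init__(self, interpreted, compiler):
        self._impl = interpreted           # what calls currently dispatch to
        self._compiler = compiler
        self.compiled = threading.Event()  # set once the switch-over happened

    def start_compiling(self):
        # Compile off the main thread so callers are never blocked on it.
        threading.Thread(target=self._compile, daemon=True).start()

    def _compile(self):
        fast = self._compiler()
        self._impl = fast  # a single reference rebind is the switch-over
        self.compiled.set()

    def __call__(self, x):
        return self._impl(x)
```

Usage would look like `f = BackgroundJit(interpret_square, compile_square)`, then `f.start_compiling()`, with calls to `f(x)` working both before and after the switch.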