Hacker News
by remram 2114 days ago
This is not what people write in Python or PHP, but this is what people write in C extensions for Python or PHP. Having your JIT be that fast allows you to forgo those extensions and write the low-level hot loops in the same language, and that's a huge improvement.

You usually don't care how your matrix multiplication/regex matching/unicode normalization/JSON parsing is implemented, but people had to make those, and they are users of the language too.

That matters even though it might not change the bottom line for your high-level app.


Well, the problem is that Python and PHP are actually bad languages for expressing code like that. For expressing C. For one, they're not statically typed.

Julia is a dynamic language that seems to do better because it was designed for this purpose.

But writing the hot loops in the language itself doesn't seem to have panned out in practice in Python or PHP, as far as I know. Those languages have huge piles of C, and whenever you call into C, the JIT gets confused. People don't seem to rewrite their huge piles of C in Python or PHP. In Python, it's more likely Cython.

I'd like to see pointers to counterexamples -- where people actually wrote some C-like code in Python or PHP and let the JIT do its work. I haven't seen it, aside from the PyPy project itself, and maybe a few other examples. I think you would still take a significant performance hit.

The issue is that C compilers in 2020 are even better at compiling the example I showed. They do amazing things with that kind of code that state-of-the-art JITs don't in practice.
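
The example referred to above is not reproduced in this part of the thread. As a hypothetical stand-in, "C-like code in Python" could look something like the sketch below: an explicit indexed loop over typed arrays, with no high-level helpers. The function name `dot` and the sample data are assumptions made for illustration only.

```python
from array import array


def dot(xs, ys):
    """Dot product written as an explicit indexed loop, C style."""
    total = 0.0
    for i in range(len(xs)):
        total += xs[i] * ys[i]
    return total


# array("d", ...) stores unboxed C doubles, the closest stdlib analogue to double[].
xs = array("d", [1.0, 2.0, 3.0])
ys = array("d", [4.0, 5.0, 6.0])
print(dot(xs, ys))  # 32.0
```

This is the kind of loop a tracing JIT is meant to speed up, and also the kind a C compiler can autovectorize.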

It’s not that the JIT gets confused, it’s that the C APIs for these languages can do almost anything - even stuff that you can’t normally do in the language. So you are faced with a giant optimization boundary.

However a call to a shared library that isn’t linked against your language API is not very expensive as you have a much better handle on the values that are escaping and can make much better optimization choices.

In the Truffle project we are using an LLVM bitcode interpreter that allows us to JIT right through that language boundary and still link to native shared libraries. This means people shouldn’t have to rewrite their C extensions and we can hopefully still run the combination of high level language and C extension faster.

That optimization boundary seems like it's much more of a problem for TruffleRuby than it is for language-specific native implementations? IIRC TruffleRuby relies a lot on being able to optimize away Ruby objects and frames and there's quite a performance cliff if you have to materialize full escaping objects?

JSC and LuaJIT have simpler ways to deal with calling native code which might do weird stuff.

>Well, the problem is that Python and PHP are actually bad languages for expressing code like that. For expressing C. For one, they're not statically typed.

Well, C hardly is, either...

The Psyco project (now dead) used to get very reasonable speedups (factors of several) in pure Python code, particularly for numeric algorithms. It was retired because PyPy was being developed and was expected to solve all speed problems. I wonder why this approach worked while other Python JITs did not.

C is frequently 100x faster than Python for code like the example I showed. With autovectorization and other optimizations it can be 500x.

So if a Python JIT does 10-50x better than CPython on a numeric workload, that sounds impressive, but it's still slow compared to C.
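
To make the arithmetic explicit, here is a small sketch using only the rough figures quoted above. It assumes both speedups are measured against the same CPython baseline on the same workload. The helper name `remaining_gap` is hypothetical.

```python
def remaining_gap(c_speedup, jit_speedup):
    """How many times slower than C the JIT-compiled code still is.

    Both arguments are speedups over the same CPython baseline.
    """
    return c_speedup / jit_speedup


# Figures from the comment: C at 100x or 500x, the JIT at 10x to 50x.
for c in (100, 500):
    for jit in (10, 50):
        print(f"C {c}x, JIT {jit}x: still {remaining_gap(c, jit):.0f}x slower than C")
```

Under these assumptions the JIT-compiled code ends up somewhere between 2x and 50x slower than C.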

And again they don't get 10-50x on string/hash/method call workloads. I think they're lucky to get 2x in some of those cases.

Actually a lot of string stuff is just calling into C, and can be as fast as C (and often is).

Just calling into C doesn't give the performance you’re after a lot of the time. The compiler needs to be aware of the properties of strings. Usually they're implemented as either dedicated opcodes or intrinsics. You can see the simple ones in LuaJIT here https://github.com/LuaJIT/LuaJIT/blob/ff1e72acead01df7d8ed0f...