by gopalv 3643 days ago
> What you really need to explain is why Python is inefficient.

Python is extremely dynamic and this makes things hard for someone who wants to build a JIT.

The powerful bits of Python metaprogramming make it nearly impossible for a JIT to say with any certainty, across all running threads, that what it is doing is right.

Inlining a simple call like a.x() is rather hard when everything underneath can move around - I am not saying that it always does, but implementing a python variant which is nearly the same isn't very useful.
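To make the "everything underneath can move around" point concrete, here is a minimal sketch in plain Python. The class `A` and the replacement lambdas are made up for illustration; the point is only that the same call site `a.x()` resolves to three different functions over the life of the program.

```python
class A:
    def x(self):
        return "original"

a = A()
first = a.x()    # resolves to the function defined in the class body

# Replace the method on the class at runtime; every instance sees the change.
A.x = lambda self: "patched on class"
second = a.x()

# Shadow it on one instance. Functions are non-data descriptors, so the
# instance dictionary is consulted before the class attribute.
a.x = lambda: "patched on instance"
third = a.x()

print(first, second, third)
```

A JIT that inlined the body of the first `A.x` at that call site would need some guard that fails after either assignment, and either assignment could come from another thread.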

Compare this to PHP, which has fixed method calls (unless you use runkit, which you shouldn't) - a->x() will always be the same method as long as there was an a->x which was valid.

The method will never change once it has been validated.

However, unlike Java, both languages end up not knowing exactly what type "a" will be when the method is being called.

Java also doesn't quite know, but only when the invoke is via an interface. But the engine at least knows exactly how many impls of that interface have been loaded so far (and the bi-morphic case is commonly 1 real impl and 1 mock impl).

But in the case of both PHP and Python, the whole idea of "which object do I have to look up ::x() for?" is an unknown. In PHP's case, you have to look it up once per class encountered, and in Python's case, you also have to verify that someone hasn't replaced it at runtime.
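The difference between "look it up once per class" and "verify nobody replaced it" can be sketched as a per-call-site cache with a guard. This is a toy model, not how any real engine is implemented: `CallSite`, `class_versions`, `set_method` and `call` are all hypothetical names, and the sketch ignores instance-level overrides and changes made to base classes.

```python
class CallSite:
    """Hypothetical cache attached to one `obj.<name>()` call site."""
    def __init__(self, name):
        self.name = name
        self.cache = {}    # class -> (function, class version when cached)
        self.lookups = 0   # number of slow-path lookups performed

# Hypothetical bookkeeping: a counter per class, bumped on every mutation.
class_versions = {}

def set_method(cls, name, fn):
    """Replace a method and invalidate any cached lookups for that class."""
    setattr(cls, name, fn)
    class_versions[cls] = class_versions.get(cls, 0) + 1

def call(site, obj):
    cls = type(obj)
    version = class_versions.get(cls, 0)
    entry = site.cache.get(cls)
    # The PHP-like case only needs the `entry is None` check: one lookup
    # per class seen. The Python-like case also needs the version guard.
    if entry is None or entry[1] != version:
        site.lookups += 1
        entry = (getattr(cls, site.name), version)
        site.cache[cls] = entry
    return entry[0](obj)

class A:
    def x(self):
        return "A.x"

class B:
    def x(self):
        return "B.x"

site = CallSite("x")
results = [call(site, A()), call(site, B()), call(site, A())]
set_method(A, "x", lambda self: "A.x v2")   # forces a fresh lookup for A
results.append(call(site, A()))
print(results, site.lookups)
```

The guard only works here because every mutation goes through `set_method`; a plain `A.x = ...` would bypass the counter, which is exactly the kind of detail a real implementation has to account for.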

There are very nice functional ways around this problem at the bottom end of numeric loops for Python, which makes it great for numeric processing interleaved with generic control flow.

numpy + numba is a great way of limiting all this and getting performance out of a simple loop. And I'd rather use numpy + a python script doing regexes rather than a C program + LAPACK.

But that performance doesn't translate over when you have class/object oriented structures or in general, just multi-threaded web/rpc style code.

1 comments

JavaScript has the same problems, and that hasn't stopped every major JS engine from building a JIT.
JS is single-threaded which makes an enormous difference to actually squeezing performance out of your JIT.

Just building a JIT for a language generally isn't the hard part. Building a JIT that is substantially faster than a bytecode-compiled implementation of the language is what's hard, and how hard that is depends intimately on the semantics of the source language. When I say intimately, I mean every single detail of the language's semantics matters.