by peterkelly 1355 days ago
mypyc is cool and all, but I can't help thinking about how Node just JITs everything automatically without the need for any special steps like this.

That's not Node - that's V8. And it's possible to do the same thing for Python - there's nothing magic about JavaScript compared to Python - it's just a lot of engineering work to do it, which is beyond what this project's scope is. PyPy does it, but not inside standard Python.
I'm well aware of V8 and PyPy. I also really like Python as a language, especially with mypy.

It just makes me sad that in a world with multiple high-performance JIT engines (including pypy, for Python itself), the standard Python version that most people use is an interpreter. I know it's largely due to compatibility reasons (C extensions being deeply intertwined with CPython's API).

There is a really important (if not "magic") difference between JavaScript and Python. JS has always (well, since IE added support) been a language with multiple widely-used implementations in the wild, which has prevented the emergence of a third-party package ecosystem heavily tied to one particular implementation. Python, on the other hand, is for a large proportion of the userbase synonymous with CPython, with alternate implementations being second-class citizens, despite some truly impressive efforts on the latter.

The fact that packages written in JS are not tied to (or at least don't work best with) a single implementation is also what made it possible for developers of JS engines to experiment with different implementation approaches, including JIT. While I'm not intimately familiar with writing native extension modules for Node (having dabbled only a little), my understanding is that the API surface is much narrower than Python's, allowing for changes in the engine that avoid breaking APIs. But there is less need for native modules in JS, because of the presence of JIT in all major engines.

> It just makes me sad that in a world with multiple high-performance JIT engines (including pypy, for Python itself), the standard Python version that most people use is an interpreter. I know it's largely due to compatibility reasons (C extensions being deeply intertwined with CPython's API).

This is misleading if one takes the word "interpreter" to mean that code is represented as syntax-derived trees or other data structures which are then traversed at runtime to produce results. Someone correct me if I'm wrong but this would apply to well known interpreted languages like Perl 5. CPython is a bytecode interpreter, not conceptually unlike the Java VM before JITs were added. It just happens to compile scripts to bytecode on the fly.

That's not misleading, that's standard terminology. An interpreter using bytecode is still an interpreter.
Bytecode is just another data structure that you traverse at runtime to produce results. It's a postfix transformation of the AST. It's still an interpreter.
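To make the "postfix transformation of the AST" point concrete, here is a minimal sketch, assuming a toy instruction set (`PUSH`, `ADD`, `SUB`, `MUL`) that I made up for illustration. It is not CPython's real bytecode, just the same idea at small scale: walk the tree, emit operands before operators, then run the flat list with a stack.

```python
import ast
import operator

# Hypothetical toy opcodes, invented for this sketch (not CPython's).
OP_NAMES = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}
OP_FUNCS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}


def to_postfix(node):
    """Flatten an expression AST into a postfix list of (opcode, argument)."""
    if isinstance(node, ast.Expression):
        return to_postfix(node.body)
    if isinstance(node, ast.Constant):
        return [("PUSH", node.value)]
    if isinstance(node, ast.BinOp):
        # Operands first, operator last: that is the postfix order.
        opcode = OP_NAMES[type(node.op)]
        return to_postfix(node.left) + to_postfix(node.right) + [(opcode, None)]
    raise NotImplementedError(type(node).__name__)


def run(code):
    """Interpret the toy bytecode with an explicit value stack."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(OP_FUNCS[opcode](left, right))
    return stack.pop()


bytecode = to_postfix(ast.parse("1 + 2 * 3", mode="eval"))
result = run(bytecode)
```

The `run` loop is still an interpreter: it dispatches on each instruction at runtime, whether the program is stored as a tree or as a flat list.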
Well, ok, but then isn't a CPU also just an interpreter, traversing the object code text of compiled code?
We don't normally call hardware or firmware implementations an 'interpreter'.

Almost all execution techniques include some combination of compilation and interpretation. Even some ASTs include aspects of transformation to construct them from the source code, which we could call a compiler. Native compilers sometimes have to interpret metadata to do things like roll forward for deoptimisation.

But most people in the field would describe CPython firmly as an 'interpreter'.

so you'd call the pre-JIT JVM an "interpreter" and you'd call Java an interpreted language?
> so you'd call the pre-JIT JVM an "interpreter"

Yeah? I think almost everyone would?

> and you'd call Java an interpreted language?

Java is interpreted in many ways, and compiled in many ways; as I said, it's complicated. It's compiled to bytecode, which is interpreted until it's time to be compiled... at which point it's abstract interpreted to a graph, which is compiled to machine code, until it needs to deoptimise, at which point the metadata from the graph is interpreted again, allowing it to jump back into the original interpreter.

But if it didn't have the JIT it'd always be an interpreter running.
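The interpret, compile, deoptimise cycle described above can be sketched very roughly as follows. This is a toy under stated assumptions, not how HotSpot or any real JVM works: the instruction format, the `HOT_THRESHOLD` value, the `TieredFunction` class and the "compiled" tier (a Python lambda standing in for machine code) are all invented for illustration.

```python
HOT_THRESHOLD = 3  # hypothetical: calls before we bother "compiling"
SYMBOLS = {"add": "+", "mul": "*"}


def interpret(program, x):
    """Slow tier: dispatch on every instruction, every call."""
    for op, arg in program:
        if op == "add":
            x = x + arg
        elif op == "mul":
            x = x * arg
        else:
            raise ValueError(op)
    return x


def compile_for_int(program):
    """Fast tier stand-in: fold the whole program into one lambda."""
    expr = "x"
    for op, arg in program:
        expr = f"({expr} {SYMBOLS[op]} {arg})"
    return eval(f"lambda x: {expr}")


class TieredFunction:
    def __init__(self, program):
        self.program = program
        self.calls = 0
        self.compiled = None
        self.deopts = 0

    def __call__(self, x):
        if self.compiled is not None:
            if type(x) is int:  # guard: we speculated that x is an int
                return self.compiled(x)
            # Speculation failed: throw the compiled code away and
            # fall back to the interpreter (deoptimisation).
            self.compiled = None
            self.calls = 0
            self.deopts += 1
        self.calls += 1
        if self.calls >= HOT_THRESHOLD and type(x) is int:
            self.compiled = compile_for_int(self.program)
        return interpret(self.program, x)


f = TieredFunction([("add", 2), ("mul", 3)])
warmup = [f(1) for _ in range(HOT_THRESHOLD)]  # interpreted calls
fast = f(1)                                    # served by the compiled tier
slow_again = f(1.5)                            # guard fails, deoptimises
```

Without the compiled tier, every call would go through `interpret`, which is the "always an interpreter running" case.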

To clarify my comment, I did mean bytecode interpreter.

This is a common implementation approach - parse the source to generate an AST, transform the AST to bytecode, then interpret the bytecode. It's still interpretation, and is slow. Contrast with JIT engines, which transform the intermediate code (whether that's AST or bytecode) to machine code, and are fast.
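You can watch those three steps happen in CPython itself using only the standard library. This is a small sketch; the file name `"<sketch>"` and the variable names are arbitrary, and the exact opcode names that `dis` reports differ between Python versions, so treat the listing as illustrative.

```python
import ast
import dis

source = "result = (1 + 2) * x"

tree = ast.parse(source)                  # step 1: source -> AST
code = compile(tree, "<sketch>", "exec")  # step 2: AST -> bytecode (a code object)

# Inspect the bytecode; opcode names vary across CPython versions.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]

namespace = {"x": 5}
exec(code, namespace)                     # step 3: the interpreter loop runs it
```

Calling `dis.dis(code)` prints the full listing if you want to read it; nothing in this pipeline produces machine code.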

> someone correct me if I'm wrong but this would apply to well known interpreted languages like Perl 5

Perl uses the same execution method you describe for CPython.

This is in the process of being addressed - look into the HPy project
Python is a bit more dynamic than JS, which makes it uniquely hard to optimize. There is more improvement to be made, however, and it is being made.
Right, but I think we know how to optimise all these things. It's all solved problems.
A few things are impossible without changing/subsetting the language. That's what I was trying to get at.
I think it's more that CPython is so slow that a lot of things people use are implemented using the C API, and many optimizations would break a bunch of things. If everything were pure Python the situation would be different.
What things are you thinking of?

(Not trying to interrogate you or prove you wrong, but I've got an interest in optimising very difficult meta-programming patterns.)

Nearly everything (or is it everything?) in memory can be modified at runtime. There are no real constants for example. The whole stack top to bottom can be monkeypatched on a whim.

This means nothing is guaranteed and so every instruction must do multiple checks to make sure data structures are what is expected at the current moment.

This is true of JS as well, but to a lesser extent.

If it's solved, why is Python so slow?
That's what Microsoft is paying Guido for, for the next versions of Python.
I think that's not really the plan - they're talking about just basic template compilation, nothing like V8 https://github.com/markshannon/faster-cpython/blob/master/pl....