by zzzeek 1362 days ago
> It just makes me sad that in a world with multiple high-performance JIT engines (including pypy, for Python itself), the standard Python version that most people use is an interpreter. I know it's largely due to compatibility reasons (C extensions being deeply intertwined with CPython's API).

This is misleading if one takes the word "interpreter" to mean that code is represented as syntax-derived trees or other data structures which are then traversed at runtime to produce results; someone correct me if I'm wrong but this would apply to well known interpreted languages like Perl 5. CPython is a bytecode interpreter, not conceptually unlike the Java VM before JITs were added. It just happens to compile scripts to bytecode on the fly.
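A minimal sketch of what "compile to bytecode on the fly" looks like from inside CPython, using only the built-in `compile` and the standard `dis` module. The expression and the `<demo>` filename are made up for illustration, and the exact opcode names differ between CPython versions.

```python
import dis

# Compile a source string into a code object, the same kind of object
# CPython builds before running a script. "<demo>" is a placeholder filename.
code = compile("a + b", "<demo>", "eval")

# The raw bytecode is a bytes object stored on the code object.
raw_bytecode = code.co_code

# Disassemble into named instructions (names vary across CPython versions).
opnames = [ins.opname for ins in dis.get_instructions(code)]

# Evaluating the code object runs that bytecode in the interpreter loop.
result = eval(code, {"a": 1, "b": 2})
print(opnames, result)
```

Nothing here emits machine code; the code object is data that the interpreter loop walks through.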

4 comments

That's not misleading, that's standard terminology. An interpreter using bytecode is still an interpreter.
Bytecode is just another data structure that you traverse at runtime to produce results. It's a postfix transformation of the AST. It's still an interpreter.
Well, ok, but then isn't a CPU also just an interpreter, traversing the object code text of compiled code?
We don't normally call hardware or firmware implementations an 'interpreter'.

Almost all execution techniques include some combination of compilation and interpretation. Even some ASTs include aspects of transformation to construct them from the source code, which we could call a compiler. Native compilers sometimes have to interpret metadata to do things like roll forward for deoptimisation.

But most people in the field would describe CPython firmly as an 'interpreter'.

I call it "bytecode interpreted" to distinguish it from traditional parse-tree interpretation such as Perl 5 and others
so you'd call the pre-JIT JVM an "interpreter" and you'd call Java an interpreted language?
> so you'd call the pre-JIT JVM an "interpreter"

Yeah? I think almost everyone would?

> and you'd call Java an interpreted language?

Java is interpreted in many ways, and compiled in many ways; as I said, it's complicated. It's compiled to bytecode, which is interpreted until it's time to be compiled... at which point it's abstract interpreted to a graph, which is compiled to machine code, until it needs to deoptimise, at which point the metadata from the graph is interpreted again, allowing it to jump back into the original interpreter.

But if it didn't have the JIT it'd always be an interpreter running.

I am not too concerned about the word "interpreter"; I'm more concerned about CPython being called an "interpreted language", which implies it works like Perl 5, or about the idea that CPython being an "interpreter" is somehow a problem. Its normal mode of operation works more like pre-JIT Java, with "interpreted bytecode" from .pyc files.
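For the .pyc point, here is a rough sketch using the standard `py_compile` and `marshal` modules. CPython writes these cache files for imported modules; the file names below are illustrative, and the 16-byte header size is an assumption that holds for CPython 3.7 and later.

```python
import marshal
import os
import py_compile
import tempfile

# Write a tiny script to a temporary directory (names are illustrative).
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "hello.py")
    pyc_path = os.path.join(tmp, "hello.pyc")
    with open(src_path, "w") as f:
        f.write("answer = 6 * 7\n")

    # Compile the source file to a .pyc file on disk.
    py_compile.compile(src_path, cfile=pyc_path, doraise=True)

    with open(pyc_path, "rb") as f:
        data = f.read()

# Assumption: 16-byte header (CPython 3.7+); the rest is a marshalled code object.
code = marshal.loads(data[16:])

namespace = {}
exec(code, namespace)  # run the stored bytecode; the source file is already gone
print(namespace["answer"])
```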
Most people don’t make this distinction, and would just say ‘interpreter’. Interpreting bytecode vs an AST is a pretty minor difference. It’s exactly the same data in a slightly different format. The ‘compilation’ is just a post-order linearisation. And whether it is stored in files or not matters even less.
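To make "post-order linearisation" concrete, here is a toy sketch that parses an arithmetic expression with the standard `ast` module, evaluates it by walking the tree, then flattens the same tree into a stack-machine program and runs that instead. The `PUSH`/`BINOP` instruction set and the helper names are hypothetical; this is not CPython's actual bytecode or compiler.

```python
import ast
import operator

# Hypothetical toy instruction set; CPython's real bytecode is different.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def eval_tree(node):
    """AST interpreter: recurse over the tree directly."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        return OPS[type(node.op)](eval_tree(node.left), eval_tree(node.right))
    raise ValueError("unsupported node")

def linearise(node, out):
    """Post-order walk: emit both operands first, then the operator."""
    if isinstance(node, ast.Constant):
        out.append(("PUSH", node.value))
    elif isinstance(node, ast.BinOp):
        linearise(node.left, out)
        linearise(node.right, out)
        out.append(("BINOP", type(node.op)))
    else:
        raise ValueError("unsupported node")
    return out

def run_bytecode(program):
    """Bytecode interpreter: a loop over a flat list with an operand stack."""
    stack = []
    for opcode, arg in program:
        if opcode == "PUSH":
            stack.append(arg)
        else:  # BINOP
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[arg](left, right))
    return stack.pop()

tree = ast.parse("2 + 3 * 4", mode="eval").body
program = linearise(tree, [])
print(eval_tree(tree), run_bytecode(program))
```

Both evaluators compute the same value from the same information; the difference is recursion over nodes versus a loop over a flat list.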
As I'm sure you're aware, bytecode interpretation typically implies a better-performing model than AST interpretation, and compiling into bytecode produces a lot of opportunities for optimization that are not typically feasible when working with an AST directly. Of course it's all bits and anything is possible, but it's assumed to be a better approach in a generally non-subtle way.
To clarify my comment, I did mean bytecode interpreter.

This is a common implementation approach: parse the source to generate an AST, transform the AST to bytecode, then interpret the bytecode. It's still interpretation, and it is slow. Contrast that with JIT engines, which transform the intermediate code (whether that's AST or bytecode) to machine code, and are fast.
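The three stages can be driven by hand in CPython with `ast.parse`, `compile`, and `exec`. This is only a sketch of the pipeline's shape; the one-line script and the `<demo>` filename are made up.

```python
import ast

source = "total = sum(range(5))"  # illustrative one-line script

# Stage 1: parse the source text into an AST.
tree = ast.parse(source, filename="<demo>")

# Stage 2: compile the AST into a code object holding bytecode.
code = compile(tree, "<demo>", "exec")

# Stage 3: hand the bytecode to the interpreter loop.
namespace = {}
exec(code, namespace)
print(type(tree).__name__, type(code).__name__, namespace["total"])
```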

> someone correct me if I'm wrong but this would apply to well known interpreted languages like Perl 5

Perl uses the same execution method you describe for CPython.