As far as I remember, PyPy uses a Python interpreter written in RPython that's being specialized with respect to the actual Python code to be executed, with the residual program implementing the semantics of that one Python program that's been fed into it.
I see. This doesn't look like a Futamura projection to me. What PyPy does is run a Python program under its own RPython-based interpreter. Then it uses a tracing JIT to JIT the RPython-based interpreter. There is no PE going on. No residual specialized program is created.
The first Futamura projection would be if PyPy specialized the interpreter with respect to the Python program, yielding an executable.
No, you don't get that. A tracing JIT is not capable of producing an executable like that under normal circumstances. If a certain path is not taken during execution, it might not be compiled at all. The point of the first futurama projection is that you get a fully runnable executable that is semantically equivalent to running the original program in the interpreter. A JIT only produces code for what it sees during execution. I guess it might be possible if you carefully run your program with inputs that exhaust every possible path.
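To make the difference concrete, here's a toy sketch. Everything in it is made up for illustration (the tuple-based expression language, `interpret`, `specialize`, the guard names); it has nothing to do with PyPy's actual internals, and `specialize` is hand-written for this one interpreter rather than being a general partial evaluator. It only shows the shape of the two outcomes: a residual program that covers every branch versus a trace that covers only the path one run happened to take.

```python
# Toy illustration only: a made-up expression language, not PyPy's machinery.
# Programs are nested tuples: ("const", n), ("var",), ("add", a, b),
# ("mul", a, b), ("if_pos", cond, then, otherwise).

def interpret(program, x, trace=None):
    """Evaluate `program` at input x. If `trace` is a list, also record
    the operations that actually ran, roughly as a tracer would."""
    op = program[0]
    if op == "const":
        return program[1]
    if op == "var":
        return x
    if op in ("add", "mul"):
        left = interpret(program[1], x, trace)
        right = interpret(program[2], x, trace)
        if trace is not None:
            trace.append(op)
        return left + right if op == "add" else left * right
    if op == "if_pos":
        taken = interpret(program[1], x, trace) > 0
        if trace is not None:
            # A real trace would guard on the branch direction it observed.
            trace.append(f"guard_{'pos' if taken else 'nonpos'}")
        return interpret(program[2] if taken else program[3], x, trace)
    raise ValueError(f"unknown op: {op!r}")


def specialize(program):
    """Specialize the interpreter to one program: emit residual Python
    source in which all dispatch on the program tree is already done."""
    op = program[0]
    if op == "const":
        return repr(program[1])
    if op == "var":
        return "x"
    if op in ("add", "mul"):
        sign = "+" if op == "add" else "*"
        return f"({specialize(program[1])} {sign} {specialize(program[2])})"
    if op == "if_pos":
        cond, then, other = (specialize(p) for p in program[1:])
        return f"({then} if {cond} > 0 else {other})"
    raise ValueError(f"unknown op: {op!r}")


# 2*x when x > 0, otherwise x - 1
PROGRAM = ("if_pos", ("var",),
           ("mul", ("var",), ("const", 2)),
           ("add", ("var",), ("const", -1)))

# Residual program: stands in for "the executable"; both branches are present.
SOURCE = f"lambda x: {specialize(PROGRAM)}"
RESIDUAL = eval(SOURCE)

# Trace of a single run with a positive input: the "add" branch never shows up.
TRACE = []
interpret(PROGRAM, 3, TRACE)
```

`RESIDUAL` agrees with `interpret(PROGRAM, x)` for any `x` without ever looking at `PROGRAM` again, whereas `TRACE` knows nothing about the negative branch until some run goes down it.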
Clearly PyPy has to produce executable code if it wants to jump into it. The CPU wouldn't understand if it were asked to jump into something that isn't executable code.
> The point of the first futurama projection
...was to delight viewers with what would turn out to be a wonderful pilot episode?
You're right, PyPy _used_ to base its wizardry on PE (I don't know if it's Futurama or not, that's honestly the first time I've heard that term), but now they are using something called a meta-tracing JIT, where, instead of JIT-tracing the program that your language's source describes, they JIT-trace your interpreter while it's running your language's source.
The extremely cool and awesome thing about this is that it is effectively a general-purpose JIT, one JIT to rule all interpreted languages that could ever be written. There is nothing specific about Python in the toolchain. For _any_ interpreted language:
- You write only your naive-but-readable interpreter in RPython, a restricted subset of Python that tries to preserve the readability but ditch the dynamic madness. (This is not Python, this is an entirely different language. It just happens that every valid RPython program is also a valid Python program. There is nothing special about RPython here either: they could theoretically have picked any readable language for you to write your naive interpreter in, but they chose RPython.)
- The compilation pipeline produces two things: an executable image of your naive interpreter*, and a bytecode image for the general-purpose JITer.
- Normally, it's the executable image of your interpreter that runs your language's programs, but once it detects a user-program-level loop (e.g. because it has encountered a backward jump), it invokes the supporting runtime (the general-purpose JITer) and delegates to the bytecode version of itself.
- The GP JITer starts tracing the bytecode image of your interpreter (which, remember, is itself executing the user-level program the whole time). Once it detects that the user-level loop is done, it says so. Now the general-purpose JITer has a record of all the operations that your interpreter executed while it was running the user-level path, which is the same as {all the operations that the user-level path executed} (minus all the interpreter-specific operations, which the GP JITer also knows about because this info is contained in the bytecode).
- The GP JITer treats the execution record as any other JIT would: it produces an optimised native version from it, and bingo! You've got yourself a native image of that user-level loop.
- The original interpreter, the executable, now comes back into the picture. It puts that native version of the loop in its pocket, ready for the next time it encounters the loop.
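If it helps, the steps above can be squeezed into a toy sketch. To be clear about what's assumed: the three-instruction register VM, and the names `step`, `record`, `compile_trace` and `run`, are all invented for this example and are not RPython APIs. The real toolchain traces the interpreter automatically from its bytecode image and emits machine code; here the interpreter logs its own operations by hand, and "native code" is just a generated Python function.

```python
# Toy sketch only: a made-up 3-instruction VM, not RPython's real machinery.

def step(program, regs, pc, trace=None):
    """Execute one user-level instruction. If `trace` is a list, also log the
    work that was done, without the interpreter's own dispatch on `op`."""
    instr = program[pc]
    op = instr[0]
    if op == "add":
        _, dst, src = instr
        regs[dst] += regs[src]
        if trace is not None:
            trace.append(f"regs[{dst!r}] += regs[{src!r}]")
        return pc + 1
    if op == "dec":
        _, reg = instr
        regs[reg] -= 1
        if trace is not None:
            trace.append(f"regs[{reg!r}] -= 1")
        return pc + 1
    if op == "jump_if_pos":
        _, reg, target = instr
        taken = regs[reg] > 0
        if trace is not None:
            # Guard: bail out to the interpreter if the branch goes the other way.
            cond = f"regs[{reg!r}] > 0"
            trace.append(f"if not ({cond}): return {pc + 1}" if taken
                         else f"if {cond}: return {target}")
        return target if taken else pc + 1
    raise ValueError(f"unknown op: {op!r}")


def record(program, regs, header):
    """Trace one iteration of the user-level loop that starts at `header`."""
    trace, pc = [], header
    while pc < len(program):
        pc = step(program, regs, pc, trace)
        if pc == header:
            return trace, pc      # the loop closed: the trace is usable
    return None, pc               # fell out of the loop while tracing


def compile_trace(trace):
    """Stand-in for the JIT backend: turn the trace into a Python function
    that loops until a guard fails, then returns the pc to resume at."""
    body = "\n".join(f"        {line}" for line in trace)
    namespace = {}
    exec(f"def loop(regs):\n    while True:\n{body}\n", namespace)
    return namespace["loop"]


def run(program, regs):
    compiled, traces, pc = {}, {}, 0
    while pc < len(program):
        if pc in compiled:                    # loop already in our pocket
            pc = compiled[pc](regs)
            continue
        next_pc = step(program, regs, pc)
        if next_pc <= pc and next_pc not in compiled:   # backward jump seen
            trace, next_pc = record(program, regs, next_pc)
            if trace is not None:
                traces[next_pc] = trace
                compiled[next_pc] = compile_trace(trace)
        pc = next_pc
    return regs, traces


PROGRAM = [
    ("add", "total", "i"),       # 0: total += i
    ("dec", "i"),                # 1: i -= 1
    ("jump_if_pos", "i", 0),     # 2: backward jump closes the user-level loop
]
DEMO_REGS, DEMO_TRACES = run(PROGRAM, {"total": 0, "i": 4})
```

Note that nothing in `record`, `compile_trace` or `run` knows what the instructions mean; only `step` does. That's the (very loosely imitated) point of the meta-tracing design: swap in a different interpreter and the tracing side stays the same.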
It's so f*ing cool, that's why their logo is a snake eating itself: there are so many meta shenanigans going on. Their implementation of Python is merely the application; it's the amazing toolchain they built to build it that is the real treasure.
*: One of the steps in creating the executable is, I kid you not, running the standard CPython interpreter on your RPython source (as it's valid Python), waiting for the interpreter to do its expensive startup, then freezing the whole environment it produced to package it with the executable. This couldn't be done to speed up normal Python because its extremely dynamic nature messes with this.*
Besides, not every PE instance is a Futamura projection.