by rowanG077 1688 days ago
Damn, I didn't know any of the Futamura projections were actually implemented in practice.
4 comments

For some reason I read this as Futurama projections.
Sounds to me like PyPy has been using Futamura projections for a decade or so.
Can you link me to some sources? I'm not familiar with PyPy, but I thought it was a normal tracing JIT. In fact, a quick search turns up a blog post explicitly saying PyPy does NOT use PE: https://www.pypy.org/posts/2018/09/the-first-15-years-of-pyp....

Besides, not every instance of PE is a Futamura projection.

https://gist.github.com/tomykaira/3159910

This gist talks specifically about PyPy using the Futamura projection.

The linked text is pretty bad and contains many mistakes. Please see the two comments at the end of the page.

For a better overview on what the projections are, please see http://blog.sigfpe.com/2009/05/three-projections-of-doctor-f... , or the original paper by Futamura.

Thank you, this is super helpful -- I didn't look closely enough to spot that myself. :)
As far as I remember, PyPy uses a Python interpreter written in RPython that's being specialized with respect to the actual Python code to be executed, with the residual program implementing the semantics of that one Python program that's been fed into it.
PyPy is two projects.

1. RPython + the PyPy _compiler_ which is a compiler for JIT compilers (like GraalVM as I understand it)

2. An implementation of the Python language _using_ RPython to produce a JIT for Python scripts.

There are other languages _using_ RPython + PyPy compiler to produce JIT compilers for languages other than Python too.

https://doc.pypy.org/en/latest/architecture.html#layers

I see. This doesn't look like a Futamura projection to me. What PyPy does is run a Python program under their own RPython-based interpreter. Then it uses a tracing JIT to JIT the RPython-based interpreter. There is no PE going on. No residual specialized program is created.

The first Futamura projection would be if PyPy specialized the interpreter with respect to the Python program, yielding an executable.
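To make "specialize the interpreter with respect to the program" concrete, here is a minimal sketch. Everything in it is hypothetical (`interpret`, `specialize`, `PROGRAM`, the four-instruction stack language), and the specializer is hand-written for this one interpreter, whereas a real partial evaluator would be generic. It is not how PyPy works; it only shows what a residual program looks like.

```python
# Toy stack language: ("arg",) pushes the input, ("push", n) pushes a constant,
# ("add",) and ("mul",) combine the top two stack entries.
PROGRAM = [("arg",), ("push", 2), ("mul",), ("push", 1), ("add",)]  # x * 2 + 1


def interpret(program, x):
    """Plain interpreter: program and input are both only known at run time."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "arg":
            stack.append(x)
        elif op in ("add", "mul"):
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == "add" else a * b)
        else:
            raise ValueError(op)
    return stack.pop()


def specialize(program):
    """Hand-written specializer: the program is static, the input x is dynamic.

    The dispatch loop runs now, at specialization time; only the arithmetic
    on x is left behind in the residual source.
    """
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(repr(args[0]))
        elif op == "arg":
            stack.append("x")
        elif op in ("add", "mul"):
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a} {'+' if op == 'add' else '*'} {b})")
        else:
            raise ValueError(op)
    source = f"def residual(x):\n    return {stack.pop()}\n"
    namespace = {}
    exec(source, namespace)  # turn the residual source into a callable
    return namespace["residual"], source


residual, source = specialize(PROGRAM)
print(source)
```

The residual function covers every input to `PROGRAM` and contains no trace of the instruction dispatch, which is the property the first projection is about.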

That's exactly what PyPy does do. You get a specialized version of the interpreter which is specialized to the particular program.
No, you don't get that. A tracing JIT is not capable of producing an executable like that under normal circumstances. If a certain path is not taken during execution, it might not be compiled at all. The point of the first Futamura projection is that you get a fully runnable executable that is semantically equivalent to running the original program in the interpreter. A JIT only produces what it sees during execution. I guess it might be possible if you carefully run your program with inputs that exhaust every possible path.
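A toy illustration of "a JIT only produces what it sees", assuming a tracer built on operator overloading. All names here (`Traced`, `record`, `compile_recorded`, `program`) are made up for the sketch, and it only handles a single value flowing through a straight-line trace. A real tracing JIT would fall back to the interpreter on a failed guard instead of raising.

```python
class GuardFailed(Exception):
    """The compiled trace met a path that was never recorded."""


class Traced:
    """Wraps a concrete value and logs every operation performed on it."""

    def __init__(self, value, log):
        self.value, self.log = value, log

    def __lt__(self, other):
        outcome = self.value < other
        self.log.append(("guard_lt", other, outcome))  # remember which way we went
        return outcome

    def __neg__(self):
        self.log.append(("neg",))
        return Traced(-self.value, self.log)

    def __mul__(self, other):
        self.log.append(("mul", other))
        return Traced(self.value * other, self.log)


def program(x):
    # Two paths; any single run only ever sees one of them.
    if x < 0:
        return -x * 2
    return x * 2


def record(fn, x):
    """Run fn once on a concrete input and return the recorded trace."""
    log = []
    fn(Traced(x, log))
    return log


def compile_recorded(log):
    """Replay the trace on new inputs; guards protect the recorded path."""
    def compiled(x):
        for step in log:
            if step[0] == "guard_lt":
                if (x < step[1]) != step[2]:
                    raise GuardFailed(x)
            elif step[0] == "neg":
                x = -x
            elif step[0] == "mul":
                x = x * step[1]
        return x
    return compiled


fast = compile_recorded(record(program, 3))  # only the x >= 0 path was seen
```

Calling `fast(5)` works, while `fast(-4)` raises `GuardFailed`, because the negative branch was never executed during recording and so was never compiled.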
You're right, PyPy _used_ to base its wizardry on PE (I don't know if it's Futamura or not, that's honestly the first time I've heard that term), but now they use something called a meta-tracing JIT, where, instead of JIT-tracing the program that your language's source describes, they JIT-trace your interpreter while it's running your language's source.

The extremely cool and awesome thing about this is that it is effectively a general-purpose JIT, one JIT to rule all interpreted languages that could ever be written. There is nothing specific to Python in the toolchain. For _any_ interpreted language:

- You write only your naive-but-readable interpreter in RPython, a restricted subset of Python that tries to preserve the readability but ditch the dynamic madness. (This is not Python, this is an entirely different language. It just happens that every valid RPython program is also a valid Python program. There is nothing special about RPython here either: they could theoretically have picked any readable language for you to write your naive interpreter in, but they chose RPython.)

- The compilation pipeline produces two things: an executable image of your naive interpreter*, and a bytecode image for the general-purpose JITer.

- Normally, it's the executable image of your interpreter that runs your language's programs, but once it detects a user-program-level loop (e.g. because it has encountered a backward jump), it invokes the supporting runtime (the general-purpose JITer) and delegates to the bytecode version of itself.

- The GP JITer starts tracing the bytecode image of your interpreter (which, remember, is itself executing the user-level program the whole time). Once it detects that the user-level loop is done, it says so. Now the general-purpose JITer has a record of all the operations that your interpreter executed while it was running the user-level path, which is the same as {all the operations that the user-level path executed} (minus all the interpreter-specific operations, which the GP JITer also knows about because this info is contained in the bytecode).

- The GP JITer treats the execution record as any other JIT would: it produces an optimised native version from it, and bingo! You've got yourself a native image of that user-level loop.

- The original interpreter, the executable, now comes back into the picture. It puts that native version of the loop in its pocket, ready for the next time it encounters the loop.
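The control flow in the steps above can be sketched with a toy interpreter. This is a simplification under loud assumptions: the bytecode, `step`, `compile_trace` and `run` are all invented for illustration, "compiling" just means building a Python closure, and the trace is recorded at the level of user bytecodes, whereas RPython's meta-tracer records the lower-level operations of the interpreter itself.

```python
# Hypothetical user program: acc += n; n -= 1; repeat until n == 0.
PROGRAM = [
    ("jump_if_zero", "n", 4),  # 0: loop header, exit to 4 when n == 0
    ("add", "acc", "n"),       # 1
    ("dec", "n"),              # 2
    ("jump", 0),               # 3: backward jump, marks a loop
    ("halt",),                 # 4
]


def step(op, env, pc):
    """Execute one instruction and return the next pc."""
    name = op[0]
    if name == "add":
        env[op[1]] += env[op[2]]
    elif name == "dec":
        env[op[1]] -= 1
    elif name == "jump":
        return op[1]
    elif name == "jump_if_zero":
        return op[2] if env[op[1]] == 0 else pc + 1
    else:
        raise ValueError(name)
    return pc + 1


def compile_trace(trace):
    """Build a closure that repeats the recorded loop body until a guard fails."""
    def loop(env):
        while True:
            for kind, op, expected_zero, exit_pc in trace:
                if kind == "guard":
                    if (env[op[1]] == 0) != expected_zero:
                        return exit_pc  # leave the trace, resume interpreting here
                else:
                    step(op, env, 0)  # pc is irrelevant for non-jump ops
    return loop


def run(program, env):
    pc, traces, recording = 0, {}, None
    while program[pc][0] != "halt":
        if pc in traces and recording is None:
            pc = traces[pc](env)  # reuse the loop kept "in the pocket"
            continue
        op = program[pc]
        next_pc = step(op, env, pc)
        if recording is not None:
            header, trace = recording
            if op[0] == "jump_if_zero":
                took = next_pc == op[2]
                trace.append(("guard", op, took, pc + 1 if took else op[2]))
            elif op[0] != "jump":  # unconditional jumps vanish from the trace
                trace.append(("op", op, None, None))
            if next_pc == header:  # back at the loop header: trace is complete
                traces[header] = compile_trace(trace)
                recording = None
        elif op[0] == "jump" and next_pc < pc and next_pc not in traces:
            recording = (next_pc, [])  # backward jump seen: start recording
        pc = next_pc
    return env


result = run(PROGRAM, {"n": 10, "acc": 0})
```

In this run the first iteration is interpreted, the second is interpreted while being recorded, and the remaining iterations go through the closure built by `compile_trace`.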

It's so f*ing cool, that's why their logo is a snake eating itself: there are so many meta shenanigans going on. Their implementation of Python is merely the application; it's the amazing toolchain they built to build it that is the real treasure.

*: One of the steps in creating the executable is, I kid you not, running the standard CPython interpreter on your RPython source (as it's valid Python), waiting for the interpreter to do its expensive startup, then freezing the whole environment it produced to package it with the executable. This couldn't be done to speed up normal Python because its extremely dynamic nature messes with this.*

About Futamura Projections (for people like me that had never heard about it): https://gist.github.com/tomykaira/3159910