CFFI [1] is a step in the right direction, inspired by LuaJIT's FFI. It originated with the PyPy folks and is supported in PyPy [2]; Numba [3] also supports it to some degree. I don't know how well the JITs optimize C calls made through it, so I can't speak to performance, but I've used it casually from CPython and was impressed by the API. That said, it has been around for a while and traction seems somewhat limited, I would guess because most people who have this kind of problem also need more than "just" FFI.
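
To give a feel for the API, here is a minimal sketch of CFFI's ABI-level ("dlopen") mode from CPython. It assumes the cffi package is installed and a Unix-like system, since ffi.dlopen(None) is not expected to work on Windows; c_strlen is just a hypothetical helper name for illustration.

```python
from cffi import FFI

ffi = FFI()

# Declare the C function we want to call, using plain C syntax.
ffi.cdef("size_t strlen(const char *s);")

# Passing None opens the standard C library namespace (Unix-like systems only).
libc = ffi.dlopen(None)


def c_strlen(data: bytes) -> int:
    """Hypothetical helper: call libc's strlen on a Python bytes object."""
    # CFFI accepts bytes for a `const char *` argument and passes it
    # NUL-terminated; the size_t result comes back as a Python int.
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"hello"))
```

The declaration is pasted in as C rather than rebuilt from Python type objects, which is most of what makes the API pleasant. CFFI also has an API-level mode that compiles a small extension module; that is not shown here.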