I'm curious if they tried Cython. I've read that you can get up to a 35% speedup just by compiling Python code, and that you can add static types in a .pyx file to get near-C performance.
I've never tried it myself, but this would have been a nice use case.
We did. We used Cython where we could, but Cython doesn't work with Twisted (or at least with inline deferreds, which we had everywhere) because of different generator semantics. Edit: in fact, our internal protobuf lib was all generated Cython. We had code-gen inception.
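For anyone wondering how "different generator semantics" can bite, here is a rough sketch in plain Python. It is not Twisted's code, and I'm not claiming it is the exact failure we hit. `run_inline` and `LooksLikeGenerator` are made-up names for illustration. The idea is that a decorator driving a generator may insist on the interpreter's built-in generator type, so a generator-like object produced by a compiler can be rejected even though it supports `send()`.

```python
import types


def run_inline(gen_func):
    """Hypothetical generator-driving decorator (illustration only, not Twisted)."""
    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        # A strict check like this rejects anything that is not the
        # interpreter's built-in generator type.
        if not isinstance(gen, types.GeneratorType):
            raise TypeError("expected a generator, got %s" % type(gen).__name__)
        result = None  # the first send() to a fresh generator must be None
        try:
            while True:
                # Resume the generator with the result of the previous step.
                yielded = gen.send(result)
                result = yielded() if callable(yielded) else yielded
        except StopIteration as stop:
            # The generator's return value travels on StopIteration.
            return stop.value
    return wrapper


@run_inline
def fetch_total():
    a = yield (lambda: 2)
    b = yield (lambda: 40)
    return a + b


class LooksLikeGenerator:
    """Supports iteration and send(), but is not a types.GeneratorType."""
    def __iter__(self):
        return self

    def __next__(self):
        raise StopIteration

    def send(self, value):
        raise StopIteration


@run_inline
def compiled_style():
    # Stand-in for a function whose "generator" comes from a compiler.
    return LooksLikeGenerator()
```

Calling `fetch_total()` runs the generator to completion, while `compiled_style()` raises `TypeError` before the first step, which is the kind of mismatch that makes "just compile it" hard when this pattern is all over a codebase.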