by rciorba
2139 days ago
I did. I inherited a legacy web app that did stupid things in Python in memory (basically search and aggregation). I realized a rewrite was the best course of action, but in the meantime the old thing had to stay up and running, and as the volume of data increased it started to run into HTTP timeouts, since requests often took longer than 2 minutes. I moved the thing to PyPy and got about a 30% speedup from that. Only one lib had to be replaced with a pure-Python alternative, as it was using a C extension. It bought me enough time to finish the new implementation (duplicate the data in Elasticsearch and, hey presto, results went from over a minute to about a second). For some workloads PyPy's JIT can do wonders.
I have document parsing and SPARQL queries that can take a few minutes each, and I'd like to run them frequently so I can keep all parts of the system up to date.
I've only benchmarked it a bit, but I got roughly the five-times speed-up that PyPy promised. This is with PyPy based on Python 3.6. I think PyPy is switching to cffi as the way to connect to C code, so most native code "just works" now.
I had to backport my code from Python 3.8. Python 3.6 lacks contextvars, but there is a polyfill for that; otherwise there was no problem.
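For anyone wondering what the contextvars part looks like, here is a minimal sketch. I'm assuming the polyfill is the `contextvars` backport package on PyPI, which installs under the same module name, so the import line is identical on 3.6 and 3.7+. The names `request_id`, `current_request_id` and `handle` are made up for illustration and are not from my actual code.

```python
# On Python 3.7+ this is the stdlib module; on 3.6 it is assumed to come
# from the `contextvars` backport package (same import name).
import contextvars

# Hypothetical context variable, purely for illustration.
request_id = contextvars.ContextVar("request_id", default=None)


def current_request_id():
    """Read the value visible in the current context."""
    return request_id.get()


def handle(rid):
    """Set the variable for the duration of one call, then restore it."""
    token = request_id.set(rid)
    try:
        return current_request_id()
    finally:
        # reset() puts back whatever value was there before set().
        request_id.reset(token)
```

Calling `handle("abc")` sees `"abc"` inside the call, and the variable is back to its default afterwards.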
I stayed away from PyPy for a long time because it was tied to Python 3.5, which was busted in various ways. One of those was that the filesystem path objects were half-implemented: you should have been able to pass them into anything from the stdlib that expected a string path, and at that time you couldn't. Little accidents like that can slow down the adoption of a technology like PyPy.
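The usual workaround for the path problem is to turn the path object into a plain string before handing it to an API that only takes strings. A rough sketch, where `as_str_path` is a hypothetical helper name and not something from the stdlib:

```python
import os
from pathlib import Path


def as_str_path(p):
    """Return a plain string for either a str or a pathlib.Path."""
    # os.fspath() was added in Python 3.6; on 3.5 it does not exist,
    # so fall back to str(), which also yields the path text for Path objects.
    fspath = getattr(os, "fspath", None)
    if fspath is not None:
        return fspath(p)
    return str(p)


# Build a path with pathlib, then convert it for a string-only API.
data_file = as_str_path(Path("data") / "input.txt")
```

From 3.6 on this is mostly unnecessary, since functions like `open()` and those in `os.path` accept path objects directly.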