Hacker News
by gamesbrainiac 2691 days ago
Anyone using pypy in production? If so, what are your experiences like?
4 comments

I did, but not for anything high-traffic.

At work we had this internal service that did search entirely in memory, using sorted lists. When it was initially written, years ago, that was deemed "good enough".

By the time I joined the company a little over a year ago, search requests were sometimes exceeding 2 minutes, and would be killed due to HTTP timeout (somehow users were OK with search taking a long time, they were just annoyed that now sometimes it didn't work at all). While I was working to rewrite the whole thing to push data to ES and do everything from there, the old system needed to keep running.

So I moved it over to PyPy. I did need to swap blist (a C extension) for sortedcontainers (pure Python), but that was a minimal change. Search got about 30% faster, enough to keep chugging along while the rewrite happened.
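
For anyone wondering what "search over sorted lists" can look like in pure Python, here is a minimal sketch using only the standard-library bisect module. It is not the service's actual code: prefix_search and the sample keys are made up for illustration, and the real system used blist and later sortedcontainers rather than a plain list.

```python
import bisect


def prefix_search(sorted_keys, prefix):
    """Return every key in sorted_keys that starts with prefix.

    Hypothetical helper: assumes sorted_keys is a list of strings
    already in ascending order.
    """
    # Binary search for the first position whose key is >= prefix.
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = lo
    # Matching keys sit next to each other in a sorted list,
    # so scan forward until a key no longer starts with prefix.
    while hi < len(sorted_keys) and sorted_keys[hi].startswith(prefix):
        hi += 1
    return sorted_keys[lo:hi]


keys = sorted(["pypy", "pytest", "python", "redshift", "django"])
print(prefix_search(keys, "py"))
```

Code like this is all plain Python loops and comparisons, with no C extension in the hot path, which is the kind of workload where PyPy's JIT tends to have room to help.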

The place where I work uses it for back-end report generation. In my experience, PyPy's claimed speed benefits are very realistic. For most of our code, we get about a 3x speedup. For some areas of code, we get up to a 7x speedup. The only compatibility issues we've run into stemmed from mistakes on our part (e.g. all dicts are effectively OrderedDicts in PyPy, but your code will break in CPython if you rely on that implementation detail).

As a side note, for some reason, it's terrible for pytest — your tests will take literally 100x as long. If you trust PyPy's compatibility, I'd recommend running your tests with CPython even if you run PyPy in production.
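
If you do split things that way (tests on CPython, production on PyPy), it can help to have the code report which interpreter it is actually running on. A small sketch using the standard-library platform module; the function names are my own and not part of any framework.

```python
import platform


def interpreter_name():
    """Return the implementation name, e.g. 'CPython' or 'PyPy'."""
    return platform.python_implementation()


def running_on_pypy():
    """True when the current interpreter identifies itself as PyPy."""
    return interpreter_name() == "PyPy"


print(interpreter_name(), running_on_pypy())
```

Logging this once at startup makes it obvious when a deploy or a test run ended up on a different interpreter than intended.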

> all dicts are effectively OrderedDicts in PyPy, but your code will break in CPython if you rely on that implementation detail

Actually, since CPython 3.6 the same is true in CPython too (as an implementation detail in 3.6, and as a language guarantee from 3.7), though I'd personally still advocate specifying OrderedDict if you particularly require that property anyway.
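
To illustrate the point about being explicit, here is a short sketch with collections.OrderedDict. The variable names are just for illustration. Besides documenting intent, OrderedDict also compares order-sensitively, which a plain dict does not.

```python
from collections import OrderedDict

# Same items, inserted in a different order.
first = OrderedDict([("x", 1), ("y", 2)])
second = OrderedDict([("y", 2), ("x", 1)])

# Iteration follows insertion order on every Python implementation.
keys_in_order = list(first)

# Two OrderedDicts are equal only if their order matches too.
ordered_equal = first == second

# Plain dicts ignore order when compared.
plain_equal = dict(first) == dict(second)

print(keys_in_order, ordered_equal, plain_equal)
```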

We used PyPy in production. It sped up a workload that loads JSON-per-line logs into Amazon Redshift by a factor of two, with no changes.

For what it's worth, I was using it for a Django project and it did speed things up noticeably, but I also ran into mysterious crashes. Now, I can't pin it on PyPy necessarily; it might have been some Django lib not playing ball with PyPy. I had no time to test it in depth because moving back to regular Python was "good enough".