Hacker News
PyPy v7.0.0: triple release of 2.7, 3.5 and 3.6-alpha (morepypy.blogspot.com)
198 points by rlamy 2691 days ago
11 comments

Great news, will have to give it a spin. A few of the Python 3.6 features were holding us back from deploying PyPy.
How's pypy with numpy these days? It's been a while since I had a look, and last time it was the big limiting factor for me.
numpy works on pypy these days. You have to install it via pip, and since it does not have wheels it takes a few minutes to build; otherwise it works. Numpy-heavy workloads will not be sped up by pypy, but depending on what exactly you are doing, other parts of your python program might be faster.

We are actively trying to secure funding and goals to make numpy fast now that it works, and we are working together with the numpy community to make sure it stays working.

IIRC, PyPy is actually slower on pure CFFI calls, so anything that is mostly Numpy/Pandas/Scikit will likely be slower in PyPy. At least this was the case when we were using PyPy 3.5.

But often the rest of your code base will experience dramatic increases in performance, and the penalty on already fast CFFI calls was not significant overall (in our use case).

I think you are confusing CFFI (which is massively faster under pypy) with CPython C API calls (which are both slower under pypy and the ones numpy uses).
Do you have a donate link? Would love to contribute.

Usually your donate links are mentioned on your blogspot and more than one person has told me that they are not sure this blogspot is authentic.

Do look into having a single static website with a blog - using something like Pelican (http://www.pelicanthemes.com)

Why does it matter if this blogspot is authentic? The "Donation page" link is to http://pypy.org/py3donate.html which is hosted under pypy.org.

The blog could be written by Tralfamadorian trolls, and it wouldn't matter - so long as you trusted that "pypy.org" was part of the PyPy project and that its connection to PayPal was valid.

I don't think your last comment is relevant. What I think you are saying is that the blog should be hosted under pypy.org and not blogspot. It could be WordPress or any other system.

It couldn't easily be Pelican because the blog supports comments, which means integrating with some other commenting system - plus authentication and spam detection. Why not get rid of the hassle of manually adding files and rebuilding the server and just use blogspot?

Lastly, are you really saying that people aren't donating to PyPy because some supportive blog site somewhere else, run by who knows who, isn't hosted on pypy.org?

Yeah - because people don't go to the homepage. What gets tweeted out is this blogspot page (including here on HN). So people check out the nice blog post, see a donate link on the left of https://morepypy.blogspot.com/ and then go ...hmm, looks shady.

Not many read the blog post, then go to Google, search for pypy and then click the link and then donate.

Fundraising is not an easy process - there is an aspect of sales here. Blogspot may be easy, but it's not good branding. And that was my point, because I want them to succeed.

Click donate here: http://pypy.org/
Will scikit learn and pandas work without issue as well? Keen to try PyPy once I can run a typical data science stack.
Yes, both scikit-learn and pandas should work with PyPy.
Thanks for all the work you guys are doing! PyPy has been a case study in learning about JITs and alternative Python implementations for me over the past few years.
I've attempted to run it (not with this latest release yet though) and it appeared as if calling a numpy function essentially disabled the JIT for that function, which was not acceptable for us at that point. What did seem to help curiously was wrapping the numpy function in a python function and then calling that, which seemed to prevent the JIT being disabled in the calling function.
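
A minimal sketch of the wrapping workaround described above. It only shows the shape of the code; it does not demonstrate or measure any JIT effect. The names `c_backed_sum` and `hot_loop` are hypothetical, and `math.fsum` is an assumption standing in for a numpy function so the sketch runs without numpy installed.

```python
import math


def c_backed_sum(values):
    # Thin pure-Python wrapper around a C-implemented call.
    # Assumption: math.fsum stands in for a numpy function here.
    return math.fsum(values)


def hot_loop(rows):
    # The calling function contains only Python-level work plus a call to
    # the wrapper, never a direct call into the C-implemented function.
    total = 0.0
    for row in rows:
        scaled = [x * 2.0 for x in row]
        total += c_backed_sum(scaled)
    return total
```

Whether this changes what gets compiled would have to be checked on the PyPy version in use, for example by timing `hot_loop` with and without the wrapper.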
Curious why? We use numpy in easily optimizable areas so there's very little python code, and JITing a call to a C library is pointless.
PyPy is fantastic. In my opinion, if they manage to be faster than CPython in all scenarios (I'm thinking of Numpy/Scikit/Pandas CPython C API abuse) and get full Windows support (they do not support multiprocess on Windows), they WILL be the preferred Python runtime.

Congratulations PyPy team for the great release!

I doubt they'll ever get even as fast as CPython for CPython API usage. What allows PyPy to run Python code fast is the same thing that makes the CPython API slow: The two implementations look nothing alike on the inside. The API is too leaky an abstraction to implement efficiently on anything but CPython.
While I agree with you, pypy already runs cffi fast and we are thinking about how to run cython fast. That would cover a large subset of C API usage, I think.
Awesome. And they're finally pushing pypy3 in their Ubuntu PPA too (https://launchpad.net/~pypy/+archive/ubuntu/ppa)!

I use pypy3 pretty heavily in some personal projects, and the install process has always been extremely annoying. This is great!

------

I wish they'd set up a CI-build PPA (or any install process that's not just "wget binary").

I actually looked at setting up my own ppa at one point, but it's apparently involved enough that I didn't have time.

If I was PyPy, I'd have dropped all 2.x effort and focused on Python 3.8 compatibility.
Actually, there's been quite some speculation that PyPy could inherit the 2.x crown when CPython expires its support for 2.x (1 Jan 2020). PyPy is probably the best non-CPython Python 2.7 implementation, so it's a natural fit to some degree. I suppose it's all dependent on where the PyPy team wants to spend its time and whether or not someone might fund them for this effort (Canonical, RH, etc).

I don't know that they have the will to bug-for-bug reproduce CPython but the C API emulation layer added within the last few years probably means that it will be really high fidelity.

Well, this whole Python 2 & 3 is a huge fiasco. Supporting 2.x keeps the circus forever on.
I don't know that I agree that it's a fiasco. I've been asked to work on a PDP11 in the last few years and while it was mind-boggling (the request -- I turned it down), it was understandable from a business perspective. So I can see how some businesses might want Just A Little While Longer on Python 2.x.

shrug, use python 3.x and don't sweat the fact that some folks are still stuck on 2.x. You certainly can't stop someone from offering support for 2.x -- the language is well defined and the reference implementation is open source. Anyone out there could put up a "Python 2.x Support: Cheap!" sign and be in business.

It surely is. There's a reason behind why support for old versions ends at some point.
Reinventing the wheel keeps the circus going forever. As long as python 2 is supported I don't have to touch any of the scripts I have to "maintain". The moment I am forced to migrate everything to Python 4 (they broke everything already, why not again) I will have to deal with a huge amount of pointless busywork just trying to restore functionality.
Python 4 will not "break everything again", it will be equivalent to a point release. It has been already explicitly mentioned.
Yeah, this is how the world moves forward - otherwise, we'd all still be watching VHS movies. Plus, there's this lovely tool called 2to3.
Funnily I can still watch VHS movies on a relatively new TV. Just had to plug in the VHS recorder and play, meanwhile the description of 2to3 makes it look like a plug and pray.

I mean really? A script that might mention some cases it can't fix as warnings. "optional" fixes that replace missing classes with incompatible replacements, giving you the "choice" of fixing the mess by hand. Also an apparently unstable API for your own fixes, doubling down on breaking your code while you try to fix your broken code has to be dedication.

> Note The lib2to3 API should be considered unstable and may change drastically in the future.

What's the problem to stay on the "just working" version of Python and not feel entitled to get updates?!
The only entitlement here comes from you.

If pypy wants to support 2, they can. If developers want to take advantage of that, they can. If library maintainers want to drop 2 support, they can. If others want to fork the libraries to provide security for the 2.x fork, they can.

What is not acceptable is trying to dictate what technology others choose to use to suit your personal preference.

I could kiss this comment. You don't have a right to stop other people using py2 just so your ecosystem is a little cleaner. Yet that level of entitlement is apparently not uncommon in Python discussions.
There are two things in play here - one is that pypy is a large py2-only codebase that is unlikely to be ported. The second is that maintaining python 2 does not cost much and it's used by the majority of our users, despite what you hear on the internet. So, there is both good reason to keep it and a large amount of work to drop it.
I might be mistaken, but it was my impression that the changes to the 2.7 version now are mostly "core" changes that also apply to PyPy3. Basically, most language-specific work is going toward Python 3, and PyPy2 benefits when features they have in common are improved.
It'll be interesting to see if there's an upswing in PyPy usage over the next few months, for the explicit reason that they're not slated to drop 2.7 compatibility. I would imagine a noticeable percentage of 2.7 codebases without the available resources for a rewrite (that has, by definition, no business value) would welcome anything close to a drop-in replacement. I'll be very interested to see if I'm wrong though!
What's the word on 3.7 support?
3.7 (or maybe even 3.8) support will start as soon as we can have a "final" 3.6 - which might be a couple of months from now. I think 3.7 won't be done in 2019, unless some magic source of funding pops up.
3.7 is a minor update from 3.6 that basically consists of changes to the standard library rather than the language, and I think 3.8 will be the same, so once 3.6 support is implemented it will take less time to implement 3.7 support.
Dataclasses are a sweet feature that I can see getting wide adoption pretty quickly. This said, they have a backport and it's mostly just sugar, so I expect one could probably get them working on 3.6 pypy.
http://attrs.org is the inspiration for dataclasses, but attrs has more features and is available on all versions.

http://www.attrs.org/en/stable/why.html#data-classes

Dataclasses work in 3.6 with pip pkg I believe.
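
For readers who have not used them, a small sketch of what dataclasses generate. It uses the standard-library `dataclasses` module (Python 3.7+; on 3.6 the thread suggests the pip backport). The `Release` class and its fields are hypothetical, made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Release:
    # The decorator generates __init__, __repr__ and __eq__ from these
    # annotated fields, which is the "sugar" mentioned in the comments above.
    name: str
    python_versions: list = field(default_factory=list)  # fresh list per instance
    alpha: bool = False


r = Release("PyPy v7.0.0", ["2.7", "3.5", "3.6"])
```

Instances compare by field values, so `r == Release("PyPy v7.0.0", ["2.7", "3.5", "3.6"])` holds without a hand-written `__eq__`.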
How are Google or Dropbox or Instagram not completely funding you?
Google and Dropbox have been migrating from Python to Go to deal with Python's performance problems instead of trying to make Python run faster.

https://blog.pyston.org/2017/01/31/pyston-0-6-1-released-and...

https://opensource.googleblog.com/2017/01/grumpy-go-running-...

The main way big corpos sponsor open source is typically by hiring the developers. That comes with a whole lot of things that some people don't want to deal with, like changing your place of residence.
Zurich or Munich shouldn't be too bad, should they?
Why not Cape Town?
I guess they aren't the ones that are using 3.7 anyway.
Does Google use PyPy at all? I wasn't aware.
maybe they could if they funded it to do what they need it to do
How realistic is it for an experienced programmer to dedicate 4 hours a week and join as a contributor? Not sure it is enough to bootstrap enough "project" domain experience to be productive. Asking for a friend.
There are some “easy” tasks to achieve 3.6 full compatibility, tell your friend to try it and see. We are around on #pypy on IRC, mostly European timezones.
Anyone know if PyPy works with deep learning libraries like pytorch/tensorflow, or if there are plans to make it work? Not looking for numerical speed ups, but for speed ups in preprocessing code.
I don't have a suitable workload with me right now. Has anyone tested the performance of PyPy3 vs PyPy2? Last time I tested it on a problem, PyPy2 was significantly faster (at least x3). That was maybe one or two years ago, when it was in beta.
As usual it depends. But pypy3.5-6.0.0 is fast enough for me that I now prefer it over pypy2.7-6.0.0.
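
For anyone who wants to compare interpreters themselves, a rough sketch of a micro-benchmark that should run unchanged under CPython, PyPy2 and PyPy3. The `workload` function is a hypothetical stand-in: a tight arithmetic loop says little about a real application, so substitute code that resembles your own.

```python
import platform
import timeit


def workload(n=200000):
    # Pure-Python arithmetic loop; replace with something representative.
    total = 0
    for i in range(n):
        total += (i * i) % 7
    return total


def best_time(repeat=5, number=10):
    # Best total time in seconds for `number` calls, over `repeat` runs.
    # Taking the minimum reduces noise from other processes.
    return min(timeit.repeat(workload, repeat=repeat, number=number))


if __name__ == "__main__":
    print("%s %s" % (platform.python_implementation(), platform.python_version()))
    print("best of 5 runs: %.4f s" % best_time())
```

Run the same file with each interpreter and compare the printed times. Very short runs understate a JIT's advantage because warm-up is included in the measurement.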
Anyone using pypy in production? If so, what are your experiences like?
I did, but not for anything high traffic.

At work we had this internal service that did search entirely in memory, using sorted lists. When it was initially written, years ago, that was deemed "good enough".

By the time I joined the company a little over a year ago, search requests were sometimes exceeding 2 minutes, and would be killed due to HTTP timeout (somehow users were OK with search taking a long time, they were just annoyed that now sometimes it didn't work at all). While I was working to rewrite the whole thing to push data to ES and do everything from there, the old system needed to keep running.

So I moved it over to PyPy. I did need to swap blist(C extension) for sortedcontainers (pure python), but that was a minimal change. Search got about 30% faster, enough to keep chugging along while the rewrite happened.
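
To illustrate the kind of pure-Python sorted-list search that benefits from a JIT, here is a minimal sketch. It is an assumption-laden illustration, not the service described above: it uses the standard-library `bisect` module rather than sortedcontainers, and the `SortedKeys` class is hypothetical.

```python
import bisect


class SortedKeys:
    """Keeps strings in sorted order and answers prefix queries."""

    def __init__(self, items=()):
        self._items = sorted(items)

    def add(self, item):
        # Binary search for the position, then a list insert.
        bisect.insort(self._items, item)

    def prefix_search(self, prefix):
        # Binary search for the first candidate, then scan forward while
        # items still start with the prefix.
        index = bisect.bisect_left(self._items, prefix)
        matches = []
        while index < len(self._items) and self._items[index].startswith(prefix):
            matches.append(self._items[index])
            index += 1
        return matches
```

Because everything here is plain Python, there is no C extension for PyPy to emulate, which is the reason such a swap can pay off.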

The place where I work uses it for back-end report generation. In my experience, PyPy's claimed speed benefits are very realistic. For most of our code, we get about a 3x speedup. For some areas of code, we get up to a 7x speedup. The only compatibility issues we've run into stemmed from mistakes on our part (e.g. all dicts are effectively OrderedDicts in PyPy, but your code will break in CPython if you rely on that implementation detail).

As a side note, for some reason, it's terrible for pytest — your tests will take literally 100x as long. If you trust PyPy's compatibility, I'd recommend running your tests with CPython even if you run PyPy in production.

> all dicts are effectively OrderedDicts in PyPy, but your code will break in CPython if you rely on that implementation detail

Actually, since CPython 3.6 the same is true in CPython too (as an implementation detail in 3.6, and as a language guarantee from 3.7), though I'd personally still advocate specifying OrderedDict if you particularly require that property anyway.
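
A short sketch of the difference between relying on dict ordering and stating it explicitly. The field names are hypothetical. The comparison against the plain dict assumes an interpreter where dicts preserve insertion order (PyPy, or CPython 3.7+).

```python
from collections import OrderedDict


def field_names(record):
    # Keys in iteration order; only meaningful if the mapping is ordered.
    return list(record)


# Explicit: the ordering requirement is visible in the type.
explicit = OrderedDict([("timestamp", 1), ("level", 2), ("message", 3)])

# Implicit: correct only where plain dicts preserve insertion order.
implicit = {"timestamp": 1, "level": 2, "message": 3}

# Unlike plain dicts, OrderedDict equality is sensitive to order.
reordered = OrderedDict([("message", 3), ("level", 2), ("timestamp", 1)])
```

Here `explicit == reordered` is false while `dict(explicit) == dict(reordered)` is true, which is one practical reason to spell out `OrderedDict` when order is part of the contract.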

We used PyPy in production. It sped up a workload that loads JSON-per-line logs into Amazon Redshift by a factor of two, with no changes.
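
A minimal sketch of a JSON-per-line parsing loop of the sort described, using only the standard library. The function name `parse_json_lines` is hypothetical, and the Redshift loading step is deliberately left out since the comment gives no details about it.

```python
import io
import json


def parse_json_lines(stream):
    # Yield one decoded object per non-empty line of a text stream.
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except ValueError as exc:
            # Report which line was malformed instead of a bare decode error.
            raise ValueError("line %d: %s" % (line_number, exc))


sample = io.StringIO(u'{"a": 1}\n\n{"b": [2, 3]}\n')
records = list(parse_json_lines(sample))
```

Per-record Python-level work (stripping, decoding, reshaping) is the part a JIT can speed up; time spent waiting on the network or the database would not change.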
For what it's worth, I was using it for a Django project and it did speed up noticeably - but I also ran into mysterious crashes. Now, I can't pin it to pypy necessarily, might have been some Django lib not playing ball with pypy. Had no time to test it in depth because moving back to regular Python was "good enough".
I wish PySide (and/or PySide2) would work with PyPy. It looks like (after a cursory search) some version of Gtk can be used but I prefer Qt over Gtk.
For gtk, gstreamer etc. PyGObject should just work. It has PyPy as part of its CI pipeline. Also in combination with matplotlib (GTK3Cairo backend) and numpy.

Not sure about PySide2..

Thanks, but I mentioned I don't want to use Gtk. I've not had a particularly joyous time with Gtk in the past and therefore chose Qt as my GUI platform of choice over Gtk.

Therefore until and unless PyPy has usable Qt bindings (e.g. PySide or PyQt), then unfortunately I need to stick with CPython :/

Yeah, sorry. It shows at least that it's possible with the current PyPy C API layer to have GUI bindings working.

Here is the only PyPy related issue I found for PySide2: https://bugreports.qt.io/browse/PYSIDE-535

PyQt doesn't have a bug tracker ¯\_(ツ)_/¯

Cool. Does anybody have an idea how a Django or Flask web app can benefit from an overall performance perspective?
I personally think that a Rust port of CPython has bigger potential than PyPy.
Why? Isn't the clever bit the JIT, rather than the implementation language?
Why do you think so? Rust is an interesting language in itself, but I don't see why you'd expect a Rust port of CPython to compete with PyPy in terms of performance. It's still, at its core, the CPython architecture.
Does it aim for security and correctness? An interpreter has a hard time keeping up with a JIT in the speed race, which I believe is the major driving force behind PyPy.