by treo 5517 days ago
That 3x speedup is about the same as what I have seen with my code. I'm currently writing a database cache simulator to try out different algorithms, and if I want anywhere near realistic results I have to use realistic access traces.

Tried it today with a TPC-C trace which has about 500 million accesses. The result: CPython would have run for about 90 minutes (I stopped it after 30 minutes and began looking for a faster option); PyPy took only 22 minutes.
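A minimal sketch of what such a trace-driven cache simulator might look like. This is not the commenter's code: the function names, the choice of LRU and FIFO as the policies to compare, and the in-memory list standing in for a trace are all assumptions made for illustration. It assumes Python 3 (for `OrderedDict.move_to_end`).

```python
from collections import OrderedDict, deque


def simulate_lru(trace, capacity):
    """Replay page ids through an LRU cache and return (hits, misses)."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = misses = 0
    for page in trace:
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[page] = True
    return hits, misses


def simulate_fifo(trace, capacity):
    """Replay page ids through a FIFO cache and return (hits, misses)."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    queue = deque()   # insertion order, oldest on the left
    resident = set()  # fast membership test
    hits = misses = 0
    for page in trace:
        if page in resident:
            hits += 1  # FIFO does not reorder on a hit
        else:
            misses += 1
            if len(queue) >= capacity:
                resident.discard(queue.popleft())  # evict oldest entry
            queue.append(page)
            resident.add(page)
    return hits, misses


if __name__ == "__main__":
    # Hypothetical toy trace; a real run would stream page ids from a trace file.
    toy_trace = [1, 2, 3, 1, 4, 2]
    for name, policy in (("LRU", simulate_lru), ("FIFO", simulate_fifo)):
        hits, misses = policy(toy_trace, capacity=3)
        print(name, "hits:", hits, "misses:", misses)
```

The inner loop is plain dictionary and set work executed once per access, which is the kind of pure-Python hot loop where interpreter speed dominates total run time.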

2 comments

I've gotten about a 10x speedup on numerics code where the calculations involve so much branching that I can't afford to use NumPy.
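To illustrate what "so much branching" can mean, here is a hypothetical example (not the commenter's code): each iteration's work depends on a data-dependent branch, so it does not map cleanly onto whole-array NumPy operations and ends up as a plain Python loop.

```python
def collatz_steps(n):
    """Count the Collatz steps needed to reach 1 from n (n >= 1)."""
    steps = 0
    while n != 1:
        if n % 2 == 0:  # the branch taken depends on the current value
            n //= 2
        else:
            n = 3 * n + 1
        steps += 1
    return steps


def longest_chain(limit):
    """Return (n, steps) for the longest chain among 1 <= n < limit."""
    best_n, best_steps = 1, 0
    for n in range(1, limit):
        steps = collatz_steps(n)
        if steps > best_steps:
            best_n, best_steps = n, steps
    return best_n, best_steps


if __name__ == "__main__":
    print(longest_chain(10))
```

Any speedup from running a loop like this under PyPy rather than CPython will depend on the workload, so timing it on your own code is the only reliable check.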

As for me, the main reason I haven't moved to PyPy yet is the lack of database and messaging support.

Which databases? At the moment we have SQLite, Oracle (haven't tested it myself), and PostgreSQL. Plus whatever you can find a pure Python driver for. Also, what do you mean by messaging?
Oh wow, I didn't realize that PostgreSQL was working on PyPy. I heard that Django was only tested with SQLite, so I made my assumptions from there.

By messaging, I mean something like RabbitMQ, so that I can do batch scheduling at a somewhat more sophisticated grain than "run a cronjob".

psycopg2 is implemented in a fork of mine: http://bitbucket.org/alex_gaynor/pypy-postgresql/. It requires compiling yourself, but works nicely (I was told by someone that this brought their script's time from 2 minutes to 8 seconds). As of the last test it passes all Django tests. What's the current standard RabbitMQ lib? I didn't realize it was a C extension (hell, I've used it myself and never noticed).
Well, the most used one is Celery. It depends on multiprocessing, which blew up on me the last time I tried it in PyPy.

But.... I just tried "import multiprocessing" in PyPy 1.5 and it worked! Is this all part of the C-API compatibility layer? Does that mean Cython code may soon work in PyPy too (that's my pony feature)?

RabbitMQ should work under PyPy currently then; all of its dependencies purport to be pure Python.

Another RabbitMQ lib is Rabbitmq-c, which is a direct wrapper around librabbitmq-c. It ekes out extra performance vs pure Python rabbitmq, but mostly it isn't needed.

Nope, multiprocessing was added to the Python standard library in 2.6. Our previous releases implemented Python 2.5; PyPy 1.5 implements 2.7, so it now includes multiprocessing.
There's also MySQL via PyMySQL
MySQLdb also works: https://bitbucket.org/pypy/compatibility/wiki/mysql-python

I just compiled it today and it works.

No, tpc-uva is a bit more than what I need right now. I might use it later on, once I have settled on a single algorithm that I want to test in a more realistic environment. Because changing the caching algorithm that Postgres uses isn't as easy as doing so in a standalone Python simulator, I will have to be sure that I want to do that. I have tried it before, and it is a lot harder and takes a lot more time.