Optimized Python Docker Image (revsys.com)
59 points by webology 3121 days ago
7 comments

Binary docker image, with no (public) reproducible build system? No thanks.
Doh sorry that was an oversight on our part. Just pushed up the information and our benchmark runs here https://github.com/revsys/optimized-python-docker
Also, I wonder how many obscure bugs were introduced by using more aggressive compiler flags? The image has "lto" in the name, and at https://github.com/docker-library/python/issues/160 I see this comment:

> In the future the Python developers may decide to turn on further optimizations based on this argument, for example link-time optimizations (LTO), though they haven't worked out all the bugs for that one yet.

(Thanks mastax for finding the issue!)

So I see that profile-guided optimizations are being used.

Profile-guided optimization, in layman's terms, means that you run your code under a profiler for a while, see which parts (branches, functions, data structures, etc.) are used the most, and use this information to make a build of your code that takes the profiler's findings into account when optimizing.

So what does it mean? It means, basically, that revsys is publishing a python build that is optimized for their use-case. Which may or may not be your use-case. This is neither good nor bad.
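To make the idea concrete, here is a toy sketch in plain Python. It is only an analogy, not how a compiler implements PGO: the "program" tests a few predicates in order, a training run counts which branch is taken most often, and the "optimized build" simply reorders the tests so the hot branch comes first. All names here (`CHECKS`, `classify`, `collect_profile`, `optimize`, `total_cost`) and both workloads are hypothetical.

```python
from collections import Counter

# Hypothetical "program": classify a character by testing predicates in order.
CHECKS = [
    ("digit", str.isdigit),
    ("alpha", str.isalpha),
    ("space", str.isspace),
]

def classify(token, checks):
    """Return (label, number of predicates evaluated)."""
    for cost, (label, predicate) in enumerate(checks, start=1):
        if predicate(token):
            return label, cost
    return "other", len(checks)

def collect_profile(workload, checks):
    """Step 1: run a training workload and count how often each branch is taken."""
    return Counter(classify(tok, checks)[0] for tok in workload)

def optimize(checks, profile):
    """Step 2: rebuild the check order so the hottest branches are tested first."""
    return sorted(checks, key=lambda check: profile[check[0]], reverse=True)

def total_cost(workload, checks):
    """Total number of predicate evaluations needed for a workload."""
    return sum(classify(tok, checks)[1] for tok in workload)

# Training workload: mostly letters, a few spaces, no digits.
training = list("hello world ") * 10
tuned = optimize(CHECKS, collect_profile(training, CHECKS))

similar = list("more words here")   # resembles the training workload
different = list("1234567890")      # does not resemble it

print("similar:  ", total_cost(similar, CHECKS), "->", total_cost(similar, tuned))
print("different:", total_cost(different, CHECKS), "->", total_cost(different, tuned))
```

The tuned order needs fewer checks on the workload that looks like the training data and more on the one that does not, which is the trade-off in question: the gain depends on how closely the profiled workload matches yours.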

Still, the claim "up to 19% faster" is false in general (but true in a particular case -- their use case).

Just keep this in mind, because this python build might perform worse than a regular python build.

Actually I thought that as well when I first saw it was an option, but these numbers compare the official images to the images we built, using the standard Python benchmarks. They are not based on, say, some code we wrote or run in production.

We did 10 runs of each on a large AWS instance and averaged the results.
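For anyone who wants to repeat this kind of comparison on their own code, a minimal sketch of "N runs, averaged" using only the standard library could look like the following. This is not the script behind the published numbers. The helper names `mean_runtime` and `percent_faster` are made up for illustration, and the timings in the usage line are invented placeholder values.

```python
import statistics
import timeit

def mean_runtime(func, runs=10, number=1000):
    """Time `func` in `runs` separate repetitions and return the mean in seconds."""
    timings = timeit.repeat(func, repeat=runs, number=number)
    return statistics.mean(timings)

def percent_faster(baseline_seconds, candidate_seconds):
    """How much less time the candidate takes, as a percentage of the baseline."""
    return (baseline_seconds - candidate_seconds) / baseline_seconds * 100.0

# Example: time a small workload in the current interpreter.
elapsed = mean_runtime(lambda: sum(range(1000)), runs=10, number=100)
print(f"mean of 10 runs: {elapsed:.6f} s")

# Placeholder timings (not real measurements) for two hypothetical images.
print(f"{percent_faster(2.0, 1.6):.1f}% faster")
```

To compare two images you would run the same workload inside each container and feed the two means to `percent_faster`.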

The build info and benchmark results are up here: https://github.com/revsys/optimized-python-docker. If we’re wrong about something, please let us know.

However, I definitely agree that any optimization like this could have negative performance consequences for certain specific bits of code. But that’s likely true from one build to the next.

I've been using --enable-optimizations for a while, and built my own 3.6 images a couple of weeks back.

Here are the Dockerfiles and readme:

https://github.com/rcarmo/ubuntu-python
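If you are not sure whether the Python inside a given image was configured this way, one rough check is to look at the configure arguments the interpreter recorded at build time. This is a sketch, assuming a Unix-style build where `sysconfig` exposes `CONFIG_ARGS` (it may be missing on other platforms). The helper name `build_optimizations` is made up, and it only inspects the recorded flags, not the resulting binary.

```python
import sysconfig

def build_optimizations(config_args=None):
    """Report whether the PGO and LTO configure flags appear in the build arguments."""
    if config_args is None:
        # CONFIG_ARGS may be None on platforms that do not record it.
        config_args = sysconfig.get_config_var("CONFIG_ARGS") or ""
    return {
        "pgo": "--enable-optimizations" in config_args,
        "lto": "--with-lto" in config_args,
    }

print(build_optimizations())
```

Running it inside each container shows which flags that image's interpreter was configured with.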

Am I totally misunderstanding something, or is there no Dockerfile referenced? If so, is this just a binary blob I pull from a public registry?

I have some trust concerns about basing my infrastructure on something opaque.

Absolutely! Was just an oversight on our part, here is the Dockerfile and build info https://github.com/revsys/optimized-python-docker
Super fast, thanks!
This was just mentioned today at North Bay Python.

Another alternative for a quick speedup from changing your base image is to try out the pypy images.

Yeah I thought it would be nice to release this at North Bay today. We also released our new website redesign (and new infrastructure setup) last night so there is probably some dust on a few things we'll clean up over the next few days.
It looks very useful, thanks for the image and for sponsoring events like this!
Why aren't these just upstreamed onto the official Python binaries?
I think these optimizations are upstream, but they're not in the canonical 'python' docker image: https://github.com/docker-library/python/issues/160
Ah I see, that makes sense, thank you.
The 10%ish speed boost is all down to PGO, I'm assuming?
I don't think we ran benchmarks with LTO and PGO in isolation, so honestly I'm not sure exactly how much each contributes on each benchmark. PGO is definitely the bulk, however.