Optimized Python Docker Image


	Optimized Python Docker Image (revsys.com)
	59 points by webology 3121 days ago

7 comments

Daviey 3121 days ago

Binary docker image, with no (public) reproducible build system? No thanks.


frankwiles 3120 days ago

Doh, sorry, that was an oversight on our part. Just pushed up the information and our benchmark runs here: https://github.com/revsys/optimized-python-docker


greglindahl 3121 days ago

Also, I wonder how many obscure bugs were introduced by using more aggressive compiler flags? The image has "lto" in the name, and at https://github.com/docker-library/python/issues/160 I see this comment:

> In the future the Python developers may decide to turn on further optimizations based on this argument, for example link-time optimizations (LTO), though they haven't worked out all the bugs for that one yet.

(Thanks mastax for finding the issue!)


znpy 3120 days ago

So I see that profile-guided optimization is being used.

Profile-guided optimization, in layman's terms, means that you run your code under a profiler for a while, see which parts (branches, functions, data structures, etc.) are used the most, and use this information to make a build of your code that takes the profiler's findings into account when optimizing.

So what does it mean? It means, basically, that revsys is publishing a python build that is optimized for their use-case. Which may or may not be your use-case. This is neither good nor bad.

Still, the claim "up to 19% faster" is false in general (but true in a particular case -- their use case).

Just keep this in mind, because this python build might perform worse than a regular python build.


frankwiles 3120 days ago

Actually I thought that as well when I first saw it was an option, but these are numbers comparing the official images to these images we built, using the standard Python benchmarks. Not compared to, say, some code we wrote or run in production.

We did 10 runs of each on a large AWS instance and averaged the results.

The build info and benchmark results are up here: https://github.com/revsys/optimized-python-docker. If we’re wrong about something, please let us know.

However, definitely agree that any optimization like this could have negative performance consequences for certain specific bits of code. But that’s likely true from any one build to the next.
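
The comparison described above (several runs per image, averaged, then compared) can be sketched roughly as follows. This is only an illustration: the numbers in this thread came from the standard Python benchmarks, not from `timeit`, and the helper names `mean_runtime` and `percent_faster` are hypothetical. The definition of "percent faster" used here (time saved relative to the baseline) is also an assumption.

```python
import statistics
import timeit


def mean_runtime(func, runs=10, number=1000):
    """Average total time over `runs` independent repetitions.

    Each repetition calls `func` `number` times; timeit.repeat returns
    one total per repetition, and we take the arithmetic mean of those.
    """
    timings = timeit.repeat(func, repeat=runs, number=number)
    return statistics.mean(timings)


def percent_faster(baseline_seconds, optimized_seconds):
    """Time saved by the optimized build, as a percentage of the baseline."""
    return (baseline_seconds - optimized_seconds) / baseline_seconds * 100.0


if __name__ == "__main__":
    # Stand-in workload; run this same script under each image and
    # feed the two averages to percent_faster().
    avg = mean_runtime(lambda: sum(i * i for i in range(1000)))
    print(f"mean of 10 runs: {avg:.4f}s")
```

Running the same script inside the official image and the optimized image gives two averages that can be passed to `percent_faster`. A negative result would mean the optimized build was slower for that workload.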


rcarmo 3120 days ago

I've been using --enable-optimizations for a while, and built my own 3.6 images a couple of weeks back.

Here are the Dockerfiles and readme:

https://github.com/rcarmo/ubuntu-python
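
To check whether a given interpreter was configured with `--enable-optimizations`, one option is to look at the configure arguments recorded at build time. A minimal sketch, assuming a Unix build where `sysconfig` exposes `CONFIG_ARGS` (it can be `None` on other platforms); the helper name `build_flags` is hypothetical.

```python
import sysconfig


def build_flags(config_args=None):
    """Report which optimization flags appear in the configure arguments.

    When no string is passed, read the arguments recorded for the
    running interpreter.
    """
    if config_args is None:
        # CONFIG_ARGS is set on Unix builds; fall back to "" if missing.
        config_args = sysconfig.get_config_var("CONFIG_ARGS") or ""
    return {
        "pgo": "--enable-optimizations" in config_args,
        "lto": "--with-lto" in config_args,
    }


if __name__ == "__main__":
    print(build_flags())
```

Running this inside a container shows whether that image's `python` was configured with PGO and LTO, which is a quick way to compare images before benchmarking them.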


muxator 3120 days ago

Am I totally misunderstanding something, or is there no Dockerfile referenced? If so, is this just a binary blob I pull from a public registry?

I have some trust concerns about basing my infrastructure on something opaque.


frankwiles 3120 days ago

Absolutely! Was just an oversight on our part. Here is the Dockerfile and build info: https://github.com/revsys/optimized-python-docker


muxator 3120 days ago

Super fast, thanks!


alex- 3120 days ago

This was just mentioned today at North Bay Python.

Another option for a quick speedup from changing your base image is to try the PyPy images.


frankwiles 3120 days ago

Yeah I thought it would be nice to release this at North Bay today. We also released our new website redesign (and new infrastructure setup) last night so there is probably some dust on a few things we'll clean up over the next few days.


alex- 3120 days ago

It looks very useful. Thanks for the image and for sponsoring events like this!


StavrosK 3121 days ago

Why aren't these just upstreamed onto the official Python binaries?


mastax 3121 days ago

I think these optimizations are upstream, but they're not in the canonical 'python' docker image: https://github.com/docker-library/python/issues/160


StavrosK 3121 days ago

Ah I see, that makes sense, thank you.


Twirrim 3120 days ago

The 10%ish speed boost is all down to PGO, I'm assuming?


frankwiles 3120 days ago

I don't think we ran benchmarks with LTO and PGO in isolation, so honestly not sure exactly how much of a contribution each makes on each benchmark. PGO is definitely the bulk however.
