by scottyallen 5414 days ago
This article is missing the best tool of the bunch: virtualenv. Part of the fundamental reason managing python packages is such a pain (or perl packages, or ruby packages, etc) is that they're installed once for the whole system.

Virtualenv allows you to sandbox all your packages in a directory local to your development/deployment environment. You no longer need to install anything on the base system, and you can have multiple virtualenvs side-by-side. In addition, you can copy a virtualenv from your build environment to your production environment by just copying the directory.
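As a rough illustration of the sandboxing idea, here is a minimal sketch that creates two isolated environments side by side. It uses Python 3's standard-library venv module as a stand-in for the virtualenv tool (an assumption, so the example needs no third-party install), and the directory names are hypothetical.

```python
import tempfile
import venv
from pathlib import Path


def make_env(project_dir: Path) -> Path:
    """Create an isolated environment under project_dir/env and return its path."""
    env_dir = project_dir / "env"  # hypothetical directory name
    # with_pip=False keeps the sketch fast and offline; a real setup would
    # normally want pip available inside the environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


# Two environments side by side, each in its own project directory.
workspace = Path(tempfile.mkdtemp())
env_a = make_env(workspace / "app_a")
env_b = make_env(workspace / "app_b")
```

Each environment gets its own site-packages directory, so installing into one should not affect the other or the base system.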

This gives you huge wins in versioning packages, testing out different versions alongside each other, and being able to strongly validate that what you tested in your staging environment is the same thing that got deployed to your production systems.

In general, I'm a very strong believer in pushing _everything_ your production environment relies on to a single directory on your production system, rather than installing it on the base system. It's actually rather surprising that more people haven't moved to this model, and that there aren't better tools to support it.

2 comments

Hi, I'm the author of the article. We've definitely looked at virtualenv but for right now we're just developing a single app. Thus far, we've been able to get away with munging the global package list but this could change in the near future. I tried playing around with virtualenv earlier this year but I couldn't figure out how to push directories OR make virtualenvs reliably relocatable. The main problems for us were/are:

1) We develop on Macs

2) Our deploy environments are a mix of 32-bit/64-bit machines running Ubuntu

3) It wasn't clear if each deployment required its own virtualenv to start from scratch OR if we should reuse a virtualenv (which seemed to defeat the whole point of using virtualenv)

Got any ideas on what we could do?

I've used virtualenv extensively for production deployment in a large environment (100+ servers), with a setup very similar to yours. We developed on Macs as well, and ran RHEL in production. All building/pushing to production was done from an RHEL machine similar to production. Some things that worked for us:

- We deployed tarballs to production machines which unpacked to a single directory that included a virtualenv that had both all our code and all the modules we relied on, and anything else that the code needed to function that wasn't installed in a very basic RHEL install.

- We used the --relocatable option in virtualenv to remove all references to absolute paths, which meant we could copy the virtualenv around to various directories and machines and still have it work.

- We had a series of makefiles that would make/update a virtualenv, and which worked on both mac and linux. We would use this both in development and when deploying. For several packages, we had to hand tune this to work on both mac and linux, but for most things, it Just Worked.

- When we deployed to production, we would unpack the tarball to a directory whose name included a version number for the push. We then had a symlink that pointed at the currently running version. This meant all we had to do to rollback a push was flip the symlink to the previous deployment directory, and restart. This rollback included any modules that changed (since they were in the virtualenv). I haven't seen any other way to reliably do this.
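The tarball-plus-symlink scheme described in the list above could be sketched roughly as follows. This is not the commenter's actual tooling (they mention makefiles). The function names and the `releases/<version>` and `current` layout are assumptions for illustration, and restarting the service after the flip is left out.

```python
import os
import tarfile
from pathlib import Path


def unpack_release(tarball: Path, releases_dir: Path, version: str) -> Path:
    """Unpack a release tarball into releases_dir/<version> and return that path."""
    target = releases_dir / version
    target.mkdir(parents=True)
    with tarfile.open(tarball) as tar:
        # Assumes the tarball comes from a trusted build machine.
        tar.extractall(target)
    return target


def point_current_at(release: Path, current: Path) -> None:
    """Repoint the `current` symlink at the given release directory."""
    tmp_link = current.with_name(current.name + ".tmp")
    if tmp_link.is_symlink():
        tmp_link.unlink()
    os.symlink(release, tmp_link)
    # Renaming the new link over the old one swaps it in a single step on POSIX.
    os.replace(tmp_link, current)
```

Rolling back is then the same call with the previous release directory, for example `point_current_at(releases_dir / "v41", current)`, followed by a restart. Because the virtualenv lives inside the release directory, the modules roll back with the code.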

The one thing you mention that we didn't have to deal with was mixed 32/64-bit environments. One nasty solution is to have two build machines, one 32-bit and one 64-bit, but there's probably a better way...
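If separate 32-bit and 64-bit build machines are used, one way to keep the resulting tarballs apart is to label each build with the platform it was made on. The sketch below is only an illustration: the tag format and the tarball naming scheme are hypothetical, not something from our setup.

```python
import platform
import struct


def build_tag() -> str:
    """Describe the current machine as '<os>-<machine>-<bits>bit'."""
    bits = struct.calcsize("P") * 8  # pointer size of this Python interpreter
    return f"{platform.system().lower()}-{platform.machine().lower()}-{bits}bit"


def tarball_name(app: str, version: str, tag: str) -> str:
    """Hypothetical naming scheme so each host can pick the build matching its tag."""
    return f"{app}-{version}-{tag}.tar.gz"
```

A deploy script could call `build_tag()` on the target host and refuse to unpack a tarball whose tag differs.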

> - We deployed tarballs to production machines which unpacked to a single directory that included a virtualenv that had both all our code and all the modules we relied on, and anything else that the code needed to function that wasn't installed in a very basic RHEL install.

Ah, we're doing tarballs too, but only for our code. We're having pip munge the global package list at the beginning of every deployment (somewhat dicey, but with requirements files it's easy to roll back package upgrades/downgrades), which does mean that packages can change underneath a running Python process :O

> - We used the --relocatable option in virtualenv to remove all references to absolute paths, which meant we could copy the virtualenv around to various directories and machines and still have it work.

Hmmm, this could work. I was under the impression --relocatable wasn't fully tested, but if it works for you, we'll definitely take a look at it :)

> - We had a series of makefiles that would make/update a virtualenv, and which worked on both mac and linux. We would use this both in development and when deploying. For several packages, we had to hand tune this to work on both mac and linux, but for most things, it Just Worked.

Yeah, I took a look at virtualenvwrapper to handle the multi-environment thing, but that was primarily geared toward multiple apps. It probably got me more confused than necessary. I'll take a look at plain old virtualenv with scripts.

> - When we deployed to production, we would unpack the tarball to a directory whose name included a version number for the push. We then had a symlink that pointed at the currently running version. This meant all we had to do to rollback a push was flip the symlink to the previous deployment directory, and restart. This rollback included any modules that changed (since they were in the virtualenv). I haven't seen any other way to reliably do this.

We do the exact same thing, minus the modules. We do have an issue with packages changing underneath versions but if we don't bounce the servers they should have the "old" modules already imported. We should fix this though :)

> The one thing you mention that we didn't have to deal with was mixed 32/64-bit environments. One nasty solution is to have two build machines, one 32-bit and one 64-bit, but there's probably a better way...

Yeah, our solution was to leave it to pip :)

I'm relatively new to using virtualenv myself, but I think you replicate a virtualenv on a different OS, architecture, or location by creating a new virtualenv on the target machine and then using pip to reinstall all the packages. You use "pip freeze" in your source environment to generate the package list, then "pip install --requirement=<filename>" (or "pip install -r <filename>") to reinstall the packages on the target.
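The freeze-and-reinstall approach can be sketched as two small steps, one run in the source environment and one on the target. The helper names are made up for illustration, and the interpreter paths are whatever `python` lives inside each virtualenv. Only `pip freeze` and `pip install -r` themselves are real pip commands.

```python
import subprocess
from pathlib import Path


def freeze_cmd(python: str) -> list:
    """Command that lists the packages installed for the given interpreter."""
    return [python, "-m", "pip", "freeze"]


def install_cmd(python: str, requirements: Path) -> list:
    """Command that installs everything listed in a requirements file."""
    return [python, "-m", "pip", "install", "-r", str(requirements)]


def write_requirements(source_python: str, requirements: Path) -> None:
    """Run in the source environment: save its package list to a file."""
    result = subprocess.run(
        freeze_cmd(source_python), check=True, capture_output=True, text=True
    )
    requirements.write_text(result.stdout)


def install_requirements(target_python: str, requirements: Path) -> None:
    """Run on the target machine, inside a freshly created virtualenv."""
    subprocess.run(install_cmd(target_python, requirements), check=True)
```

Because packages are rebuilt on the target, compiled modules end up matching the target's platform, at the cost of needing a compiler and network access there.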

> you can copy a virtualenv from your build environment to your production environment by just copying the directory

Not if you're developing on a different platform/architecture than you're deploying to (e.g. OS X vs. Linux). Unless you're only using pure Python modules, which excludes PIL and most database drivers.
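One rough way to check whether a given virtualenv contains anything beyond pure Python modules is to scan it for compiled files. This is only a heuristic sketch: the suffix list is an assumption, and it misses versioned libraries such as `libfoo.so.1` as well as the interpreter binary itself.

```python
from pathlib import Path

# File suffixes that usually indicate compiled, platform-specific code.
BINARY_SUFFIXES = {".so", ".pyd", ".dylib", ".dll"}


def find_compiled_files(env_dir: Path) -> list:
    """Return files under env_dir that look like compiled extensions or libraries."""
    return sorted(
        p for p in env_dir.rglob("*")
        if p.is_file() and p.suffix in BINARY_SUFFIXES
    )
```

An empty result suggests the installed packages are pure Python. Any hits mean the directory is tied to the platform it was built on.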

Other than that, you're very right.

Yes, right. However, your build/staging environment should be the same platform/architecture as your production environment, for lots of reasons other than just this. Development environments can be different, of course...