by aaron42net 4256 days ago
We're moving all of production in EC2 from an old CentOS 5 image managed by capistrano to CoreOS, with fleet deploying images built and hosted by the docker.io build service and private repo. I love it.

Every week, we rebuild our base image from the latest debian:stable image and apply updates, and our apps are then built on top of the latest base image. So distro security updates are automatically included in our next deploy.

We had been deploying multiple apps to the same EC2 instances. Having each app's dependencies be separate from other apps has made upgrading them easier already.

This also means all containers are ephemeral and are guaranteed to be exactly the same, which is a pretty big change from our use of capistrano in practice. I'm hoping this saves us a lot of debugging hassle.

Instead of using ELBs internally, I'm using registrator to register the dynamic ports of all of my running services across the cluster in etcd, with confd rendering a new nginx config from a template and reloading nginx within 5 seconds if a service comes up or drops out. Apps only need to talk to their local nginx (running everywhere) to find a load-balanced pool of whichever service they are looking for. nginx is better than ELB at logging and at retrying failed requests, which gives a better user experience during things like deploys.
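
To make the registrator/etcd/confd flow above concrete, here is a rough Python sketch of the rendering step only: turning service registrations into nginx upstream blocks. This is not confd itself (confd uses its own Go templates), and the `/services/<service>/<instance-id>` key layout, the `render_upstreams` name, and the sample addresses are assumptions made for illustration.

```python
from collections import defaultdict


def render_upstreams(registrations):
    """Render nginx upstream blocks from {key: "host:port"} registrations.

    Keys are ASSUMED to look like "/services/<service>/<instance-id>";
    the real layout depends on how registrator is configured.
    """
    pools = defaultdict(list)
    for key, address in registrations.items():
        parts = key.strip("/").split("/")
        if len(parts) != 3 or parts[0] != "services":
            continue  # ignore keys outside the assumed layout
        pools[parts[1]].append(address)

    blocks = []
    for service in sorted(pools):
        lines = [f"upstream {service} {{"]
        for address in sorted(pools[service]):
            lines.append(f"    server {address};")
        lines.append("}")
        blocks.append("\n".join(lines))
    # Sorted output keeps the rendered file stable when nothing has changed.
    return "\n\n".join(blocks) + ("\n" if blocks else "")


if __name__ == "__main__":
    # Hypothetical registrations for two services on two hosts.
    print(render_upstreams({
        "/services/api/host1:api:8080": "10.0.0.1:49153",
        "/services/api/host2:api:8080": "10.0.0.2:49160",
        "/services/web/host1:web:80": "10.0.0.1:49154",
    }))
```

Sorting the services and addresses means the output only changes when the set of registrations changes, so a watcher can compare the new render against the current file and skip the nginx reload when they match.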

Some of these things could be solved by spinning up more EC2 instances. However, that usually takes minutes, whereas docker containers take seconds, which changes the experience dramatically.

And I'm actually reducing my spend by being able to consolidate more. I can say things like "I want one instance of this unit running somewhere in the cluster" rather than having a standalone EC2 instance for it.

1 comments

The biggest problem I have overall is pushing new code. When you push new code to git, do you then stop a container and restart it to get a new container working? (Assuming you do something like git clone in the Dockerfile)
I grant the docker.io build and private repository service access to my GitHub repo and drop a Dockerfile at the root of the repo. The build server does the checkout outside of the Dockerfile and then executes the Dockerfile. I then use a GitHub webhook to trigger a build when there's a new checkin to the master or qa branches. If the Dockerfile build completes successfully (based on exit status codes), it spits out new docker images tagged with either "master" or "qa".
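
The branch-to-tag rule described above can be sketched in a few lines of Python. This is only an illustration of the mapping, not how the docker.io build service implements it; the `image_tag_for_ref` helper and `BUILD_BRANCHES` constant are hypothetical names. The one thing taken from GitHub is that a push webhook payload carries a `ref` field such as `refs/heads/master`.

```python
# Branches that should produce an image, per the setup described above.
BUILD_BRANCHES = {"master", "qa"}


def image_tag_for_ref(ref):
    """Return the image tag for a pushed git ref, or None if no build is wanted.

    `ref` is the "ref" field of a GitHub push event, e.g. "refs/heads/master".
    """
    prefix = "refs/heads/"
    if not ref.startswith(prefix):
        return None  # tags and other refs never trigger a build
    branch = ref[len(prefix):]
    return branch if branch in BUILD_BRANCHES else None


if __name__ == "__main__":
    for ref in ("refs/heads/master", "refs/heads/qa", "refs/heads/feature/x"):
        print(ref, "->", image_tag_for_ref(ref))
```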

My fleet unit does a docker pull before it starts the container. So I just stop and start an individual unit to get it to run a new version.

Though fleet has a concept of global units (that run on all available machines), there's no way to do a rolling restart with them yet. Instead, I use unit templates to launch multiple instances of the same unit, then stop and start each instance individually, waiting for each one to respond before continuing to the next. I intend to catch a webhook from the build server and do this automatically, but haven't written this yet.
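
The manual rolling restart described above could be automated roughly like this. It is a minimal sketch rather than the script I'd actually ship: the unit names, the `is_healthy` callback, and the timeout values are assumptions, and it relies only on `fleetctl stop` and `fleetctl start`.

```python
import subprocess
import time


def fleetctl(*args):
    """Run a fleetctl subcommand, raising if it exits non-zero."""
    subprocess.run(["fleetctl", *args], check=True)


def rolling_restart(units, is_healthy, run=fleetctl, timeout=60.0,
                    interval=1.0, sleep=time.sleep):
    """Restart units one at a time, waiting for each to respond.

    `is_healthy(unit)` is a caller-supplied check (for example an HTTP
    request against the service); it is hypothetical, not part of fleet.
    """
    for unit in units:
        run("stop", unit)
        run("start", unit)  # the unit's own docker pull fetches the new image
        deadline = time.monotonic() + timeout
        while not is_healthy(unit):
            if time.monotonic() >= deadline:
                # Abort so the remaining instances keep serving the old version.
                raise TimeoutError(f"{unit} did not respond within {timeout}s")
            sleep(interval)
    return list(units)


if __name__ == "__main__":
    # Hypothetical template instances; replace the lambda with a real check.
    rolling_restart(["myapp@1.service", "myapp@2.service"],
                    is_healthy=lambda unit: True)
```

Raising on the first instance that fails to come back leaves the rest of the pool untouched, so a bad image takes out at most one instance.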