by nzoschke 3503 days ago
Thanks for sharing your experience as another cautionary tale.

At Convox we have been running Docker in prod for 18 months successfully.

The secrets?

1. Don't DIY. Building a custom deployment system with any tech (Docker, Kubernetes, Ansible, Packer, etc.) is a challenge. All the small problems add up to one big burden on you. Six months later you look back at a lot of wasted time...

2. Don't use all of Docker. Images, containers and the logging drivers are all simple and great. Volumes, networks and orchestration are complex.

3. Use services. Using VPC is far simpler than Docker networking. Using ECS is much easier than maintaining your own etcd or Swarm cluster. Using CloudWatch Logs is cheaper and more reliable than deploying a logging contraption into your cluster. Using a DB service like RDS is far easier than building your own reliable data layer.

Again thanks for sharing your experience as a cautionary tale.

If you are starting a new business you should not take on building a deployment system as part of the challenge.

Use a well-built and peer-reviewed platform like Heroku, Elastic Beanstalk or Convox.

8 comments

At Faraday, we've also been running Docker in production for over a year, and we're very happy with how well things have been working.

As nzoschke recommends, we rely heavily on ECS, RDS and other managed services. We avoid exotic new Docker features until somebody else has successfully used them in production. We use DNS, load balancers and regular networking for discovering containers and communicating between them.
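To make the "DNS and regular networking" point concrete, here is a minimal sketch of DNS-based discovery in Python. It is not Faraday's actual tooling: the function name `discover` and the hostname used are hypothetical, and it assumes each service sits behind an ordinary DNS name (for example, a load balancer record).

```python
import socket


def discover(hostname, port, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated (ip, port) pairs that hostname resolves to.

    `resolver` is injectable only so the lookup can be replaced in tests;
    by default it is the standard library's socket.getaddrinfo.
    """
    infos = resolver(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr),
    # and sockaddr always starts with (ip, port).
    return sorted({(info[4][0], info[4][1]) for info in infos})


if __name__ == "__main__":
    # "localhost" stands in for a real service name (hypothetical example).
    for ip, port in discover("localhost", 8080):
        print(f"{ip}:{port}")
```

The point of the sketch is that nothing Docker-specific is involved: a client only needs a hostname and a port, and the platform (DNS plus a load balancer) decides which containers answer.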

And it all basically just works. The worst problem we've encountered is that twice a year, we deploy a new version of ecs-agent to our staging cluster and need to revert it because of issues.

For us, the biggest challenge has been setting up a good local workflow for developing Docker apps with multiple services and multiple underlying git repositories. The docker-compose tool is great but it doesn't go far enough and doesn't provide enough structure. We've open sourced some of our internal Docker dev tools here http://blog.faraday.io/announcing-cage-develop-and-deploy-co... but we think there's a lot more which could happen to make it easy to develop complicated apps.

As another Faraday employee, I'll add that we didn't start right away with deployment. Instead, we used Docker to provide a consistent way to test our apps. Once all of our apps were properly dockerized to run tests, we were able to move to deploying the app with a simple docker-machine + docker-compose setup. After that, it was a relatively easy move to ECS. docker-compose can probably handle 80% of your development needs, but the tool we wrote allowed us to quickly switch our apps from a canonical "image mode" to "source mode" while developing. It also provided a way to run acceptance tests locally against our entire app cluster. A little more info: https://dkastner.github.io/2015/07/25/accidental-continuous-...
About a year ago we opted for azk.io as an alternative to docker-compose.
Re 1: Why wouldn't you recommend going with kubernetes? From someone that hasn't deployed anything with it (yet), it seems rather straight-forward if run on a managed cluster (gke). I might be wrong and am thus genuinely interested in what you did for deployments.
AFAICT, gke would be #3 (use services) not #1 (don't DIY)
That's right. GKE is awesome. I highly recommend using the GKE service over running Kubernetes software on AWS or DigitalOcean.
I'm relieved, thanks guys. I really don't see the complexity when deploying to a (managed) k8s cluster. Sure, we'll still have to configure a deployment pipeline, add some scripts to push and deploy the right images etc., but overall, k8s seems really well suited to the task.
k8s is a great complement for docker. Usually any arguments against docker are in a context where you are running it outside of k8s.
I've been running Docker in production for half this time and came to pretty much the same conclusion. ECS has its issues, but is still much more manageable than anything else (on AWS). And if I started today, I would definitely have used Convox (iirc, at the time you didn't support private subnets).

I never liked CloudWatch Logs and recently switched it for an ELK stack (hosted by logit.io).

If the argument is "don't DIY" then the software is kinda broken. The whole purpose of a software project is to have users... use it. If you're going the way of outsourcing why stop at using ECS, just pay someone to run your software business and focus on... idk, marketing?
No it's not - the don't DIY applies to whatever isn't your core competency, not all software building.
I wrote up even more of my thinking about this in blog form:

https://convox.com/blog/docker-in-production/

With all due disrespect and without commenting on the efficacy of using convox or any other platform, this becomes too self serving.

If the problem is Docker complexity, that needs to be solved. If the problem is the difficulty of running stateless apps, or distributed storage and networking, these need to be highlighted and widely understood.

Using Convox or any other platform will not magically make them disappear. You still need to troubleshoot and understand what you are running. Another layer to hide the complexity can hardly help.

> Another layer to hide the complexity can hardly help.

I strongly agree! I'm sensitive to self-serving, but while Convox is another layer, it does make problems disappear by disallowing them.

Use Convox and try to boot or deploy a docker-compose.yml that uses networking or uses volumes incorrectly. You are blocked from doing so with a nice reason why.
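As a rough illustration of "disallowing problems with a nice reason why", here is a minimal Python sketch of a pre-deploy check on an already-parsed docker-compose config. The function name `check_compose` and the rules it applies are hypothetical and are not Convox's actual checks; it only handles the short `source:target` volume syntax.

```python
def check_compose(config):
    """Return a list of human-readable rejection reasons (empty if the config passes).

    `config` is a docker-compose file already parsed into a dict.
    The rules below are illustrative assumptions, not any platform's real policy.
    """
    problems = []
    if "networks" in config:
        problems.append(
            "top-level 'networks' is not allowed: rely on the provider's network (e.g. VPC)"
        )
    for name, service in sorted(config.get("services", {}).items()):
        if "networks" in service:
            problems.append(f"service '{name}': custom networks are not allowed")
        for volume in service.get("volumes", []):
            # Short-form volumes look like "source:target"; a source that is a
            # filesystem path means a bind mount from the host.
            source = str(volume).split(":", 1)[0]
            if source.startswith(("/", ".", "~")):
                problems.append(
                    f"service '{name}': host path volume '{volume}' is not allowed"
                )
    return problems


if __name__ == "__main__":
    example = {
        "services": {"web": {"image": "app", "volumes": ["/var/data:/data"]}},
    }
    for reason in check_compose(example):
        print("blocked:", reason)
```

The design choice is to fail before anything is deployed and to say why, rather than letting a misconfigured volume or network surface later as a production incident.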

> If the problem is difficulties of running stateless apps

This is not the problem. In fact this is a solved problem.

> or distributed storage and networking

These are hard problems.

Networking is largely solved with AWS VPC. Other providers do this very well too. This supports your earlier point: adding the Docker networking layer generally doesn't help the problem. I take it for granted that any new networking stack shouldn't be trusted.

Distributed storage is barely solved anywhere.

Oops, typo there. That should have been 'with all due respect' not disrespect.
What's your opinion of docker services (networking, compose, swarm) versus k8s (cni, kubeadm)?

I suppose it's much more relevant for those of us who are not on AWS.

How do you define "custom deployment system"? Jenkins multi-branch pipeline deploy workflow?