Hacker News new | ask | show | jobs
by dasil003 3981 days ago
There was also a time when most people thought they didn't need version control. Back in the 80s and 90s it was a justifiable viewpoint because existing version control systems sucked.

The problem with Docker is not that it doesn't solve (or attempt to solve) widespread problems. At its best, Docker gives you dev/production parity, and dependency isolation which is useful even for solo developers working part-time. The problem is that it's not a well-defined problem that can be solved by thinking really hard and coming up with an elegant model—like, for example, version control—it's messy and the effort to make it work isn't worth it most of the time right now.

That's no reason to write off Docker though. Pushing files to manually configured servers or VPSes is messy and leads to all kinds of long-term pain. You can add Chef / Puppet, but it turns into its own hairy mess. There's no easy solution, but from where I stand, the abstraction that Docker/LXC provide is one that has the most unfulfilled promise in front of it.

2 comments

> At its best, Docker gives you dev/production parity

I get that when I use the same OS and built-in package manager?

I would virtualize the environment using something like VirtualBox for my dev and EC2/DigitalOcean/etc on prod.

> and dependency isolation

If you're going to scale something, you're going to split everything out on different virtualized servers anyway, so you'll get your isolation that way.

Basically, current mainstream practice is to virtualize on the OS level, where as Docker is pushing to have things virtualized on the process level.

I personally don't see the advantage ... just more complexity in your stack. I never have to mess with the current virtualization structure, I don't even see it. It looks just like a "server", even though it's not. Isn't that better?

To be fair, I've worked in places where all the devs were on the same OS, same version, and we still had problems.

But I agree, just use VirtualBox. I know Idea already supports deploying to VMs and you they just look like another machine, so no learning curve. All the benefits with none of the hassle.

Yeah, but then there's still the issue of secrets, you need to have testing PayPal credentials, testing mailing service credentials, etc. There's the issue of deploying changes fast without leaving files in an inconsistent state (you don't want half of some file to run). How about installing the required dependencies?

I don't use Docker, but those are problems I can think of off the top of my head.

Docker doesn't credibly solve the credentials problem and the other problems you outline (which do exist) are as practically solved with something like Packer. And I mean, I'm not a Packer fan--oh look, VirtualBox failed to remove a port mapping for the VM that just shut down, throw away the whole build--but it's built on much, much more battle-tested technology with a much wider base of understanding.

(And, later, if you want to play with Docker, Packer lets you do that too. But you should use the Racker DSL in any case, because life is too short to deal with Packer's weird JSON by hand.)

Thanks for pointing me to Racker (https://github.com/aspring/racker). I'm currently building Packer and Terraform images with chunked together Python scripts that work, but I wouldn't call them a great solution. I'm actually using Packer specifically so I can start with regular EC2, and then move to a more Docker-based infrastructure.
Packer severely frustrates me, with the maddening regularity in which it fails just for funsies. Or the consistent but completely inane ways that it fails, like refusing to proceed based on not finding a builder for an 'only' or 'except' clause (making it nearly impossible to re-use provisioners and post-provisioners across multiple projects). Racker does help--my shared Racker scripts are in a Ruby gem--though I think that it pretty much removes Packer as anything more than a dummy solution into which you dump directives on a per-builder basis. As a tool that you carefully feed the bare minimum of information to do its job in any specific situation, though, it works okay.

Terraform, on the other hand, I think is a huge, huge mess, and I don't think they're going to fix it. I wrote a Ruby DSL for it the last time I tried to use it in anger, only to encounter that Terraform didn't honor its own promises around the config language it insisted on instead of YAML or a full-featured DSL of its own. Current client uses it, and every point release adds new and exciting bugs and regressions in stuff that should be caught by the most trivial of QA. For AWS, I strongly recommend my friend Sean's Cfer[1] as a better solution; CloudFormation's kind of gross, but Cfer helps.

[1] - https://github.com/seanedwards/cfer

Credentials have to be managed separately from Docker anyway.

> There's the issue of deploying changes fast without leaving files in an inconsistent state (you don't want half of some file to run). How about installing the required dependencies?

rpm / dpkg also install dependencies, are quite fast and well tested. They have the advantage of working in a standard environment which most sysadmins know but the disadvantage that you need to configure your apps to follow something like LSB (e.g. install to standard extension locations rather than overwriting system files, etc.).

The one issue everything has is handling replacement of a running service and that's not something which Docker itself solves – either way you need some higher level orchestration system, request routers, etc. Some of those systems assume Docker but that's not really the value for this issue.

> the disadvantage that you need to configure your apps to follow something like LSB (e.g. install to standard extension locations rather than overwriting system files, etc.).

Common misconception. You only need to do this if you're going to try to push the packages upstream. If they're for your own consumption, you can do what you like. Slap a bunch of files in /opt, and be done with it - let apt manage versions for you and be happy.

As with many things, this is one area where you've just got to know what to ignore. It's simpler than it looks.

I think we're actually talking about the same thing – I said “like LSB” simply to denote following some sort of consistent pattern, which will vary depending on how widely things are shared.

/opt is defined in FHS for local system administrator use, so installing your company's packages there is actually the recommended way to avoid conflict with any LSB-compliant distribution as long as you use /opt/<appname> instead of installing directly into the top-level /opt:

http://www.pathname.com/fhs/pub/fhs-2.3.html#OPTADDONAPPLICA...

rpm and fast don't really go together. dpkg is much better. dnf will be interesting.
> Yeah, but then there's still the issue of secrets

How would Docker help with this? Genuinely curious.

I store them in bash scripts outside the repo that populate the relevant data into environment variables and execute the code. The code then references the environment variables.

> How about installing the required dependencies?

There are two kinds. On the OS level and on the platform level.

On the OS level, you can have a simple bash script. If you need something more complex, there are things like Chef/Puppet/etc.

On the platform level, you have NPM/Composer/PIP/etc which you can trigger with a simple cron script or with a git hook.

> There's the issue of deploying changes fast without leaving files in an inconsistent state

So the argument here is that you're replacing one file in one go vs possibly thousands? That in the latter scenario the user might hit code while it's in the process of being updated?

Ok. With docker, you would shut it down to update. You would have to.

Same goes for the traditional deployment? Shut it down, update, start it back up?

You can, of course, automate all of this with web hooks on Github/Bitbucket, for both docker and the traditional deployment.

The traditional deployment should also be faster, since it's an incremental compressed update being done through git.

Kubernetes secrets are a really great solution to this problem. [1] They are stored at the cluster level and injected into the pod (group of containers deployed together) via a file system mount. This means that each pod only has access to its secrets which is enforced by the the file system namespace. If an entire machine is compromised, only the secrets of pods currently scheduled onto that machine are able to be stolen. That's a high level, but it's worth taking a look at the design doc.

Edit: forgot to mention, the file system mount means that they don't need to be in env var, which are fairly easy to dump if you have access to the box or are shipping containers around in plain text.

1. https://github.com/GoogleCloudPlatform/kubernetes/blob/maste...

I don't know if Docker helps with this, I don't use Docker. But some kind of solution has to exist.

How AWS does updates is it first downloads the new code into a separate folder and then switches the link to point to the new folder instead.

But AWS has an unsatisfactory feeling because it downloads the entire code instead of doing a git update. These are all issues that could be fixed, and someone has to do them. I have no idea if Docker helps with any of them, but the opportunity is still there.

Ansible, systemd and go is stealing my heart at the moment. Basically pick the tech that doesn't cause the problems to start with.

I still reckon that the main reason VMware ESX is as successful as it is comes down to the lack of isolation and sheer deployment hell that windows has been for years. The same can be said for python or ruby on a Linux machine for example. Docker removes some of that pain like ESX does.

That's a bit of my point... If you're building relatively small independent services with Docker you can deploy service A with node 0.10 as it's tested environment and service B with iojs 2.4 on the same server, without them conflicting... when you need to update/enhance/upgrade service A you then can update the runtime.

The same can be said for ruby, python and any number of other language environments where you have multiple services that were written at different times with differing base targets. I've seen plenty of instances where updating a host server to a new runtime breaks some service that also runs on a given server.

With docker, you can run them all... given, you can do the same with virtualization, but that has a lot more overhead. It's about maximum utilization with minimal overhead... For many systems, you only need 2-3 servers for redundancy, but a lot can run on a single server (or very small cluster/set)

I have to agree on ansible, systemd and go... I haven't done much with go, but the single executible is a really nice artifact that's very portable... and ansible is just nice. I haven't had the chance to work with systemd, but it's at least interesting.

> The same can be said for ruby, python and any number of other language environments where you have multiple services that were written at different times with differing base targets. I've seen plenty of instances where updating a host server to a new runtime breaks some service that also runs on a given server.

This is a solved problem in Python and Ruby. In Python, use virtual environments. In Ruby, use RVM. You won't have the issue of one tenant breaking another.

And with node, you can use nvm... however there are libraries and references at a broader scope than just Python, Ruby or Node... Say you need an updated version of the system's lib-foo.

A runtime environment for a given service/application can vary a lot, and can break under the most unusual of circumstances. An upgrade of a server for one application can break another. Then you're stuck trying to rollback, and then spend days fixing the other service. With docker (or virtualization) you can segregate them from each other.

You're correct, but I can see that there's certainly a place for having a single solution that works across all ecosystems.

Also, RVM in production? Sledgehammer to crack a nut :-)

In local system yes, but in production its painful to work with. With RVM for isolation you would create gemsets for each app with specific ruby version. It OK for 2-3 applications, but anything more than that would be a pain to work with. And then if you plan to put everything behind passenger, it would just be too messy. Think of automating this? Would be a nightmare to maintain. Over here containerization does help.
Node has this too : nave, npm and n. But using these tools means that you are not longer using the package manager of the system and this can be a problem sometimes. Eg you need to open your firewall to something else that it is not the standard pkg manager.

I see docker as a valid attempt to fix limitations of existing and broken package system (eg: apt) at a price that I am not yet willing to pay.

Virtual environments doesn't work for the interpreter itself. Not only that some of the packages will need to build c extensions and they use different versions of the same library which might break.
It partially is, until you need native dependencies.
Which version of RVM are we running again?