Hacker News
by pbreit 2193 days ago
Can someone explain the value/purpose of docker to someone who (easily) deploys regular apps to a Digital Ocean droplet?
5 comments

Easily.

* A Docker image normally contains all the dependencies of a program, or a set of programs. You can run libraries and other software at whatever versions you need, even ones not available on your host system; they are already baked into the image. Usually a Docker image only needs a compatible kernel (this is a very lax restriction). It is a damn easy way to distribute software, especially software that is not trivial to install and set up: tired of wrangling with a Grafana installation? Just pick a container from their site. And of course you can mount whatever you need inside the container when you need to, so it has controlled access to your filesystem(s).

* A Docker container normally runs with its own firewall. That is, you explicitly say which IPs / ports are available ("exposed") from the container; everything else is blocked. This helps isolate containers from the internet and from one another, and also helps build private networks between containers that are not exposed outside the host machine. Since containers already talk to each other via a network, it becomes easy to distribute them across many machines when you need to scale.

* A Docker image is built out of layers, and images can share layers. If you are reasonable enough to put common stuff in the bottom layers, then you can have multiple containers with a lot of common software installed inside (like a Node runtime, a JVM, etc.) which is stored on the host system only once.

* Docker images / containers are the standard for many cloud management systems. AWS can run containers directly. K8s operates on containers. Docker itself offers a simple but rather reasonable orchestration tool called docker-compose. It's great for small deployments and for things like running your setup locally, for development and integration testing.
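
To make the layer-sharing point above concrete, here is a toy model in Python. It is only a sketch: it assumes layers are identified purely by a hash of their bytes (real Docker layer identity is more involved), and the image names and sizes are made up.

```python
import hashlib

def layer_id(content):
    """Content-address a layer: identical bytes always give the same id."""
    return hashlib.sha256(content).hexdigest()

def stored_bytes(images):
    """Bytes needed on disk when each distinct layer is kept only once."""
    unique = {}
    for layers in images.values():
        for content in layers:
            unique[layer_id(content)] = len(content)
    return sum(unique.values())

# Hypothetical images: both are built on the same base and runtime layers.
base = b"B" * 1000
runtime = b"R" * 500
images = {
    "app-a": [base, runtime, b"A" * 50],
    "app-b": [base, runtime, b"C" * 70],
}

# Storage if every image carried its own private copy of every layer.
naive = sum(len(content) for layers in images.values() for content in layers)
# Storage when the common bottom layers are shared.
shared = stored_bytes(images)
print(naive, shared)  # prints "3120 1620"
```

The saving only shows up because the common layers sit at the bottom of both images; put the app-specific bytes first and nothing below them can be shared.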

Containers are not always better for everything you can think of. But they solve a number of common problems; some of these problems might be ones you'd like to have solved, some not.

The biggest advantage, though, is having to explicitly define all the edges:

- You need persistent storage? You'd better define it or you'll lose it on the next (re)deploy.
- You need to expose network services? Tell me which ones or it won't work.
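
A rough sketch of what "defining the edges" looks like in practice, written as a small Python helper that assembles a `docker run` command line. The helper name, image name, paths and ports are all hypothetical; `-d`, `--name`, `-v` and `-p` are the usual `docker run` flags for detaching, naming, mounting a volume and publishing a port.

```python
def docker_run_args(image, name, volumes=None, ports=None):
    """Build a `docker run` argv where every edge is spelled out.

    volumes maps host path -> container path (persistent storage),
    ports maps host port -> container port (published services).
    Anything not listed here is simply not available to the container.
    """
    args = ["docker", "run", "-d", "--name", name]
    for host_path, container_path in (volumes or {}).items():
        args += ["-v", f"{host_path}:{container_path}"]
    for host_port, container_port in (ports or {}).items():
        args += ["-p", f"{host_port}:{container_port}"]
    return args + [image]

# Hypothetical app: one data directory survives redeploys, one port is reachable.
cmd = docker_run_args(
    "example/app:1.0",
    "web",
    volumes={"/srv/app-data": "/data"},
    ports={8080: 80},
)
print(" ".join(cmd))
# prints "docker run -d --name web -v /srv/app-data:/data -p 8080:80 example/app:1.0"
```

Drop the `volumes` argument and whatever the app writes to `/data` lives only in the container's own filesystem, so it is gone once the container is replaced.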

If we're talking about small-scale single-server deploys, it makes backup, upgrade/rollback and migration of applications a LOT easier.

As long as you're not talking about a k8s cluster (which you should avoid with application architectures that aren't "cloud native"), I assume you'll have a local docker-compose file which you just start/stop to bring up the entire application stack you need (database/app/proxy server/monitoring/...) with one command, which means all external service dependencies are also contained in one 'stack'.

What I also use it for on small-scale apps is having a test environment of the same software running on the same droplet. I just put a Traefik reverse proxy in front of it that autodetects the docker containers, handles HTTPS/ACME certificates and routes the test-url to the test-containers, the real URL to the "production" containers, and they're all isolated.

The joke that I think actually explains it pretty well is that it eliminates the "well it works on _my_ machine" problem, by not just shipping the code, but shipping the machine.
It's also a great way to make sure your build artifacts are owned by root so you don't accidentally delete them.
*shipping the userland

It's not the equivalent of handing over a VM image. You're still open to unintended sensitivity to kernel versions, for instance.

I know that this joke isn't exactly what's happening. I just feel like it's a good way to explain the _idea_ of Docker to people who have no idea what it is.
Reproducibility is the biggest value in my opinion. A Dockerfile encapsulates all the messy dependencies in a single isolated environment. This also makes deployments easier.
I would say that's portability rather than reproducibility. Docker increases the extent of, but doesn't guarantee, reproducibility.
In which case does it not guarantee reproducibility?
I can answer this one. Sometimes you have lines like this:

    FROM ubuntu:focal
    # refresh the package lists first, otherwise apt can't find the package
    RUN apt-get update && apt-get -y install libssl-dev
    <your app details>
Since libssl-dev gets periodically updated (security updates and whatnot), if you build this now and build it again in a year you're very probably not going to get the same OpenSSL version. So it MIGHT be reproducible, but can easily give you different results depending on updates to the packages and the way your Dockerfile imports external dependencies. And that's before we even mention updates to the base container image.

Of course, you can refer to a specific container image id and pin all your packages, which would go a long way to improving reproducibility.
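
As a rough illustration of what "pin everything" means, here is a small Python check that flags the two kinds of drift mentioned above. It is a sketch under simplifying assumptions: it only understands single-line `FROM` and `RUN apt-get ... install` instructions, the function name is made up, and the digest and package version in the usage example are placeholders, not real values.

```python
import re

def unpinned_inputs(dockerfile):
    """Return findings for inputs that can change between two builds."""
    findings = []
    for raw in dockerfile.splitlines():
        line = raw.strip()
        upper = line.upper()
        if upper.startswith("FROM "):
            image = line.split()[1]
            # A tag like ubuntu:focal can move; a @sha256: digest cannot.
            if "@sha256:" not in image:
                findings.append(f"base image not pinned by digest: {image}")
        elif upper.startswith("RUN ") and "apt-get" in line:
            match = re.search(r"apt-get\b.*?\binstall\b(.*)", line)
            if not match:
                continue
            for word in match.group(1).split():
                if word in ("&&", ";", "\\"):
                    break  # end of the install command
                if word.startswith("-"):
                    continue  # option such as -y
                # apt pins a version with the package=version syntax.
                if "=" not in word:
                    findings.append(f"apt package not pinned to a version: {word}")
    return findings

loose = "FROM ubuntu:focal\nRUN apt-get update && apt-get -y install libssl-dev\n"
# Placeholder digest and version, for illustration only.
pinned = ("FROM ubuntu@sha256:" + "0" * 64 + "\n"
          "RUN apt-get update && apt-get -y install libssl-dev=1.2.3-1\n")

for finding in unpinned_inputs(loose):
    print(finding)
print(unpinned_inputs(pinned))  # prints "[]"
```

Even the pinned variant still depends on the package repository continuing to serve that exact version, so this narrows the drift rather than eliminating it.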

So it's wrong (or rather uninformed) usage of Docker that leads to this, the tech itself is sound and does guarantee reproducibility.
Yes, if you have constant inputs, it will produce mostly constant outputs (timestamps, etc).

But once you start getting time differential and caching, you run into... if not bugs, faults - your cached image came from a month ago, apt is now invalid. Sure, I should've used a private dpkg repository, but I want the side-effects from all of those.

A docker image I build today is very likely to be ostensibly the same as the one I build in five minutes, but that isn't guaranteed, because the cache works at the image-layer level and the externalities (see: every possible step that isn't an ADD or COPY) are variable.
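
Here is a toy Python model of that cache behaviour, in case it helps. It assumes, as a simplification, that the cache key for a step is just the chain of instruction strings so far, and that a RUN step's result depends on whatever the package repository serves at build time. The `repo_version` values are made-up labels.

```python
import hashlib

def build(instructions, cache, repo_version):
    """Simulate a build: reuse cached layers, otherwise 'execute' the step."""
    key = ""
    layers = []
    for instruction in instructions:
        # The key depends only on the text of the steps so far,
        # not on the state of the outside world.
        key = hashlib.sha256((key + "\n" + instruction).encode()).hexdigest()
        if key not in cache:
            if instruction.startswith("RUN"):
                # Cache miss: the result picks up the current repository state.
                cache[key] = f"{instruction} -> packages at {repo_version}"
            else:
                cache[key] = instruction
        layers.append(cache[key])
    return layers

steps = ["FROM ubuntu:focal", "RUN apt-get update && apt-get -y install libssl-dev"]

warm_cache = {}
first = build(steps, warm_cache, repo_version="v1")
later_cached = build(steps, warm_cache, repo_version="v2")  # same machine, cache hit
later_fresh = build(steps, {}, repo_version="v2")           # another machine, cold cache

print(first == later_cached)  # prints "True"
print(first == later_fresh)   # prints "False"
```

Same Dockerfile, same day, two different images: which one you get depends on whether the RUN layer was already cached, which is exactly the kind of externality the Dockerfile itself doesn't capture.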

It makes a very valiant attempt at consistency, to be clear, and I have my own gripes about it, but it's not like it creates a new planet on which technology can be built; it's more like creating a moon on which you can build a base, but you're still subject to the orbit of the earth.

> the tech itself is sound and does guarantee reproducibility

Does the tech guarantee reproducibility if you can use it to create un-reproducible artifacts? I don't think Docker claims anywhere to guarantee reproducible builds...

Docker doesn't guarantee reproducibility any more than a tar file in that context.
It's for reifying brittle engineering - if you're doing well without it, you're already at a level above it.
Yeah, I've had a hard time justifying using docker yet too, but hopefully I find a compelling case