by cmeacham98 459 days ago
Because Docker/OCI/etc got the most important part right (or at least much better than the alternatives): distribution.

All you need to start running a Docker container is a location and tag (or hash). To update, all you do is bump the tag (or hash). If a little more complicated setup is necessary (environment variables, volumes, ports, etc) - this can all be easily represented in common formats like Docker compose or Kubernetes manifests.
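
To make the "location and tag (or hash)" point concrete, here is a minimal sketch in POSIX shell that splits an image reference into those parts. The `split_ref` helper is hypothetical and deliberately simplified (the real reference grammar has more rules), and the digest in the last call is a made-up placeholder rather than a real image hash.

```shell
#!/bin/sh
# Split an image reference into its location and its tag or digest.
# Simplified on purpose: this is an illustration, not Docker's own parser.
split_ref() {
    ref=$1
    case $ref in
        *@*)
            # Digest form: everything after '@' is the content hash.
            printf 'location=%s digest=%s\n' "${ref%%@*}" "${ref#*@}"
            ;;
        *)
            # Only the final path component may carry a tag, so a registry
            # port such as localhost:5000 is not mistaken for one.
            last=${ref##*/}
            case $last in
                *:*)
                    printf 'location=%s tag=%s\n' "${ref%:*}" "${last##*:}"
                    ;;
                *)
                    # No tag given; Docker falls back to "latest".
                    printf 'location=%s tag=latest\n' "$ref"
                    ;;
            esac
            ;;
    esac
}

split_ref public.ecr.aws/docker/library/ubuntu:24.04
split_ref alpine
split_ref alpine@sha256:0123abcd   # placeholder digest, not a real one
```

Updating a deployment then amounts to changing the part after the `:` or `@` and pulling again.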

How do you start running a systemd-nspawn container? Well first, you bootstrap an entire OS, then deal with that OS's package manager to install the application. You have to manage updates with the package manager yourself (which likely aren't immutable). There's no easy declarative config - you'll probably end up writing a shell script or using a third party tool like Ansible.

There have been many container/chroot concepts in the past. Docker's idea was not novel, but they did building and distribution far better than any alternative when it first released, and it still holds up well today.


Yeah, this. Docker/container's greatest feature is less the sandboxing than the distribution. The sandboxing is essential to making the distribution work well, but it's a side feature most of the time
It’s kind of funny that people think of “sandboxing” as the main feature of containers, or even as a feature at all. The distribution benefits have always been the entire point of Docker.

The logo of Docker is a ship with a bunch of shipping containers on it (the original logo was clearer, but the current logo still shows this). “Containers” has never been about “containment”, but about modularity and portability.

Docker introduced an ambiguity in the meaning of the word "container". The word existed before Docker, and it was about sandboxing. Docker introduced the analogy of the shipping container, which, as ranger207 says, is about sandboxing in the service of distribution.

The two meanings - sandboxing and distribution - have coexisted ever since, sometimes causing misunderstandings and frustration.

It's not about sandboxing or distribution, it's about having a regular interface. This is why the container analogy works. In the analogy the ship is a computer and the containers are programs. Containers provide a regular interface such that a computer can run any program that is packaged up into a container. That's how things like Kubernetes work. They don't care what's in the container, just give them a container and they can run it.

This is as opposed to the "old world" where computers needed to be specifically provisioned for running said program (like having interpreters and libraries available etc.), which is like shipping prior to containers: ships were more specialised to carrying particular loads.

The analogy should not be extended to the ship moving and transporting stuff. That has nothing to do with it. The internet, URLs and tarballs have existed for decades.

Docker containers ran as root by default for a great number of years. I'm not even sure if it has now finally been changed.

They provided no sandboxing whatsoever.

That’s a horrendously bad take, running as uid0 in the container doesn’t mean “no sandboxing whatsoever”. You’re still namespaced with respect to pids/network interfaces/filesystem/etc, and it’s not supposed to be possible to escape it, even when running as root in the container.

Is it possible to do container escapes on occasion? Yes, but each of those is a bug in the Linux kernel that is assigned a CVE and fixed.

Running as non-root in the container is an additional layer of security but it’s not all-or-nothing: doing so doesn’t make you perfectly secure (privilege escalation bugs will continue to exist) and not doing so doesn’t constitute “nothing whatsoever”.
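
For anyone who wants to see the namespacing mentioned above for themselves, here is a minimal sketch (Linux only, and assuming `readlink` is available) that prints the namespace identifiers of the current process. The `show_ns` name is just an illustrative helper. Running it once on the host and once inside a container and comparing the numbers shows which namespaces are separate; which ones actually differ depends on how the container was started.

```shell
#!/bin/sh
# Print the namespace identifiers of the current process (Linux only).
# Each /proc/self/ns entry is a symlink whose target looks like
# "pid:[<inode number>]"; two processes share a namespace when the
# targets match.
show_ns() {
    for ns in pid net mnt uts ipc user; do
        if [ -L "/proc/self/ns/$ns" ]; then
            readlink "/proc/self/ns/$ns"
        else
            # Not Linux, or this namespace type is not exposed here.
            printf '%s: unavailable\n' "$ns"
        fi
    done
}

show_ns
```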

I see you're not aware of `mknod`?

> Is it possible to do container escapes on occasion? Yes, but each of those is a bug in the Linux kernel that is assigned a CVE and fixed.

No bug, if you have permissions to run mknod it's an entirely by design escape that docker lets you do :)

I wasn't talking about kernel bugs, of course there have been a lot of those causing escapes. I am talking about the default configuration that does absolutely 0 sandboxing. And it's not a bug, it's as intended.

If you want to run as root and don't even touch capabilities… yeah it's root. 0 protection, the stuff in the container is running as root and can easily escape namespaces.

I really wonder how one can escape a container, given a root shell created by `docker run --rm -it alpine:3 sh`, without using a 0day? Using the latest Docker and a reasonably up-to-date Linux kernel, of course.

With the command above it is still possible to attack network targets, but let's just ignore that here. I just wonder how it is possible to obtain code execution outside the namespace without using kernel bugs.

Can you show me how? Like, if I'm in a stock debian-slim container, and have mknod, and I've started as root, how can I get from inside the container to the host? Could I create files or run programs on the host? Portscan localhost? Do something crazy with the docker socket?
> I see you're not aware of `mknod`?

Try harder, friend, those require granted capabilities

  $ PAGER=cat man 7 capabilities | grep -C1 MKNOD

       CAP_MKNOD (since Linux 2.4)
              Create special files using mknod(2).

  $ docker run --rm -it public.ecr.aws/docker/library/ubuntu:24.04 /usr/bin/mknod fred b 252 4
  /usr/bin/mknod: fred: Operation not permitted
It'd be interesting to know what's in your /etc docker configuration :)
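
Since the answer depends on the runtime and its configuration, it is worth checking rather than assuming. Here is a minimal sketch that decodes the effective capability mask of the current process and reports whether the CAP_MKNOD bit is set. The `has_cap` helper is hypothetical; the bit number 27 for CAP_MKNOD comes from linux/capability.h, and the `CapEff` field is read from `/proc/self/status`, so this is Linux only.

```shell
#!/bin/sh
# Report whether a capability bit is set in a hex mask such as the
# CapEff value from /proc/self/status.
has_cap() {
    mask=$1   # hex string without the 0x prefix
    bit=$2    # capability number; CAP_MKNOD is 27
    if [ $(( (0x$mask >> $bit) & 1 )) -eq 1 ]; then
        echo yes
    else
        echo no
    fi
}

# Effective capabilities of the current shell, if /proc exposes them.
cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status 2>/dev/null)
if [ -n "$cap_eff" ]; then
    echo "CAP_MKNOD effective: $(has_cap "$cap_eff" 27)"
fi
```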
Yeah you’re going to need to elaborate and post your sources here. If there’s zero protection at all, show how I can run `docker run -it alpine sh` and break out of the container. Without exploiting any 0days.

No, --privileged doesn’t count. No, --cap-add=<anything> doesn’t count. The claim here is that docker has “zero sandboxing” by default, so you’re going to need to show that you don’t need either of those. Not just moving the goalposts and saying you can break out if you use the command line flag that literally says “privileged”.

Sorry. I agree, but that's a different question. I'll circle back to that then. Why don't technical people make these interfaces, giving the same love to user experience that something like Docker gets? As you said, it is scriptable, and I think -- us all being programmers here -- we all know that means you can just make the interface easier.
Are you implying that docker or podman hasn't been made by _technical people_?
No? I'm not sure I follow. I wouldn't say Apple wasn't made by technical people either. Saying technical people frequently ignore the importance of design does not mean that anyone who recognizes the importance of design is non-technical.