by windexh8er 2528 days ago
For a while I could understand divergent ecosystems for container architectures. But all Red Hat did was reinvent tooling, and not to any significant advantage. I feel like all these tools were the brainchild of Dan Walsh as a rogue marketing campaign via Red Hat to compete with Docker. All these articles are the same... How to replace your exact Docker workflow with Buildah! Now, less than ever, am I incented to use any products that come from Bluehat.

I was never happy with the way docker was designed - it tried to steal too much work from the operating system. Docker should never have had a logging framework, nor should it be a daemon+client talking over a socket, creating permission, indirection and async problems.

Docker is straight-up hostile to systemd: it tried to bite off part of its responsibilities and does not cooperate with it in many areas.

If you want to run a docker image as a system service, it's much easier to do that with podman - the docker image will inherit the system.limits and will behave like a Type=simple service with proper start/stop control and logging.
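To make that concrete, here is a rough sketch of what such a service could look like. It only renders the text of a unit file; the service name, image name and podman path are made up for illustration, and a real unit would likely need more options than this.

```python
def render_unit(name: str, image: str, podman: str = "/usr/bin/podman") -> str:
    """Render a minimal systemd unit that runs a container in the foreground.

    `name`, `image` and the podman path are hypothetical placeholders.
    podman runs as a direct child of systemd, so Type=simple fits and
    stdout/stderr go wherever systemd sends service output.
    """
    lines = [
        "[Unit]",
        f"Description=Container {name}",
        "",
        "[Service]",
        "Type=simple",
        # No -d flag: the container stays attached to the service process.
        f"ExecStart={podman} run --rm --name {name} {image}",
        f"ExecStop={podman} stop {name}",
        "Restart=on-failure",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_unit("web", "registry.example.com/web:latest"))
```

Because there is no daemon in between, resource limits and start/stop handling set on the unit apply to the podman process itself.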

-- add: worth noting that podman and buildah are very much like "docker" and "docker build", to the point that you can do alias docker="podman" and expect all the docker features to work. they consume the same Dockerfiles, build the same OCI images and can use the same registries. trying podman/buildah/skopeo really got me thinking - where's the moby inc. business? how can they commercialize a commodity?
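A small sketch of what that interchangeability means for tooling, assuming only that both CLIs accept the same `build -t TAG CONTEXT` form. The function names and the image tag are made up for illustration.

```python
import shutil


def pick_engine(candidates=("podman", "docker"), which=shutil.which):
    """Return the first container CLI found on PATH, or None.

    `which` is injectable so the lookup can be faked in tests.
    """
    for name in candidates:
        if which(name):
            return name
    return None


def build_command(engine, tag, context="."):
    """Build the argv for an image build; identical for podman and docker."""
    return [engine, "build", "-t", tag, context]


if __name__ == "__main__":
    engine = pick_engine()
    if engine is None:
        print("no container engine found on PATH")
    else:
        # Printed rather than executed; pass the list to subprocess.run to build.
        print(" ".join(build_command(engine, "demo:latest")))
```

The rest of a script written this way does not need to know which engine it ended up with.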

Moby Inc.'s business should be in services related to Docker. On its own, the docker software is just a helper tool to set up some kernel services; any bunch of shell scripts could do that.
> I was never happy with the way docker was designed - it tried to steal too much work from the operating system. Docker should never have had a logging framework, nor should it be a daemon+client talking over a socket, creating permission, indirection and async problems.

> Docker is straight-up hostile to systemd: it tried to bite off part of its responsibilities and does not cooperate with it in many areas.

you had me until the rationale for this was protecting systemd, which is doing the exact same thing..

i had a specific operating system that already came with systemd, and all our company's programs are packaged as systemd services. i don't like the way systemd treats system.limits, udev, logind, the dbus that is impossible to remove, and logging that is worse than rsyslog, but hell, we already paid the price for adopting systemd, so why pay extra for docker exhibiting the same behavior?
They keep claiming their stuff is more secure; is that wrong?

Being beholden to a desperate competitor isn't just marketing; it could be a matter of survival and a strategic response seems reasonable.

> They keep claiming their stuff is more secure; is that wrong?

The docker daemon has a large surface area and also has fairly hefty privileges. It's a juicy target of attack.

A platform can take various actions to further lock down the runtime with AppArmor or SELinux, but out of the box you wind up hearing the motto that "containers don't contain".

Notably, the docker daemon has too many responsibilities. It's a builder. It's a shipper. It's a runner. It's everything to everyone, which makes for a convenient installation process but means that a subversion of any of these functions potentially allows someone to drill sideways into one of the others.

> Being beholden to a desperate competitor isn't just marketing; it could be a matter of survival and a strategic response seems reasonable.

I'm prepared to go with: the engineers thought it was a good idea and product management didn't gainsay them. It's been a common theme from multiple tech companies in the past year or two. Google, Amazon, Red Hat, Pivotal (which is where I work) have all been chipping away at breaking off parts of the daemon's responsibilities.

I hate to engage in this petty in-fighting (especially since I want podman to succeed and actually be as secure and well-designed as rkt was but with OCI runtime support).

Unfortunately, I don't agree with this whole "it must be more secure because we broke it into bits" argument. That alone is not sufficient in order to increase security. The vast majority of the code in libpod/cri-o is very similar to (generate a config and pass it to runc) or copied directly from Docker (containers/storage, with containers/image honestly having quite a few more problems than Docker's image parsers). When I found CVE-2018-15664, not only was the libpod/cri-o stack also vulnerable but it was as vulnerable as Docker was more than 5 years ago when I fixed the original security flaw in 2014. I feel bad saying this (I don't want to blame the folks working on this, who I do respect immensely) but it really should be a serious consideration if you want to put "more secure" in your advertising.
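For anyone who hasn't seen what "generate a config and pass it to runc" amounts to, here is a deliberately tiny sketch. It is not what libpod, cri-o or Docker actually emit: real generators also add mounts, capabilities, seccomp profiles, cgroup settings and much more. The bundle path and container ID are hypothetical, and the runc command is only constructed, not executed.

```python
import json
from pathlib import Path


def minimal_oci_config(args, rootfs="rootfs"):
    """Return a bare-bones OCI runtime config as a dict (illustrative only)."""
    return {
        "ociVersion": "1.0.2",
        "process": {
            "user": {"uid": 0, "gid": 0},
            "args": list(args),
            "cwd": "/",
            "env": ["PATH=/usr/sbin:/usr/bin:/sbin:/bin"],
        },
        "root": {"path": rootfs, "readonly": True},
        "linux": {
            "namespaces": [
                {"type": t} for t in ("pid", "mount", "ipc", "uts", "network")
            ]
        },
    }


def runc_command(bundle_dir, container_id):
    """Build the argv that hands a bundle directory to runc."""
    return ["runc", "run", "--bundle", str(bundle_dir), container_id]


def write_bundle(bundle_dir, args):
    """Write config.json next to an (empty) rootfs directory.

    The rootfs still has to be populated from an image before runc
    could actually start anything from this bundle.
    """
    bundle = Path(bundle_dir)
    (bundle / "rootfs").mkdir(parents=True, exist_ok=True)
    (bundle / "config.json").write_text(
        json.dumps(minimal_oci_config(args), indent=2)
    )
    return bundle


if __name__ == "__main__":
    print(json.dumps(minimal_oci_config(["/bin/sh"]), indent=2))
    print(" ".join(runc_command("/tmp/demo-bundle", "demo")))
```

The security-relevant work is in everything this sketch leaves out, such as how image layers and paths get resolved before the config is written.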

This is why I argued for several years that we should add OCI runtime (and custom storage driver) support to rkt instead of having to redesign everything (and since we started with cri-o instead of libpod, getting rid of the daemon was a pain there too). But of course, like every other discussion I've had with Dan Walsh, it was brushed away. Whatever.

I do really like the folks behind the project. I just wish we'd spent our collective energy on improving something that already existed instead of repeating mistakes pointlessly. I'm definitely not a fan of Docker's politics either (and at the very least nobody from the cri-o/libpod project has sent me abusive emails calling me stupid and "brainwashed by Red Hat" for criticizing their project's governance model -- which Solomon Hykes did in the past when he was still the CTO of Docker).

Disclaimer: I work for SUSE and maintain runc, and have worked on containers for a depressingly long period of time. Obviously the above are my views not those of my employer (who ships both Docker and cri-o, and my team maintains it in our products). I'm just tired of all the fucking drama.

> I'm just tired of all the fucking drama.

I wasn't aware there was any, my perspective comes from working from a different end (dockerless builds).

I know what it can be like to work in communities fraught with vendor politics and other troublesome dynamics and have come close to burning out a few times.

It always feels too important to walk away from, even temporarily, but I promise, it's not true. Your own health is important.

Amen. There is a heap of mess from the land-grab war.

We are all on the same team, just want to make this stuff better, but people still seem intent on fighting.

For the sake of balance -- I do want to clarify that there is definitely land-grabbing happening on both sides of the aisle here. cri-containerd is a good example of the "Docker side" trying to land-grab cri-o's niche.

But again, I don't like all of this stupid politics over such trivial crap. It's ridiculously draining that I have to deal with people from the Docker project demanding that I apologise for things that Dan Walsh has said (or playing dumb when someone else from SUSE makes a snide comment about several-year-old PRs that have burned out several of our engineers -- and then asking me to try to force them to apologise for it), as well as having to deal with the issues I outlined above. All of this back-and-forth has no benefit to anyone involved (or outside) and is just a waste of our collective lives. Posting this publicly probably won't help either, but I really don't know what the solution is other than to just quit and work on something else where we aren't just collectively accelerating human entropy.

To be frank, this is the main reason I've been working more with the LXC folks in recent years -- there are more interesting problems there and I don't have to deal with this crap. They also have really brilliant engineering, but that's not the main thing that attracted me to working with them.

Docker's runtime has long since been spun out into containerd (though there is more work to do here). The builder is in buildkit, which sits on top of containerd.

Would be nice to decouple networking, this will take some work.

Docker is more and more becoming an API that sits on top of a bunch of other services. It does take time to make this happen without breaking compatibility, though.

Docker can also run without root as of Docker 19.03. Even so, "docker requires too many privileges" is marketing speak. Setting up cgroups requires root, setting up mounts (w/o fuse) requires root, setting up network devices requires root, etc... anyone who wants to do these things requires root. Rootless mode on all this tech attempts to work around such limitations, but each workaround comes with trade-offs (slower networking, no cgroup support, slower fs access...).

Definitely agree, though if you want to run services through systemd Docker is not well designed for that purpose.

> Docker is more and more becoming an API that sits on top of a bunch of other services.

Which is a good thing. With Cloud Foundry we moved from using a pre-Docker container engine to using runc as soon as it was available; containerd is the next move.

> Even so, "docker requires too many privileges" is marketing speak.

I don't agree. The API surface still exists and includes too many disparate purposes. The modularisation of Docker is improving that risk profile, but it still exists. Fully segregating the API and the modules is worthy.

Modularization reduces maintenance overhead; it does not reduce privilege.
I understand that. My point is that it reduces the blast radius of any one part being compromised.
Agreed. One other point is that containerd isn't a Kubernetes-specific runtime. Why cri-o chose to lock in like that doesn't seem to make long-term sense except with regard to ease of development.
The long-term goal is to switch users to use podman (and buildah I guess), with cri-o being a daemon that uses the same underlying code. The only reason a daemon is required in the first place is because the CRI uses gRPC.

This is loosely similar to how rkt operated, with rktnetes just being a front-end for rkt (though at the time, the CRI didn't exist so it was all hardcoded in Kubernetes).

The goal isn't to have a differently-named daemon required in order to do any operation, and podman enables that (though again, so did rkt).

Thanks for your insight! I guess I don't understand why having multiple runtimes helps. If CRI is just an agnostic interface to the tooling what is the advantage of cri-o over containerd? Why make the ecosystem harder when it's all a standard runtime interface anyway? Couldn't Docker tooling, podman and buildah all talk directly to any runtime?
You don't have to use Docker in prod. Keep in mind the most popular runtime is also contributed by Docker: containerd.
It has to do also with the use of user namespaces (LXC also does this). User namespaces (user/group id mapping) + userspace file systems (FUSE) is what enables building & running containers without root.
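A quick sketch of the ID-mapping half of that, assuming the three-column layout of `/proc/<pid>/uid_map` (ID inside the namespace, ID outside it, range length). The sample mapping below is made up; it is the kind of thing a rootless setup might use, not output from any particular tool.

```python
from typing import List, NamedTuple, Optional


class IdMap(NamedTuple):
    inside: int   # first ID as seen inside the user namespace
    outside: int  # first ID as seen outside the namespace
    count: int    # number of consecutive IDs covered


def parse_id_map(text: str) -> List[IdMap]:
    """Parse uid_map/gid_map style text into mapping entries."""
    return [
        IdMap(*(int(field) for field in line.split()))
        for line in text.splitlines()
        if line.strip()
    ]


def to_host_id(container_id: int, mappings: List[IdMap]) -> Optional[int]:
    """Translate an ID inside the namespace to the ID outside it.

    Returns None when the ID is not covered by any mapping entry.
    """
    for m in mappings:
        if m.inside <= container_id < m.inside + m.count:
            return m.outside + (container_id - m.inside)
    return None


if __name__ == "__main__":
    # Hypothetical mapping: root in the container is unprivileged uid 1000
    # outside, and uids 1..65536 map onto a subordinate range from 100000.
    sample = "0 1000 1\n1 100000 65536\n"
    maps = parse_id_map(sample)
    for uid in (0, 1, 33, 70000):
        print(uid, "->", to_host_id(uid, maps))
```

With a mapping like this, "root" inside the container is an ordinary unprivileged user outside it.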

Unfortunately the documentation is not really there yet[0], but that's the gist of how it's more secure outside of the general reduction-of-responsibility ways that others have mentioned.

[0]: https://github.com/containers/buildah/issues/1469#issuecomme...