by throwaway20371 1687 days ago
It's important to understand that containers are not a security device. Containers are a mechanism to separate resources used by processes. You should not assume any significant security benefits to containers, regardless of what anyone claims (even a kernel developer - maybe especially them....) because it all depends on Linux kernel security, which is pretty crap.

If you want security with containers, use Firecracker. It uses Micro VMs rather than just kernel-level restrictions, so even a Linux kernel security bug shouldn't be able to jump out to the host or other containers/Firecrackers.

7 comments

It is important to understand that as implemented by runtimes such as docker, containerd, cri-o, and podman, containers are absolutely a security device.

These things force the application to run in restricted environments where capabilities are reduced, system calls are filtered, apparmor profiles and selinux labels are applied, among other things.
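To make those restrictions concrete, here is a minimal sketch (not what any runtime does internally) that assembles a `docker run` command line with a few hardening options switched on explicitly. The function name, the image name `example/app:1.0`, and the default uid/gid are made up for illustration; the flags themselves are standard Docker CLI options.

```python
import shlex


def hardened_docker_args(image, seccomp_profile=None, uid=1000, gid=1000):
    """Build a `docker run` argv list with common hardening flags.

    Hypothetical helper for illustration only; tune the flags per workload.
    """
    args = [
        "docker", "run", "--rm",
        "--cap-drop=ALL",                    # drop every Linux capability
        "--security-opt=no-new-privileges",  # block privilege gain via setuid binaries
        "--read-only",                       # mount the root filesystem read-only
        f"--user={uid}:{gid}",               # run as a non-root user inside the container
    ]
    if seccomp_profile is not None:
        # Use a custom seccomp profile instead of the runtime default.
        args.append(f"--security-opt=seccomp={seccomp_profile}")
    args.append(image)
    return args


if __name__ == "__main__":
    # Print the command instead of executing it.
    print(shlex.join(hardened_docker_args("example/app:1.0")))
```

Many real images will need a capability or a writable path added back, so treat this as a starting point and loosen it only where the workload demands.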

These are the same sort of things that chrome, for instance, does to make it harder to do bad things from the browser. They are hardening techniques. Images are automatically checksummed on download, and for certain things even signatures are checked (more of this soon).

Just because they do not protect you from a kernel exploit (also actually not true, seccomp filters have prevented a number of kernel exploits from inside the container) does not mean they are not added security.

The important thing is to know what your attack surface is and what is an acceptable risk for the workloads you are running. And yes, it is important to understand that containers do not isolate you completely from kernel exploits.

It is also important to understand the control plane (runtime up to the orchestrator) is much more likely to introduce security issues than the container itself.

Obviously containers do add a level of security relative to uncontained processes, but there are security challenges (as I'm sure you're aware).

There are multiple independent projects involved in securing a standard orchestrated docker style container (some of the set of Linux kernel/Linux distro/runc/containerd/docker/k8s) and no obvious owner of overall security configuration and problems.

We've seen examples of this, e.g. k8s disabling Docker's seccomp filter, or more recently the difficulty in how to handle clone3 and seccomp filters.

For me, the difference in comparison with dedicated security sandboxes is that in those projects there's a single team handling the whole security picture, which is likely to make things easier to manage.

> These things force the application to run in restricted environments where capabilities are reduced, system calls are filtered, apparmor profiles and selinux labels are applied, among other things

This characterisation is too charitable without qualifications. It depends a lot on the container runtime and configuration. Out of the box with Docker you don't get AppArmor or SELinux, and apps run as root where they normally wouldn't (because uid remapping, aka user namespaces, is disabled by default, and people rarely bother to set up non-root users in containers).

As a bonus, applying security updates is left as an exercise to you, meaning it often doesn't happen.

Also, complexity is the enemy of security. Containers are yet another added layer that you have to juggle in your head when trying to make sense of the whole system you're building and operating. If used wisely and with good understanding, monitoring and processes, they can be a net positive despite this, but not necessarily so.

Out of the box apparmor is enabled by default when the system has apparmor available.

I don't recall why we don't turn on selinux when it is available.

I mean this is basically saying that locks aren't security devices because you have to put them in your door and use them. I guess there's a point there but we are talking about professional devs and ops people. If they can't bother to pass --selinux-enabled they aren't gonna use Firecracker either.

And Docker at least has an update delivery story (same as microVMs) compared to traditional ops, where there are patching cycles and anything that can't be yum/apt updated is basically ignored and updated on a quarterly or yearly schedule. Build a base image in your CI pipeline, update it every night, have your app build against it, smoke-test deploys, and largely forget that patching exists.

“If they can't bother[ed] to…”

They can’t. Not one developer I have worked with in the last 10 years has lifted a finger in the name of security.

This is why managing containers is a full time job by itself, a specialised discipline.

If you can’t afford an FTE to manage containers you can’t afford containers.

So is managing VMs, and devs don't do that either. Every dev I've ever known (and most ops people tbh) has just unthinkingly turned off SELinux the moment it gets in their way. If you're painting an arbitrary distinction between "containers are fully owned by devs who by assumption don't care about security" and "VMs are fully owned by ops who care about security" then you're doing it wrong. It's like ... the whole point of devops mannn.
It's also important to remember that security is not merely confidentiality. Kubernetes and docker both assist in availability and integrity through redundancy and (at least for most of its history) the ability to run code by content-addressed cryptographic hashes. And validate other signatures etc.
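As a rough illustration of the content-addressing idea: an image layer or manifest is identified by the hash of its bytes, so whoever pulls it can recompute the hash and reject anything that does not match. The sketch below uses only Python's standard library; the helper names are made up, and real registries and runtimes do considerably more (manifests, layer chains, signatures).

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return a digest string in the familiar `sha256:<hex>` form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_blob(data: bytes, expected_digest: str) -> bool:
    """Check that the bytes received match the digest that was requested."""
    # compare_digest gives a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_digest(data), expected_digest)


if __name__ == "__main__":
    blob = b"pretend this is an image layer"
    pinned = sha256_digest(blob)
    print(pinned)
    print(verify_blob(blob, pinned))                 # unmodified content
    print(verify_blob(blob + b" tampered", pinned))  # modified content
```

Pulling by digest (`image@sha256:...`) instead of by a mutable tag is what makes this check meaningful in practice.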

There are a ton of security mechanisms that are enabled by the ecosystem itself, even if it does introduce new complexities and does have certain weaknesses against full hardware virtualization. It also has significant and meaningful security strengths (namely in availability via lower resource usage) against exclusively using hardware virtualization.

Agreed. If you then, like some of the huge companies, spin up containers per user/customer, they work more or less as sandboxes, giving huge security benefits.
> You should not assume any significant security benefits to containers, regardless of what anyone claims (even a kernel developer - maybe especially them....) because it all depends on Linux kernel security, which is pretty crap.

Completely disagree. How much experience/exposure do you have to kernel security that you say is crap?

> It uses Micro VMs rather than just kernel-level restrictions, so even a Linux kernel security bug shouldn't be able to jump out to the host or other containers/Firecrackers.

You realize that firecracker uses KVM, which is part of the "crap" kernel that you don't trust? A "Linux kernel security bug" could absolutely allow a Firecracker VM to jump to the host or other containers/Firecrackers.

The attack surface of a micro VM is tiny compared with that of a full Linux kernel. That's the issue.
I don't disagree with you, but that's a very different thing than what GP said. Comparing attack surface is a very different thing than saying that containers don't give you any practical security over a non-containerized process (my paraphrase of OP, subject to misinterpretation). The former (comparing attack surface) is a useful exercise in a high-security environment. The latter is simply a ridiculous thing to say.
Trying to secure a container via non-VM means is a painful slog. You can pretend containers give you security, and then one of the hundreds of different attack vectors provides a breakout. It's been demonstrated time and again, largely because Linux security is just shit and always has been.
I did it this morning and it wasn't a painful slog, because I don't have to start from scratch with just docker every time. I can reuse work done by others, and there are numerous tools that assist (OpenShift, selinux, seccomp for example). Your example of firecracker is the same thing. It is a tool wrapped around a lower-level implementation (KVM) that covers the primitives so they are easier/faster to use.

If your complaint is that container implementations leave the hardening scope to other tools, then sure, but I would argue that's just philosophy difference between the unix approach of do one thing and do it well, and chain tools together to solve problems, and the approach of one program to rule them all.

It's not a philosophical difference, it's just complexity. More complex systems are more prone to failure. If the security system is more complex to set up, it's more likely to fail. More code means more bugs, and more domain-specific knowledge leads to more potential for user error. So if you have 'one program to secure it all', it's almost guaranteed to be better than having to use many programs all in the right way. And it isn't even a defense-in-depth issue because all those layers added to container security are really just to avoid the much larger attack surface; getting rid of attack surfaces reduces what you need to defend.
I don't think there's ever been a year without a half dozen privesc holes in the Linux kernel. Linus is also belligerently anti-security because he thinks it always results in worse user outcomes. And containers were never created with security as a top priority, they're just an amalgamation of resource abstractions, so of course it works as well as anything else not designed with security in mind.

The hypervisor isolates guest kernel bugs from the host by nature of strictly controlling resource use from the lowest level. There are of course hypervisor bugs that allow breakouts, but they are a couple orders of magnitude rarer than the typical Linux privesc bug.

I recently wrote a white paper on host level container security for a security oriented product.

Say what you want about kernel security, but most side channel attacks are statistically using syscalls which are highly unusual for a regular, hosted application to make. Using a combination of SecComp and LSM apps (SELinux/AppArmor) you can defend against most of these attacks.
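For anyone who hasn't seen one, a seccomp allowlist in the JSON profile format Docker accepts is roughly: deny by default, then list the syscalls the app is known to make. This is a minimal sketch, assuming an allowlist approach; the helper name is made up and the syscall list in the demo is illustrative and far too small for a real application.

```python
import json


def allowlist_profile(allowed_syscalls):
    """Build a deny-by-default seccomp profile dict in Docker's JSON format."""
    return {
        # Any syscall not listed below fails with an error instead of running.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


if __name__ == "__main__":
    # Illustrative list only; profile a real workload to find what it needs.
    profile = allowlist_profile(["read", "write", "exit_group", "futex"])
    print(json.dumps(profile, indent=2))
```

The resulting file would be passed with `--security-opt seccomp=<file>`. In practice most people start from the runtime's default profile and remove syscalls, since building a complete allowlist from scratch is tedious.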

You are right that containers alone are not sandboxes, but namespacing does help in terms of isolation. If you want to sacrifice performance for some further security you can use an application kernel, which further defends the host kernel. Additionally, you can try to sandbox with host and node level isolation for services, which refines the application syscall profile to be very consistent and predictable. Then if something unusual occurs on the host (like writing to disk) then you can take definitive action like shutting the host down. Those are some of the principles Bottlerocket was built on, among others.

A link to the white paper would be appreciated.
It still has a ways to go before it's disseminated externally, but I'll reply here when it does.
I’ve seen customers assume that because they’ve heard that Linux containers are secure, they can use containers to secure Windows applications.

Microsoft ignores security reports related to container escape vulnerabilities because they don’t consider it to be a security boundary!

Containers may not be a security boundary yet, but they do promise a level of security and even though the kernel and various runtimes may not be there yet, one day they will be.

Every time there's a container escape or privilege escalation, it's treated as serious and patched. I look forward to the day when we can confidently say that containers are in fact a security boundary.

They are most definitely a security device. They are not the only security device.
Containers, aka "just download random crap off the internet and run it as root, #YOLO" are an anti-security device.
This obviously has nothing to do with containers. You can download and run untrusted code (as root or otherwise) whether or not it’s containerized, and indeed containers at least give you some degree of isolation. What would possess someone to post such embarrassingly obvious misinformation?
What determines if code is "untrusted" or not anyway? It's fine to run postgres or redis (someone else's code! that I have certainly not audited!) on a server, but as soon as you run it in a container that's... less secure?
That is certainly the perception. Like any software, you have to make sure you're pulling from a reliable source--if you're pulling an image from `hub.docker.com/r/definitely-not-a-hacker/postgres` rather than the official postgres image, you're exposing yourself. But it's transparently ignorant to argue that this is particular to containers--one can also download a postgres ELF binary from an untrusted source.

I really think a lot of criticism of containers is absurdly low quality (e.g., criticizing containers for issues that are universal to all software)--it feels like people are really grasping at straws. One gets the distinct impression that some people have spent years or even decades perfecting bespoke, rube-goldberg-esque application runtime environments and now containers are obsoleting their value proposition. Of course, I'm very hesitant to psychoanalyze and would never argue that any individual is so motivated, but this is the impression I get in aggregate.

Surely it's a bit more complex a question than that. The traditional way of running software includes some sort of privilege management: uids, ulimits, chroots, but sometimes also things like pledge and selinux. Those things are sometimes summarized as privilege minimization.

Privilege minimization is much harder when stuffing everything in a container. I'd wager that running Chrome normally is probably safer than running it inside Docker, for example, because not all sandboxing functionality works when running inside a container.

So it would depend on what software, and what type of container.

I suspect the overwhelming majority of software shops aren't doing the diligence you describe as "traditional". For those folks, containers represent a strict improvement in security. I would be curious to learn more about which "privilege minimization" features are incompatible with containers, however.
All my containers are rootless
> not my container!

Good for you, seriously.

But that's not why people use containers. People use containers because they want to deploy random crap from the internet at the press of a button. I'd wager "rootless" is a bug, not a feature in this scenario.

I don't know how to take this comment seriously. Of course people don't want to deploy "random crap", but yes, people want to deploy software more easily--it's not clear to me why this is such an awful thing.

> I'd wager "rootless" is a bug, not a feature in this scenario.

You would be mistaken. Containers don't have any magic that makes it easier or harder to run as root. In this respect, they're just Linux processes, and an administrator can run them as root or not. And like Linux processes, the widely-understood best practice is to run them without root, and indeed many orchestrators require you to explicitly opt-in to "privileged execution".
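Since a containerized app is just a Linux process, the usual uid check applies inside it as well. A minimal sketch, assuming you want an entrypoint that refuses to start as root; the function names are made up. Note that with user namespaces euid 0 inside the container may map to an unprivileged user on the host, so this only checks the view from inside.

```python
import os


def running_as_root(euid=None):
    """Return True if the (given or current) effective uid is 0."""
    if euid is None:
        euid = os.geteuid()  # Unix-only call
    return euid == 0


def require_non_root(euid=None):
    """Raise if the process would run with euid 0."""
    if running_as_root(euid):
        raise PermissionError("refusing to run as root (euid 0)")


if __name__ == "__main__":
    require_non_root()
    print("running unprivileged, euid =", os.geteuid())
```

Orchestrator-level policy (for example a "must run as non-root" setting) is the more robust place to enforce this; an in-process guard is only a backstop.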

As a point of fact, containers have strictly more security layers than vanilla Linux processes. They are typically thought to have weaker isolation properties than VMs, which is why we (as an industry) invariably run containers (and vanilla Linux processes) inside of VMs or forego multi-tenancy altogether.

You misunderstand how it is in the wild. Much (most?) of the time Docker and docker-compose are used to package big-ball-of-legacy-mud applications and their dependencies and push them out on the unsuspecting world. Installation instructions for most web app software today are "here, download and run this docker-compose.yml".

Of course nobody vets or even looks at the mess inside the compose file, and most of this software won't even run without root privileges. (Because it hooks into various system bits and violates all sorts of isolation rules.)

People value Docker as a packaging tool; especially as a go-to tool for packaging legacy crap and software-as-a-pet systems.

Running this stuff without any sort of checking and as root is bonkers, but it is what it is.

We're kind of back in the Windows 95 era of packaging software as far as server backends go. Maybe it will change after some very serious worms and viruses hit the Docker ecosystem. (Windows changed very slowly and only after tremendous pressure from cybercrime.)

> Installation instructions for most web app software today are "here, download and run this docker-compose.yml".

Plenty of software is distributed as "copy/paste `curl ... | sh`" or "npm install ..." or "pip install ...". This is absolutely not unique to containers.

> most of this software won't even run without root privileges

I don't buy this at all. The container runtime probably needs root privileges, but individual containers rarely need privileged access. Moreover, in many (all?) cases we can use security policies to prevent root containers by default.

> Running this stuff without any sort of checking and as root is bonkers, but it is what it is.

Again, true of any software, containerized or not. For what it's worth, I'm pretty sure people are more likely to inspect a docker-compose.yml than they are to decompile an ELF binary.

> We're kind of back in the Windows 95 era of packaging software as far as server backends go. Maybe it will change after some very serious worms and viruses hit the Docker ecosystem.

We've always been in that era. The only difference is that today our systems are designed with more security in mind.

> Of course nobody vets or even looks at the mess inside the compose file, and most of this software won't even run without root privileges. (Because it hooks into various system bits and violates all sorts of isolation rules.)

I don't know about that. I recently was tasked with installing a web service, running in a container.

First thing I noticed was that the container was bundling an nginx reverse proxy, a Node.js server, a Redis database, a RabbitMQ service and a pgsql database. I noticed because after installing the whole thing I wondered why the database was working fine even when the env variables for the pgsql were wrong. Those env variables are in the README (along with other variables holding secrets, so it's not clear what is necessary and what is not from reading the documentation).

Then, while trying to proxy the thing, after an afternoon of banging my head, I remembered Docker has an impact on iptables rules. The documentation and support forums mix up nginx as the reverse proxy/frontend running inside the container and nginx running in front of it.

So, documentation was not good and I don't see how anyone downloading this docker-compose.yml file would be able to run it without digging through it but it doesn't mean a well documented shipped docker bundle with 1 process per container is hard or impossible or not desirable.

I do agree that the tendency of some projects to ship a docker-compose.yml with one service running multiple processes (databases, server, proxy, cache, etc.) is a giant PITA. I mostly see that from opensource projects with a paid enterprise version. I am pretty sure it's never the bundle they are actually running in production.

Now, when I read some `sudo docker` I think the same things that I think when I read `sudo pip install`.

This is a process problem. The wild is exactly that, a place to experiment and learn and iterate and fail. In any professional organization, all of your concerns can be addressed with the proper talent and tooling.
> But that's not why people use containers. People use containers because they want to deploy random crap from the internet at the press of a button.

No, it really isn't. At all. In fact, your comment is so out of touch it actually reads like a poor attempt at trolling.

People use containers because they offer an easy and very convenient and self-contained platform to build, deploy, configure, and run multiple instances of the same application, regardless of node or platform.

In fact, if you ever manage to get any experience with containerized applications you'll eventually notice that the bulk of them, especially microservice-based applications, are apps developed in-house.

On top of that, there is a wealth of container orchestration systems that provide support for blue-green deployments, autoscaling, system introspection, auditing, and even secrets management, not to mention networking.

Your assertion makes as much sense as claiming that people use Linux distros because they want to deploy random crap from the internet at the press of a button, just because they provide a package management system.

You're going a bit far the other way, I think. People definitely use containers because they're a nice workflow, and yes a lot of containers are shipping all in-house code. But to ignore the popularity of Docker Hub and claim that people aren't also jumping head first into containers because they make it easy to grab random unvetted binaries is a step too far.

> Your assertion makes as much sense as claiming that people use Linux distros because they want to deploy random crap from the internet at the press of a button, just because they provide a package management system.

If a major reason people were using Linux was the AUR, then that'd be a valid criticism. In practice, most distros ship a package manager that only pulls from official repos, and changing that requires jumping through hoops (even Arch with the AUR requires manual work to build an AUR package). And in practice, a lot of people are pulling random unofficial images off Docker Hub (and gcr.io and such) and just running them without review.

You could make the same analogy with github or npm or whatever unvetted repository. Nothing specific to containers here.
> You're going a bit far the other way, I think. People definitely use containers because they're a nice workflow, and yes a lot of containers are shipping all in-house code.

Containers might have provided a convenient way to package, distribute, and deploy software, but that's just a nice-to-have complementing the whole reason anyone adopts containers.

To put things in perspective, while Docker is practically a household name, does snap ring a bell to anyone? Does anyone bother at all with snap?

> But to ignore the popularity of Docker Hub and claim that people aren't also jumping head first into containers because they make it easy to grab random unvetted binaries is a step too far.

This assertion is proven false with the inception of container registry services provided by service providers such as GitLab, GitHub, and all major cloud service providers such as AWS, Google Cloud, and Azure, and even lesser services such as Alibaba cloud and IBM cloud. These services might make container images available to the public, but their main role is to allow people like you and me to push their own container images to be pulled by your container orchestration services. In fact, even if you adopt third-party services you end up either using the official images made available by each project or repackaging the software yourself to reflect your own config and deployment needs.

How do you handle access to network shares?
I've not yet, but root inside the container is the user docker is running as outside, so I assume if that user can access the share it would be ok.
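To make the uid mapping concrete: in a user namespace, `/proc/<pid>/uid_map` lists lines of `inside_start outside_start length`, and a file server sees the mapped host uid, not the one inside the container. Below is a minimal sketch of that translation. The sample values (host uid 1000, subordinate range starting at 100000) are assumptions typical of a rootless setup, not read from a real system, and the function names are made up.

```python
def parse_uid_map(text):
    """Parse uid_map-style text into (inside_start, outside_start, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(f) for f in fields)
        entries.append((inside, outside, length))
    return entries


def host_uid(container_uid, entries):
    """Translate a uid inside the namespace to the host uid, or None if unmapped."""
    for inside, outside, length in entries:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


if __name__ == "__main__":
    # Assumed example mapping for a rootless setup run by host uid 1000.
    sample = "0 1000 1\n1 100000 65536\n"
    entries = parse_uid_map(sample)
    print(host_uid(0, entries))   # container root -> the invoking host user
    print(host_uid(33, entries))  # a service uid -> somewhere in the subordinate range
```

So with a mapping like this, files written by container root show up on the host (and on a share mounted there) as owned by the invoking user, while other container uids land in the subordinate range.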

What protocol were you thinking of?

NFS or SMB. Just wondering as I've recently started exploring (lxc) containerization in my home lab and that was the big roadblock I hit with unprivileged containers. I guess the solution as you suggest is to mount in host and bind-mount in the ct, but that seems pretty unappealing for multiple reasons: it breaks the logical compartmentalization of app config in the container, gives no visibility on the server of which ct owns a connection, can't scope NFS permissions per container, etc.