Alternative idea: throw away Docker and Kata Containers and move to FreeBSD, where jails were introduced on 14 Mar 2000 (no, seriously, superior technology has existed for 19 years - stable, time-proven, working).
Jails are virtually identical technology to Linux containers from a security point of view. They've had holes before and they likely will again, and a breakout like this (seems like the root cause here is a writable file descriptor to the host binary) can absolutely compromise the host system.
The upthread recommendation was using hardware VM technology, which is a fundamentally different isolation model from what software can provide and (at least in theory) makes that kind of exploit impossible. And while there are tradeoffs with everything, for you to throw that argument out due to personal platform loyalty is really, really bad advice.
My understanding is that jails were designed as a security boundary from the get go, unlike containers. Wouldn't that result in code that's less likely to be exploitable?
FWIW, "containers" aren't a thing. Namespaces, cgroups et al. certainly were designed with security in mind, as were Docker and runc.
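To make the "containers aren't a thing" point concrete: on Linux, what gets called a container is just a process whose namespace memberships differ from the host's. A minimal sketch, assuming a Linux system with procfs mounted at /proc; the helper name `list_namespaces` is made up for illustration:

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_type: identifier} for a process.

    Reads the symlinks under /proc/<pid>/ns (Linux only). Returns an
    empty dict if that directory does not exist, e.g. on a non-Linux
    system or for a PID that is not running.
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if not os.path.isdir(ns_dir):
        return result
    for name in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symlink whose target names the namespace
            # type and its inode number.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Skip entries we are not permitted to read.
            continue
    return result


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name}: {ident}")
```

Two processes that print the same identifier for a given entry share that namespace; a containerized process will typically show different identifiers from a host shell for most of them.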
Look, this isn't about whether jails are secure containers or not. I'm sure they're great. It's that responding to "if you want more isolation, try hardware virtualization" with "FreeBSD is just better because 19 years!" is not really engaging with the argument as framed.
Sure, but jails have had 19 years of time-proofing. Every product has vulnerabilities, and they get weeded out as time passes. And Kata and Docker, in the context of what they are used for, are bleeding edge.
(From a technical perspective, you would have been running jails for years too - so much for platform loyalty.)
Vulnerabilities don't get weeded out by time like radioisotopes decaying. Vulnerabilities get weeded out by attention, and attention happens when people use a system in production to protect a high-value target.
Jails haven't been used to protect as many high-value targets as Linux containers have. This is not a comment on the technical quality of jails. It may well be a comment on the world's anti-FreeBSD prejudice. But either way it's still true, and that means the 19 years of existence didn't magically harden the product.
> Jails haven't been used to protect as many high-value targets as Linux containers have
This is not true in my experience at all. It may be true that it hasn't been in use at startups until Docker came out, but a few large, established companies I've worked at absolutely used Jails or Zones to protect their most valuable IP. And have been for a long time.
What was the attack surface of the jails/zones? I don't think the distinction here is startup vs. large company but internal vs. external. We used jails at my last company as a last line of defense (and, full disclosure, I wrote about 100 lines of code to use unshare(1) etc. when that machine was our last FreeBSD box remaining in our Linux conversion), but it was on a non-internet-accessible server where the jailed network connection was routed only to a single other (much larger) business that we had an established relationship with. If attacker code were executing inside the jail, there was already a serious breach.
The distinction here is that people are running containers in the cloud and also often running untrusted code (e.g. vendor software, random exciting open-source things) inside containers, and collocating those with high-value targets in other containers. And large, established companies are doing that now just as much as startups are.
Is there a centralized jail image repository with images of jails running popular open source software applications that I can search from the command line and spin up locally or in a cluster with a few commands? Can I easily replicate and distribute an image of a jail? Because that is what Docker offers.
The major use case for Docker is really as a massively simplified package manager and an entrypoint for distributed applications. Other features, like quicker startup than VMs and system isolation, are just icing on the cake.
It took a massive marketing campaign to get people to use Docker and realize it made their life easier, so something like iocage would need the same push. (Also, nobody wants to start adopting additional OSes unless absolutely necessary)
There have been vulnerabilities in jails previously. Also, Linux gets far more attention from exploit researchers because of its wider adoption, so the number of incidents isn't a good metric. Kata has hardware isolation, so it will be safer.
If I have misunderstood jails and they are immune to kernel exploits, please do correct me.
The Linux VServer project started in 2003: http://linux-vserver.org/ChangeLog-1.2 . We used it in production more than 10 years ago at a multi-terabyte site (Bazaarvoice).
Taken from some other thread:
"Most "unix" admins only know linux and will advocate for it vigorously because it is so much better than.. "what do you use again? Fedora? Ah, FreeBSD, something with F, I knew it!""