by raesene9
2585 days ago
So (as I'm sure you know) Linux container isolation isn't just a product of namespaces, but of namespaces + capabilities + cgroups + (SELinux/AppArmor) + seccomp-bpf. Each of those layers provides some aspect of isolation, and for a Linux kernel exploit to succeed in escaping a container it needs to bypass or compromise every one of them (or, as in the case of the runc vulnerability, occur before the sandbox is fully established). So just taking the count of Linux kernel bugs as a metric doesn't really apply. That's why I gave the list I did: those are the only ones I'm aware of that can bypass all the layers of isolation in a standard Linux container. If the ground truth "containers don't contain" applies, then it appears you're saying that Linux is innately and architecturally unsuitable for multi-user/multi-process use, which seems like a fairly bold statement given its prevalence... After all, a container is just a Linux process with Linux isolation mechanisms applied to it...
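To make the layering a bit more concrete, here is a rough sketch (my own illustration, not anything taken from runc or Docker) that reads two of those layers off the text of `/proc/<pid>/status`. The `Seccomp` and `CapEff` fields are real procfs fields; the helper names `parse_status` and `summarise_layers` are made up for this example, and it only covers seccomp and capabilities, not namespaces, cgroups or the LSM.

```python
# Sketch: summarise two isolation layers from the text of /proc/<pid>/status.
# parse_status and summarise_layers are hypothetical helper names.

# Values of the "Seccomp" field as documented in proc(5).
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}

# Bit number of CAP_SYS_ADMIN in the capability masks (linux/capability.h).
CAP_SYS_ADMIN = 21


def parse_status(text):
    """Turn 'Key:\\tvalue' lines into a dict of stripped strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def summarise_layers(status_text):
    """Report the seccomp mode and whether CAP_SYS_ADMIN is effective."""
    fields = parse_status(status_text)
    cap_eff = int(fields.get("CapEff", "0"), 16)  # hex bitmask
    seccomp = int(fields.get("Seccomp", "0"))
    return {
        "seccomp": SECCOMP_MODES.get(seccomp, "unknown"),
        "has_cap_sys_admin": bool((cap_eff >> CAP_SYS_ADMIN) & 1),
    }
```

On a Linux box you could feed it `open("/proc/self/status").read()` from inside and outside a container and compare. I'd expect a container started with default settings to show a seccomp filter and no `CAP_SYS_ADMIN`, but that depends on the runtime and its configuration, so treat it as something to check rather than a guarantee.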
|
Docker has started adding hardening with SELinux + seccomp because there's a realisation that the Linux kernel bugs keep coming, but this is just a band-aid. The other problem with this approach is that, in practice, a hardened config is too restrictive for real users and carries a real maintenance cost, so most will never use one (as others in this thread have argued when explaining why the gVisor approach is superior). AppArmor is very poorly maintained, buggy, and not a practical solution.