Hacker News
by kasey_junk 1839 days ago
Do you have any links to secure container runtimes that don’t either virtualize or replace all the system calls of the container such that it might as well be virtual?
4 comments

First, saying it might as well be virtual is a bit of a misnomer. There are various options, and although they may behave like a VM, they are significantly faster than full-machine VMs like QEMU:

https://kubernetes.io/docs/concepts/policy/pod-security-poli...

> As of Kubernetes v1.19, you can use the seccompProfile field in the securityContext of Pods or containers to control use of seccomp profiles.
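
For anyone who wants to see what that field looks like in practice, here is a minimal sketch that builds a Pod manifest with a pod-level seccompProfile. The helper name `pod_with_seccomp`, the pod name, and the image are made up for illustration; only the `securityContext.seccompProfile` structure comes from the Kubernetes docs quoted above.

```python
import json

# Profile types accepted by the seccompProfile field.
VALID_TYPES = {"RuntimeDefault", "Unconfined", "Localhost"}


def pod_with_seccomp(name, image, profile_type="RuntimeDefault", localhost_profile=None):
    """Return a Pod manifest (as a dict) with a pod-level seccomp profile.

    Hypothetical helper: the name, image, and defaults are illustrative.
    """
    if profile_type not in VALID_TYPES:
        raise ValueError(f"unknown seccomp profile type: {profile_type}")

    seccomp = {"type": profile_type}
    if profile_type == "Localhost":
        # Localhost profiles must name a profile file on the node.
        if not localhost_profile:
            raise ValueError("Localhost profiles need localhost_profile")
        seccomp["localhostProfile"] = localhost_profile

    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "securityContext": {"seccompProfile": seccomp},
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(pod_with_seccomp("demo", "nginx"), indent=2))
```

RuntimeDefault defers to whatever default profile the container runtime ships, so this narrows the syscall surface without you having to maintain a profile yourself.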

If you're looking for a more general abstraction, there is gVisor and others as well.

Again, I'm not an expert, but security policies aren't immune to container breakouts, right?

Which leaves you to use something like Firecracker or gVisor, which are either virtualization solutions or the next closest thing, in that they intermediate all of your syscalls?

Almost all container breakout concerns rely on running containers as a privileged user:

https://stackoverflow.com/questions/53024790/kubernetes-supp...

There is an issue that I've been tracking, and there has been a new PR that will hopefully land soon to implement this in Kubernetes in a simplified manner:

https://github.com/kubernetes/enhancements/issues/127

As for whether security policies prevent breakouts, it really depends on what the exploit is, but they can significantly help. User namespace remapping addresses a secondary issue, though: if there is a breakout, what privileges will the attacker have?
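
To make the remapping idea concrete, here is a rough sketch of how a UID inside a user namespace translates to a UID on the host, using the three-column format of /proc/<pid>/uid_map (inside start, outside start, length). The function names and the example range starting at 100000 are assumptions for illustration, not what any particular runtime configures.

```python
def parse_uid_map(text):
    """Parse uid_map text into (inside_start, outside_start, length) tuples."""
    ranges = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(p) for p in parts)
        ranges.append((inside, outside, length))
    return ranges


def host_uid(container_uid, ranges):
    """Return the host UID for a container UID, or None if it is unmapped."""
    for inside, outside, length in ranges:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


if __name__ == "__main__":
    # Assumed example mapping: container root maps to an unprivileged host UID.
    remapped = parse_uid_map("0 100000 65536\n")
    print(host_uid(0, remapped))

    # Without remapping, container root is host root.
    identity = parse_uid_map("0 0 4294967295\n")
    print(host_uid(0, identity))
```

With a mapping like the first one, a process that escapes as "root" lands on the host as an ordinary unprivileged UID, which is the secondary protection described above.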

We can't answer that question because "secure container runtime" is not a well-defined idea. Secure from what, in what way, with what guarantees? Docker is both secure and not, depending on how you draw the lines.

Sure. I mean as secure as a traditional virtualization environment.

Singularity is likely* less secure than default container runtimes.

*Not a security person or an expert on Singularity, but it advertises that it doesn't do file system or user isolation by default.

Pod security policies, plus seccomp for syscall filtering at the OCI level.
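
As a rough illustration of what seccomp filtering looks like at that level, here is a sketch that generates an allowlist-style profile in the JSON layout used by OCI runtimes and Docker (`defaultAction`, `architectures`, `syscalls`). The helper name `allowlist_profile` and the tiny syscall list are assumptions for the example; a real workload needs a far longer list.

```python
import json


def allowlist_profile(allowed_syscalls, architectures=("SCMP_ARCH_X86_64",)):
    """Build a seccomp profile that denies everything except the listed syscalls.

    Hypothetical helper: the default architecture is an assumption.
    """
    return {
        # Any syscall not explicitly allowed fails with an errno.
        "defaultAction": "SCMP_ACT_ERRNO",
        "architectures": list(architectures),
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


if __name__ == "__main__":
    # Illustrative list only; far too small to run a real container.
    profile = allowlist_profile(["read", "write", "exit_group", "read"])
    print(json.dumps(profile, indent=2))
```

This only narrows which syscalls reach the host kernel. Unlike gVisor or Firecracker, it does not intermediate them, so a bug in an allowed syscall is still reachable.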