| There are two primary strengths that gVisor provides over the seccomp model, the second of which you've actually alluded to above.

1. Layered security

While seccomp allows users to limit the attack surface on the kernel, the application is still interacting with the kernel directly, and any single bug in an allowed system call can allow compromise. One of the design principles of gVisor is that no single bug should allow compromise of the host system or user data. By intercepting and handling all application system calls, the gVisor kernel is the first layer of defense against the application. The gVisor kernel also places itself inside a seccomp sandbox as a second layer of defense, so if the application escalates privilege into the gVisor kernel, its attack surface to the host is still limited.

The gVisor kernel's own seccomp policy [1] allows far fewer system calls than gVisor implements for applications. For example, note that "open" and friends are not allowed at all. File system access is mediated by an external agent [2] which does not trust the gVisor kernel, so even a compromised gVisor kernel has no elevated file system access.

2. Ease of use

> > Kernel features like seccomp filters can provide better isolation between the application and host kernel, but they require the user to create a predefined whitelist of system calls.

> Isn't that something you'd effectively have to do anyway if you want a sandbox?

This is something we'd like to challenge with gVisor. gVisor intends to be "secure by default" and configuration-free to the largest extent possible. gVisor runs and sandboxes arbitrary, unmodified Linux binaries. You don't need to specify a sandbox policy because gVisor safely implements the entire Linux API [3].

Building a sandbox policy can be difficult and time-consuming. It can also be a maintenance burden to keep up to date as the application changes over time, especially if you've made modifications to the application to reduce its syscall surface.
Additionally, some use cases involve sandboxing arbitrary workloads, for which a sandbox policy cannot be defined in advance. With gVisor, we hope to remove this painful step in sandboxing and enable developers to easily sandbox their workloads.

(Note: I work on gVisor.)

[1] https://github.com/google/gvisor/blob/master/runsc/boot/filt...

[2] https://github.com/google/gvisor#file-system-access

[3] Note we don't technically fully implement Linux, as work is ongoing, but missing features are simply unimplemented, not left out for security reasons. See https://github.com/google/gvisor#will-my-container-work-with... |