Thanks again for the detailed response.

> Have a look at what gVisor actually does

I am aware of what it does, though I had missed the fact that the Sentry and/or Gofer run within a user-ns (I could not find this in the docs). I had also missed that it performs procfs/sysfs emulation (makes sense), so I stand corrected on that. In light of this, I'll modify the Sysbox GH table to show gVisor as having a stronger isolation rating (in fact, our Sysbox blog comparing technologies [1] did give gVisor a stronger isolation rating).

> the sysbox approach is one kernel bug away from host system compromise

All approaches are one bug away from host system compromise (gVisor, VMs, etc.), though I agree that approaches like gVisor and VMs have a reduced attack surface.

> I've read numerous claims that sysbox is suitable for untrusted workloads

It's not a black-and-white determination in my view. Users choose based on their environments & needs. We always make it clear to our users that VM-based approaches provide stronger isolation, per the Sysbox GH repo:

"Isolation wise, it's fair to say that Sysbox containers provide stronger isolation than regular Docker containers (by virtue of using the Linux user-namespace and light-weight OS shim), but weaker isolation than VMs (by sharing the Linux kernel among containers)."

> Firecracker runs a full Linux kernel inside the VM, so it could always run regular Docker, Kubernetes or anything else

That's good to know (thanks), though the table in the Sysbox GH repo is meant to compare Sysbox against Kata + Firecracker (since Kata is a container runtime). To the best of my knowledge, running Docker, K8s, k3s, etc. inside a Kata container is not easy (see [1] and [2]).

> For containers, this used to be the case, but the situation improved in recent kernel releases.

It's correct that rootless docker/podman approaches are improving as far as what workloads they can run inside containers, although they still have several limitations [3], [4].

With Sysbox, most of these limitations don't apply because the solution works at the more basic "runc" level, Sysbox itself is rootful, and it uses some of the techniques I mentioned before (user-ns, procfs & sysfs virtualization, syscall trapping, UID-shifting, etc.) to make the container resemble a "real host" while providing good isolation.
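
For readers less familiar with the user-ns piece, here is a minimal sketch of how an ID map translates a UID seen inside a container to the UID the host kernel sees. This is not Sysbox code; it only models the three-column format of /proc/<pid>/uid_map (inside start, outside start, length). The helper names and the 165536 range are made-up examples.

```python
from typing import List, NamedTuple, Optional


class IdMapEntry(NamedTuple):
    inside_start: int   # first ID as seen inside the user namespace
    outside_start: int  # matching first ID in the parent (host) namespace
    length: int         # number of consecutive IDs covered by this entry


def parse_id_map(text: str) -> List[IdMapEntry]:
    """Parse text in the /proc/<pid>/uid_map format (hypothetical helper)."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        entries.append(IdMapEntry(*(int(f) for f in fields)))
    return entries


def to_host_id(entries: List[IdMapEntry], container_id: int) -> Optional[int]:
    """Return the host-side ID for container_id, or None if it is unmapped."""
    for entry in entries:
        offset = container_id - entry.inside_start
        if 0 <= offset < entry.length:
            return entry.outside_start + offset
    return None


# Assumed example mapping: container IDs 0..65535 map to host IDs 165536..231071.
example_map = "0 165536 65536\n"
entries = parse_id_map(example_map)

print(to_host_id(entries, 0))      # container root -> unprivileged host UID 165536
print(to_host_id(entries, 1000))   # -> 166536
print(to_host_id(entries, 70000))  # outside the mapped range -> None
```

The point of the sketch is that "root" inside the container (UID 0) corresponds to an unprivileged UID on the host, which is where much of the extra isolation over a regular Docker container comes from.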

Good discussion, please let me know of any more feedback.

[1] https://github.com/kata-containers/kata-containers/issues/20...
[2] https://github.com/daniel-noland/docker-in-kata
[3] https://docs.docker.com/engine/security/rootless/#known-limi...
[4] https://github.com/containers/podman/blob/main/rootless.md