Hacker News new | ask | show | jobs
by p12tic 492 days ago
Container security boundary can be much stronger if one wants.

One can use something like https://github.com/google/gvisor as a container runtime for podman or docker. It's a good hybrid between VMs and containers. The container is put into sort of VM via kvm, but it does not supply a kernel and talks to a fake one. This means that security boundary is almost as strong as VM, but mostly everything will work like in a normal container.

E.g. here's I can read host filesystem even though uname says weird things about the kernel container is running in:

  $ sudo podman run -it --runtime=/usr/bin/runsc_wrap -v /:/app debian:bookworm  /bin/bash
  root@7862d7c432b4:/# ls /app
  bin   home            lib32       mnt   run   tmp      vmlinuz.old
  boot  initrd.img      lib64       opt   sbin  usr
  dev   initrd.img.old  lost+found  proc  srv   var
  etc   lib             media       root  sys   vmlinuz
  root@7862d7c432b4:/# uname -a
  Linux 7862d7c432b4 4.4.0 #1 SMP Sun Jan 10 15:06:54 PST 2016 x86_64 GNU/Linux
1 comments

gVisor is solid but it comes with a perf hit. Plus, it does not work on every image
FWIW the performance loss got a lot better in ~2023 when the open source gVisor switched away from ptrace. (Google had an internal non-published faster variant from the start.)