Hacker News
by wavesquid 1536 days ago
Most containers.
1 comments

Most containers are going to block unshare() via seccomp, no?
Therein lies an interesting detail. Docker does block unshare in default configurations, using its seccomp filter.

However, in Kubernetes, Docker's seccomp filter is disabled by default. At the moment you need to re-enable it on a pod-by-pod basis. There is work to allow a default cluster-wide setting, but that isn't at GA yet.
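For anyone wondering what "re-enable it on a pod-by-pod basis" looks like in practice, here is a minimal sketch in Python that builds a pod manifest with the runtime's default seccomp profile turned on. It assumes a cluster new enough to support the `securityContext.seccompProfile` field (older clusters used an annotation instead). The pod name and image are made-up placeholders, and the helper function is hypothetical, not part of any Kubernetes client library.

```python
import json


def pod_with_runtime_default_seccomp(name: str, image: str) -> dict:
    """Return a pod manifest that opts in to the container runtime's
    default seccomp profile (hypothetical helper, for illustration only)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Pod-level setting: applies to every container in the pod
            # unless a container overrides it in its own securityContext.
            "securityContext": {"seccompProfile": {"type": "RuntimeDefault"}},
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # "demo" and "ubuntu:22.04" are placeholder values.
    manifest = pod_with_runtime_default_seccomp("demo", "ubuntu:22.04")
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied like any other manifest, since Kubernetes accepts JSON as well as YAML.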

Most containers run as root inside the container, which means they can access nftables in the container.

One of many reasons running as root inside a container is a bad idea.

Most containers would not have CAP_NET_ADMIN and not be able to access nftables.
My understanding is that containers actually can access nftables with CLONE_NEWUSER even without CAP_NET_ADMIN.

EDIT: Apparently the Docker default capabilities don't allow CLONE_NEWUSER: https://opensource.com/business/15/3/docker-security-tuning
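If you want to check this on your own setup rather than take anyone's word for it, something like the following rough sketch should tell you whether the current process is allowed to create a new user plus network namespace. It is Linux-only, calls `unshare()` through ctypes, and does so in a forked child so the calling process keeps its own namespaces. The function name is made up for illustration. A `False` result only says the call did not succeed; it does not say whether seccomp, capabilities, or a sysctl was the reason.

```python
import ctypes
import os

# Flag values from <linux/sched.h>.
CLONE_NEWUSER = 0x10000000
CLONE_NEWNET = 0x40000000


def can_unshare_user_and_net() -> bool:
    """Try unshare(CLONE_NEWUSER | CLONE_NEWNET) in a throwaway child.

    Hypothetical helper: returns True only if the child exited cleanly
    after a successful unshare(). A child killed by a seccomp filter or
    one that gets an error back counts as False.
    """
    pid = os.fork()
    if pid == 0:
        # Child: attempt the call, then exit immediately without running
        # any of the parent's cleanup handlers.
        status = 1
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.unshare(CLONE_NEWUSER | CLONE_NEWNET) == 0:
                status = 0
        except Exception:
            status = 1
        finally:
            os._exit(status)
    # Parent: collect the child's exit status.
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0


if __name__ == "__main__":
    print("unshare(CLONE_NEWUSER | CLONE_NEWNET) allowed:", can_unshare_user_and_net())
```

Running it once on the host and once inside the container makes it easy to compare the two environments.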

Except the default seccomp policy is not used for Kubernetes containers.

I hadn't really thought about this vector, where you CLONE_NEWUSER inside a container... on systems that allow unprivileged users to do this, it is definitely a problem.

root@ee375d5150bc:/# pscap -a
ppid  pid  name  command  capabilities
0     1    root  bash     chown, dac_override, fowner, fsetid, kill, setgid, setuid, setpcap, net_bind_service, net_raw, sys_chroot, mknod, audit_write, setfcap

That's ubuntu.
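For reference, the same information is available without pscap by decoding the capability bitmask the kernel reports (for example the CapEff line in /proc/1/status). Below is a small sketch that maps bit positions to names using the numbering from linux/capability.h. The mask constant is simply the 14 capabilities from the pscap output above OR-ed together; it was not read from a live container. The function and constant names are invented for this example.

```python
# Capability names indexed by bit number, per <linux/capability.h>.
# Only bits 0-31 are listed; newer kernels define a few more above these.
CAP_NAMES = [
    "chown", "dac_override", "dac_read_search", "fowner",
    "fsetid", "kill", "setgid", "setuid",
    "setpcap", "linux_immutable", "net_bind_service", "net_broadcast",
    "net_admin", "net_raw", "ipc_lock", "ipc_owner",
    "sys_module", "sys_rawio", "sys_chroot", "sys_ptrace",
    "sys_pacct", "sys_admin", "sys_boot", "sys_nice",
    "sys_resource", "sys_time", "sys_tty_config", "mknod",
    "lease", "audit_write", "audit_control", "setfcap",
]

# The 14 capabilities from the pscap output, expressed as a bitmask.
PSCAP_OUTPUT_MASK = 0xA80425FB


def decode_caps(mask: int) -> list[str]:
    """Return the names of the capabilities whose bits are set in mask."""
    return [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]


if __name__ == "__main__":
    caps = decode_caps(PSCAP_OUTPUT_MASK)
    print(", ".join(caps))
    print("net_admin present:", "net_admin" in caps)
```

With that mask, net_admin (bit 12) is not set, which matches the pscap listing.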

> Most containers run as root inside the container

Is that actually surveyed / quantified somewhere? I can't say I see that too often in professional environments and even home stuff sees a lot of standardisation around separate users (https://docs.linuxserver.io/general/understanding-puid-and-p...)

Who would have thought it!

Where are these admins who demand this configuration?