| Can you provide more context on why you feel that's true (or even possible)? For the last few years, I managed the Container Runtime group at Facebook. My experience has been: 1. `if (has_capability(..., X)) { ... }` gets put into code pretty haphazardly, in a way that's not necessarily well structured. Once it's there, it's ABI, and you're screwed if you want to iterate on it. That's why CAP_SYS_ADMIN is /almost/ root. 2. If you wanted to do the right thing from the jump (e.g. for bpf itself), you'd have to add a new capability. This is a heavy lift for something that might not actually get any traction. It requires changing a bunch of common tools, and you likely end up breaking a bunch of applications. 3. Debugging capability failures is a pain in the ass. We ended up building and deploying capability tracing infrastructure just to figure out what people are actually using. 4. For gradual rollouts of enforcement/changes, you need the flexibility to warn first, enforce second. We did large-scale monitoring of all such changes to make sure we didn't break the workloads. 5. Even if you nail all of the above, the ability to make finer-than-capability-grained decisions (e.g. binding to port 20 or 80 is okay but not port 22) is really valuable. I'm all for kernel abstractions that just work and solve all problems for all people, but I think the overwhelming trend has been towards kernel interfaces that provide a lot of flexibility, paired with more opinionated libraries/tools that kind of let us have our cake and eat it too (io_uring => liburing, bpf => libbpf, btrfs => btrfs-progs). |
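
To make points 4 and 5 concrete, here is a rough sketch of what a finer-than-capability decision with a warn-first, enforce-second rollout could look like. This is plain userspace C that only models the decision logic; it is not kernel code and not what we actually ran. Every name in it (`policy_mode`, `verdict`, `allowed_low_ports`, `check_bind`) is made up for illustration, and the 1024 cutoff assumes the default privileged-port range. In a real deployment the check would more likely live in something like a BPF program attached to a bind hook.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Rollout stage for a policy change: observe first, block later. */
enum policy_mode { MODE_WARN, MODE_ENFORCE };

enum verdict { VERDICT_ALLOW, VERDICT_WARN, VERDICT_DENY };

/* Hypothetical allowlist: the privileged ports this workload may bind. */
static const uint16_t allowed_low_ports[] = { 20, 80 };

static bool port_is_allowed(uint16_t port)
{
    /* Port 0 (kernel picks a port) and ports >= 1024 are outside this
     * policy; 1024 is assumed to be the privileged-port boundary. */
    if (port == 0 || port >= 1024)
        return true;

    for (size_t i = 0;
         i < sizeof(allowed_low_ports) / sizeof(allowed_low_ports[0]); i++) {
        if (allowed_low_ports[i] == port)
            return true;
    }
    return false;
}

/* Decide what to do with a bind() to `port` under the current mode.
 * Violations are always logged; they are only denied in MODE_ENFORCE. */
static enum verdict check_bind(uint16_t port, enum policy_mode mode)
{
    if (port_is_allowed(port))
        return VERDICT_ALLOW;

    fprintf(stderr, "policy: bind to port %u not in allowlist (%s)\n",
            (unsigned)port,
            mode == MODE_ENFORCE ? "denied" : "warn only");

    return mode == MODE_ENFORCE ? VERDICT_DENY : VERDICT_WARN;
}
```

With this, `check_bind(22, MODE_WARN)` logs the violation and returns `VERDICT_WARN`, so the workload keeps running while you collect data; flipping the same policy to `MODE_ENFORCE` returns `VERDICT_DENY` for port 22 while 20 and 80 stay allowed. A single capability bit like CAP_NET_BIND_SERVICE can't express either of those distinctions.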