by EdSchouten 1800 days ago
FreeBSD already effectively supported something like this, but in my opinion in a better way.

You can call cap_enter(), which disables open(), unlink(), mkdir(), etc. entirely. You can, however, still use openat(), unlinkat(), mkdirat() with relative paths that expand to a location underneath a directory file descriptor. This achieves the same thing, except that you can now have as many chroots as you want. Not just one.
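
A minimal sketch of that pattern, assuming a POSIX system: the program keeps a directory descriptor and resolves every path relative to it. The helper names `sandbox_enter` and `read_beneath` are hypothetical. `cap_enter()` is only called when building on FreeBSD; elsewhere the sketch merely shows the `openat()` calling convention and enforces nothing.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc: expose openat() even under strict -std= modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#ifdef __FreeBSD__
#include <sys/capsicum.h>
#endif

/* Hypothetical helper: enter capability mode where it exists.
 * On other systems this is a no-op, so nothing is enforced there. */
static int sandbox_enter(void) {
#ifdef __FreeBSD__
    return cap_enter();
#else
    return 0;
#endif
}

/* Hypothetical helper: read up to cap-1 bytes from the file named by a
 * path relative to dirfd, NUL-terminating the result.
 * Returns the number of bytes read, or -1 on error. */
static ssize_t read_beneath(int dirfd, const char *rel, char *buf, size_t cap) {
    if (cap == 0)
        return -1;
    int fd = openat(dirfd, rel, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, cap - 1);
    close(fd);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

Holding two such descriptors, say one for a configuration directory and one for a spool directory, is what gives the "as many chroots as you want" effect: each descriptor acts as its own root.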

Unfortunately, the idea never caught on, because virtually no software on UNIX uses the *at() functions. Also: the non-*at() functions are still available as symbols, meaning that you can't perform simple compile-time checks to ensure that your application works properly when this form of sandboxing is enabled. Turns out that off-the-shelf software (e.g., libraries) ends up misbehaving in unpredictable ways if you disable ~50% of the POSIX API.

It's a shame, because this feature effectively requires you to treat the file system in an object oriented/dependency injected way. Pretty good from a reusability/testability perspective.

7 comments

One of my minor disappointments with Go, considering the time it came out and the UNIX heritage that it descended from, was that it didn't prioritize the *at() functions. It's difficult, if not virtually impossible, to write secure code with the "traditional" path-based system because every time you do one thing, then some other thing to a path that has some sort of security implication, you've written a TOCTOU problem if somebody can wedge between those two things to change some critical aspect of the file.

It's hard for me to blame programmers for not using these functions more when hardly any language properly exposes them. But since nobody exposes them, nobody's aware they should use them.... chicken & egg strike again.

But openat, for example, is still path-based; it just changes the directory that the path is relative to. If you give it an absolute path, it will open it, and I didn't see any reason in the man page why you couldn't just pass in a bunch of ../../ as the usual exploits do. Maybe you're referring to another category of bugs?
Sorry, I was unclear. Too much context in my head from the times I've jousted with this and I forgot to contextualize properly. (Which is ironic since part of my complaint is precisely that too few people know this stuff.) That family of functions allows you to open things based on handles more easily. So you can open a directory, and while holding on to the handle for that directory, know that you are still in that directory, even potentially open files in that directory and then, once you do that, know that you have a file in that directory (or, atomically, don't).

It's the difference between

     dirHandle = open("some path");
     fileInDir = openat(dirHandle, "some file");
versus

     dir = open("some path")
     // examine the directory, then
     fileInDir = open("some path/some file");
In the second case, between those two lines, you can have something else jump in and modify or remove or repermission or whatever the "some file". It has never been the largest security issue, but it's been a running undercurrent of security issues for decades.

In the first case, you have atomically-safe operations; you either get the directory or don't, then either get the file handle or don't, etc, and once you have the handle nobody else can take it from you, even if they rename the file under you, etc. It means that if you are writing logic like "if the file is setuid, do this", there's no way for an external process to wedge in between the two things.

In other words, you ought to be able to not just read from a file handle, but also open relative to the handle directly, and do all those other things. Any API that operates in terms of paths is pretty much intrinsically open to TOCTOU, because any time you "check" a path vs. "use" the path, which is fairly common, you have a window of opportunity for lossage. I'm not sure I've yet seen a non-C way of doing this built into a standard library.

Also... before you jump in with some "what ifs", no, these functions don't magically make your code more secure. You still have to use them correctly, and it's still pretty easy to let path-based logic slip in accidentally even so. It doesn't make insecure code secure; it makes guaranteed-insecure code (in security-sensitive contexts; obviously a lot of the time this isn't a security issue) possible to write securely.
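
To make the "if the file is setuid, do this" case concrete, here is a hedged sketch in C, assuming a POSIX system. The helper name `open_plain_file` is hypothetical. The point is only that the check (`fstat`) runs on the descriptor that was actually opened rather than on a path that gets resolved a second time. It pins which object you are looking at; it does not stop someone from changing that object's attributes afterwards.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc: expose openat() even under strict -std= modes */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open `name` relative to dirfd, then check the
 * object behind the descriptor (fstat), not the path (stat).
 * Returns an fd for a regular, non-setuid file, or -1 with errno set. */
static int open_plain_file(int dirfd, const char *name) {
    /* O_NOFOLLOW: fail if the final path component is a symlink. */
    int fd = openat(dirfd, name, O_RDONLY | O_NOFOLLOW);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    if (!S_ISREG(st.st_mode) || (st.st_mode & S_ISUID)) {
        close(fd);
        errno = EPERM;
        return -1;
    }
    return fd; /* the object that was checked is the object that is returned */
}
```

The path-based equivalent would call stat() on a path and then open() on the same path, and the name could point at a different file by the time of the second call.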

Makes sense, but I think that you only gain safety when you are checking attributes of the directories leading to the file, but not when you are checking the file itself. For example, you said

> In the second case, between those two lines, you can have something else jump in and modify or remove or repermission or whatever the "some file".

Modifying/removing/repermissioning "some file" is still possible even with openat() if you do it between the time you open("some path") and openat("some file"). There is still a race condition there in either case if you are examining the contents of the directory (e.g. "stat"ing the file and then calling openat). You can also modify/repermission "some path" as well. The only thing openat() protects you from is removing/replacing "some path" (not "some file") and I agree that that is valuable for security purposes.

'Modifying/removing/repermissioning "some file" is still possible even with openat() if you do it between the time you open("some path") and openat("some file").'

This is part of what I was trying to head off with my parenthetical. You still have to use it correctly to do secure things. But at least it's possible. This kind of security is basically impossible with pure path-based APIs. Plus, as mentioned elsewhere, there are some additional flags you can use for even more security that you can't get out of an API that is "open(filename)", simply because that API is mathematically incapable of carrying such flags (assuming you don't start trying to encode them in the filename itself, but that way lies madness).

It's doable when you need it, something like:

    filefd, err := syscall.Openat(int(dir.Fd()), filename, os.O_RDONLY, 0)
    if err != nil { return err } // don't wrap an invalid descriptor
    file := os.NewFile(uintptr(filefd), filename) // for use with library functions
For what it's worth, Linux 5.6 introduced openat2 [1] which accepts some additional flags controlling path resolution.

For example, RESOLVE_IN_ROOT "is as though the calling process had used chroot(2) to (temporarily) modify its root directory (to the directory referred to by dirfd)".

[1] https://man7.org/linux/man-pages/man2/openat2.2.html
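
A rough sketch of calling openat2 from C, assuming Linux with kernel headers that ship `<linux/openat2.h>`. glibc has no wrapper for it as far as I know, so this goes through `syscall()`. The wrapper name `open_beneath` and the `DEMO_HAVE_OPENAT2` macro are hypothetical. It uses RESOLVE_BENEATH, a sibling of the RESOLVE_IN_ROOT flag quoted above, and falls back to ENOSYS where the header or syscall number is missing.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#if defined(__linux__) && defined(__has_include)
#  if __has_include(<linux/openat2.h>)
#    include <linux/openat2.h>   /* struct open_how, RESOLVE_* */
#    include <sys/syscall.h>     /* SYS_openat2, if libc knows it */
#    ifdef SYS_openat2
#      define DEMO_HAVE_OPENAT2 1
#    endif
#  endif
#endif

/* Hypothetical wrapper: open `path` read-only relative to dirfd and refuse
 * any resolution that would leave the tree rooted at dirfd.
 * Returns an fd, or -1 with errno set (ENOSYS if openat2 is unavailable). */
static int open_beneath(int dirfd, const char *path) {
#ifdef DEMO_HAVE_OPENAT2
    struct open_how how;
    memset(&how, 0, sizeof how);
    how.flags = O_RDONLY | O_CLOEXEC;
    how.resolve = RESOLVE_BENEATH; /* RESOLVE_IN_ROOT gives chroot-like semantics instead */
    return (int)syscall(SYS_openat2, dirfd, path, &how, sizeof how);
#else
    (void)dirfd;
    (void)path;
    errno = ENOSYS;
    return -1;
#endif
}
```

With RESOLVE_BENEATH, an absolute path or a ".." that would climb above dirfd makes the call fail rather than silently resolving elsewhere.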

He was - TOCTOU has its own wiki page [1]. These can be nastier, because they don't require the attacker to be able to submit strings or file names.

[1] https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use

I guess I'm not sure how you would use open() that would expose a TOCTOU bug that openat() wouldn't. Can you give an example?
I was unclear. See my other cousin reply; you can't use it yourself to have a directory handle and securely open files in that directory. You can only open things by path.
I’m confused. How would using *at() APIs prevent race conditions?
FWIW (and iirc), with programs using recent-ish glibc, you will never see a call to open() in the wild unless the program takes special care to bypass the implicit libc wrapper. glibc will transparently convert these calls to openat() under its own hood. I do notice that this probably doesn't do you any good on FreeBSD, though :)
This is mostly true on FreeBSD as well. The real problem is that capability mode also disallows openat(AT_FDCWD) - there has to be an explicit directory descriptor.
Mildly off-topic note, the parent is the author of CloudABI (https://github.com/NuxiNL/cloudlibc), which was (in my opinion) a truly brilliant approach to running untrusted code in a FreeBSD system.
Capabilities mode is useful, but it's very difficult to apply to programs that don't fit the model.

If you need to make network connections, you have to do that before entering capabilities mode, because there is no capability to allow it later. You can work through a proxy program, but adding that complexity doesn't seem worthwhile to me unless your program to be sandboxed is very complex.

I haven't worked with OpenBSD's pledge, but the idea of being able to end use of specific dangerous things seems more widely applicable.

> You can work through a proxy program, but adding that complexity doesn't seem worthwhile to me unless your program to be sandboxed is very complex.

I would love it if all network connections of all programs were created through a proxy. It would allow me to do load balancing, firewalling, tunneling, packet capturing, etc. etc. etc. entirely in userspace, without needing to rely on administrative features like pf/iptables, tun/tap, bpf, etc.

You see that in Kubernetes land folks are trying to achieve the same thing by using so-called service meshes (e.g., https://istio.io ). Right now those systems launch a proxy next to every container. For projects like these, it would have been so much easier if UNIX-like systems already had a standard for making the network stack used by a program injectable.

That's an interesting thought, but you'd probably end up with many different (captive) proxy programs that enabled the different types of sockets their clients needed, so it likely wouldn't be any easier than say LD_PRELOADing all the libc socket calls, or one of the tap/tun things and/or some sort of network namespace.
The problem is that many libraries need access to configuration files or other stuff that comes with the library.

So if you start with a system that has some form of persistent objects, then very quickly a root namespace object is created to solve those library issues.

And then you are mostly back to a Unix root directory.

cap_enter can be invoked after library initialization. Libraries can open the files and directories they need during initialization.

A single jailed root is where you end up when you take the route of putting software into sandboxes for which it wasn't designed, because now you need to emulate a traditional environment.

pledge and unveil are a middle ground, albeit closer to Capsicum, in that they're much more accommodating of existing software patterns. But they do still require application refactoring. OpenBSD has refactored their entire userland codebase this way. That typically involves identifying the necessary resources a program needs and either shifting their acquisition to before privilege dropping (i.e. early in main), or arranging so that they're subsequently accessible (e.g. using unveil).

It's a shame Linux never merged the Capsicum patches. While pledge and unveil are more convenient from a developer perspective, they can't easily be adopted in a standardized way by other operating systems, like Linux. Capsicum was the closest thing we could have gotten to a standardized sandboxing model in the POSIX universe. If it became widely available (cough Linux), I believe a large chunk of software, especially critical network-facing software, would slowly migrate; and an ecosystem of idioms, patterns, and libraries would evolve to increasingly smooth the transition.

What's doubly shameful is that Capsicum is architecturally extremely simple. In principle it would be easy for any POSIX system to adopt. The APIs are trivial, and Linux is already nearly there now that it has process descriptors and an openat that can prevent parent directory traversal. Most of the legwork is in blocking access, after cap_enter has been invoked, to non-standard interfaces and syscalls that expose resources.

You would need to standardize passing of current root as a file handle, I think? Probably will break some software...
Why not treat open(path) as openat(AT_FDCWD,path)?
Because cap_enter() blocks that too.
Specifically, it blocks going higher than the handle, so using either absolute paths or paths with a ".." component.

Not sure if anything changes for symlinks.
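
For illustration only, here is the rule as stated above (no absolute paths, no ".." components) written as a userland pre-check in C. The function name `path_stays_beneath` is hypothetical, and this is not how the kernel implements it. The check is purely lexical, so it says nothing about symlinks and is no substitute for kernel enforcement.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical pre-check: returns true only for a non-empty relative path
 * with no ".." component. Purely lexical; symlinks are not considered. */
static bool path_stays_beneath(const char *path) {
    if (path[0] == '\0' || path[0] == '/')
        return false;
    const char *p = path;
    while (*p != '\0') {
        size_t len = strcspn(p, "/"); /* length of the current component */
        if (len == 2 && p[0] == '.' && p[1] == '.')
            return false;
        p += len;
        while (*p == '/') /* skip one or more separators */
            p++;
    }
    return true;
}
```

A check like this can give a friendlier error message before the syscall, but a symlink inside the directory can still point elsewhere, which is why the real enforcement has to happen during path resolution.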