by yjftsjthsd-h 1616 days ago
> We'll never have any of the things it really promised until we give up on POSIX, tbh.

What about POSIX is in conflict with Plan 9? I would have called Plan 9 a subset of POSIX

4 comments

I would call it maybe vaguely similar, but it is by no means a subset. It has some features in common and many that are complete nonsense in a POSIX context.

The most pressing problem for implementing plan9-like semantics in a POSIX system is the permission system. In particular, setuid as a mechanism for privilege escalation. This is a big part of why users can't make their own namespaces on linux without help/intervention from root-owned processes (like dockerd or systemd).

Think about it: if you can make the file namespace any shape you want, and then run `sudo`, which is a setuid process that looks at /etc/sudoers to decide whether your escalation is allowed, how do you secure it?

How do you even begin to do distributed permissions if everything's looking at /etc/passwd and /etc/group in the current process' namespace to decide who you are?

POSIX is very much built on the idea of a canonical view of the filesystem, and plan9 is built on a vfs that may as well be sand.

Capabilities were supposed to remove the need for a single root account that could do everything. I'm not sure how far that has gone.

https://tbhaxor.com/understanding-linux-capabilities/

https://blog.container-solutions.com/linux-capabilities-in-p...

They're a good step but they're really a step in a different direction, even though capabilities are at the heart of how plan9 does permissions as well. Plan9 capabilities are more like kerberos tokens, so you get them from privileged services and then can use them to perform privileged actions.

Linux capabilities don't really change any of the issues around namespace security because they don't inherently provide a way to elevate privileges without setuid.

In modern Linux this is no problem. You can now give a process its own UID namespace. In the calling namespace its UID is non-zero, but in its own namespace it's root.

I dunno about the sibling comment's "backwards" comment but the thing here is that the goal isn't to prevent a user process from obtaining privileges, which is what uid namespaces are for.

This is the kind of thing I mean about "not knowing what you don't know," because you're looking at namespaces through the lens linux does, which is that they exist to limit capabilities.

Plan 9 uses namespaces to allow users to control their own environment. It's not a special operation, it's just a thing you do all the time.

For a small practical example: there's no PATH environment variable in plan9. You just union mount things into /bin, and /bin is where your shell looks for things to run. It's that much of an every day operation.
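To make the union-mount idea concrete, here is a toy Python model of a union directory. It is only a sketch: `UnionDir`, `bind_before`, and `bind_after` are made-up names that loosely mirror what `bind -b` and `bind -a` do, and the file contents are placeholders.

```python
# Toy model of a Plan 9-style union directory. Not real Plan 9 code;
# all names here are hypothetical and exist only for illustration.
class UnionDir:
    """An ordered stack of directories that is searched front to back."""

    def __init__(self):
        self.members = []  # each member is a dict: file name -> contents

    def bind_before(self, directory):
        # Roughly `bind -b`: the new directory is searched first.
        self.members.insert(0, directory)

    def bind_after(self, directory):
        # Roughly `bind -a`: the new directory is searched last.
        self.members.append(directory)

    def lookup(self, name):
        # The first member that contains the name wins.
        for directory in self.members:
            if name in directory:
                return directory[name]
        raise FileNotFoundError(name)


system_bin = {"ls": "system ls", "cat": "system cat"}
home_bin = {"ls": "my patched ls", "deploy": "my deploy script"}

bin_dir = UnionDir()
bin_dir.bind_after(system_bin)
bin_dir.bind_before(home_bin)  # personal tools shadow the system ones

print(bin_dir.lookup("ls"))
print(bin_dir.lookup("cat"))
```

The shell only ever looks in one place, and the ordering of the union does the job that the order of entries in PATH does elsewhere.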

If you put a user under a uid namespace in linux, and then give them the right to create their own filesystem namespaces then sure, you've enabled them to potentially do things like this. But you've also blocked them from escalating their privileges, because now they can't use setuid binaries to obtain "real root" or whatever.

So you're left with one or the other: either you can manage your own namespace, but you have to be protected from potentially breaching root security through a setuid or cap flag on a binary; or you have to be prevented from managing your namespace outright in order to avoid lying to sudo about who can do what.
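The uid-namespace half of that trade-off can be sketched in Python. Linux exposes the mapping as lines of `inside outside length` in `/proc/<pid>/uid_map`; the helper name `to_outer_uid` and the uid 1000 below are assumptions for illustration.

```python
# Sketch: translate a uid seen inside a user namespace to the uid the
# parent namespace sees. The map text follows the uid_map line format
# "inside_start outside_start length"; the values are illustrative.
def to_outer_uid(inner_uid, uid_map_text):
    for line in uid_map_text.strip().splitlines():
        inside, outside, length = (int(field) for field in line.split())
        if inside <= inner_uid < inside + length:
            return outside + (inner_uid - inside)
    return None  # this uid has no mapping in the parent namespace


# A typical unprivileged setup: "root" inside is plain uid 1000 outside.
example_map = "0 1000 1\n"

print(to_outer_uid(0, example_map))
print(to_outer_uid(1, example_map))
```

Being uid 0 inside the namespace is only a relabelling of the outer uid, which is why it grants nothing on the "real" system.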

The thing about `sudo` is confusing because you don't even need to modify the namespace to overwrite `/etc/sudoers` if you have root. You just need to write to the file.

What it sounds like plan9 is doing is giving a local view of the root that local processes see. Which Linux can do too. Not with the same use-cases in mind as Plan9 though, as such capabilities were added for sandboxing/containerization. But the mechanisms are probably(?) general enough to do Plan9 in Linux.

A `sudo` that is seeing a local view of the root is going to have privileged access to that local root, not to the global system root. And that is correct. That is what sudo does. It gives root access to the same root that contains /etc, not to any "outer" or "more global" root.

It doesn't mean you can't have any access to the global root from the local root though. There are many ways to arrange such privilege escalation. (They do have to be arranged, of course, by someone writing the userspace code -- like sudo had to be written.)

>If you put a user under a uid namespace in linux, and then give them the right to create their own filesystem namespaces then sure, you've enabled them to potentially do things like this. But you've also blocked them from escalating their privileges, because now they can't use setuid binaries to obtain "real root" or whatever.

Privileged processes can have a global view of the namespace while the user does not. An ordinary setuid binary on a filesystem the user controls can't get a global view, only because the user does not (should not) have authority to do that. A process with the global view and root can grant the authority though.

The important thing, it seems to me, is that the global outer namespace can grant to the process local namespace any capabilities available through the outer namespace. I'm not sure if this is 100% completed but the ongoing containerization efforts do involve reaching toward that 100% mark.

> The thing about `sudo` is confusing because you don't even need to modify the namespace to overwrite `/etc/sudoers` if you have root. You just need to write to the file.

The point is that you DON'T have root, and you DON'T have access to write to the file. But you're free to rearrange your namespace WITHOUT having root, and you want to arrange for SOME users to escalate privileges and, say, debug the kernel. Or do something else dangerous that requires elevated privileges in the global context.

> A `sudo` that is seeing a local view of the root is going to have privileged access to that local root, not to the global system root. And that is correct. That is what sudo does. It gives root access to the same root that contains /etc, not to any "outer" or "more global" root.

Yes, and that's a concise description of the flaw: you can't use suid+file based privilege escalation to modify system-wide configuration without restricting the ability to freely manage your namespace.

This is what unix does by design, and why the authentication design from unix isn't going to work for systems used in the style of plan 9.

You can use suid programs to do what they do. You can use other mechanisms to do other things.

Suid programs on Plan9 could not possibly behave any differently. If the user rebinds `/bin` and then runs a suid program that calls other programs, that suid program cannot use the rebound `/bin`. That kind of binding simply can't be allowed to cross security contexts.

> But the mechanisms are probably(?) general enough to do Plan9 in Linux.

They are not. The whole point of this subthread is that the ability to create namespaces as an unprivileged user[1] would be key to actually 'doing plan9 in linux'. You're free not to believe that, but if so I'd suggest you read up a bit more on plan9, because it becomes obvious pretty quickly that it's the case.

[1] And here by 'unprivileged user' I mean someone who is still a user of the machine, and not a user who has been containered away into a separate user namespace, let alone into a whole docker-style container.

You understood the problem backwards. How do you give out actual, real, global root, without taking away the ability to do arbitrary namespaces?

I feel like plan9 nerds (of which I’m one) are missing the point. People are asking “what’s so special about plan9?” and the best response is some esoteric point about a thing people empirically don’t want to do? Who cares?

There has to be a better answer than “Wow, I can make my computer that can’t run anything people want to run secure in a hypothetical hierarchical organization structure of permissions that can each have their own subtree sudo. I even call it treedo, ha-ha!” .. it just doesn’t resonate.

I dunno I think the positives are all actually pretty practical. Probably even more so today where heterogeneous computing is so much more common than it was in the 90s.

I would frequently love to have the ability to just mount a bunch of cpus off a beefier machine onto my laptop and take advantage of that to speed up my builds. I can use distcc but holy hell is it a lot more complicated to set up.

Or like, mounting a zip file as a directory, without needing a whole enormous systemd or gnome hairball along with fuse or gvfs to make it happen as a regular user. Or hell, mount a usb stick even!

These are literally things I wish were easier every day as a software developer. The vaguely plan9-shaped bits that have been added to linux over the years have brought me no closer to them.

Fuse is how it works on Linux. Fuse is the mechanism the kernel provides to do this. So the proviso "without needing ... fuse ... to make it happen as a regular user" is terminal.

The problem here is boring and practical: how do you make `bind` (or, in Linux land, `mount --bind`) secure and still allow you to authenticate as a separate user, elevate permissions to change kernel config, and so on?

The authentication mechanisms on most Linuxes are based around suid binaries that read configuration files in order to decide on what to do, so if you can bind in a namespace, you can fool the authentication mechanisms.
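A toy Python model of that failure mode, as a sketch only: `global_files`, `read_file`, and `naive_sudo` are hypothetical names, the user names are made up, and a "namespace" here is just a dict of bind targets to sources. Real sudo and real mount semantics are far more involved.

```python
# Toy model: a privileged checker that trusts a path resolved through
# the *caller's* namespace. All names and data are hypothetical.
global_files = {
    "/etc/sudoers": "alice",
    "/home/mallory/fake_sudoers": "mallory",
}


def read_file(namespace, path):
    # A namespace is modelled as {bind target: bind source}.
    return global_files[namespace.get(path, path)]


def naive_sudo(namespace, user):
    # Decides on escalation by reading a config file by name.
    return user in read_file(namespace, "/etc/sudoers").split()


plain_namespace = {}
rebound_namespace = {"/etc/sudoers": "/home/mallory/fake_sudoers"}

print(naive_sudo(plain_namespace, "mallory"))
print(naive_sudo(rebound_namespace, "mallory"))
```

The checker's code never changed; only the meaning of the path did, which is why a file-based suid design and free unprivileged binding cannot coexist as-is.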

In plan 9, this is solved with the kernel capability device. It's not particularly exciting, it's just one of the things that need to happen when you remove the concept of a global 'root user' from the system.
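Roughly, the idea is that a trusted service registers the hash of a one-time secret with the kernel, hands the secret to a process, and the process redeems it to change identity. The Python below is a loose toy model of that flow, not the real interface: `ToyCapDevice`, `issue_capability`, the `from@to@key` string layout, the use of HMAC-SHA1, and the user names are all assumptions made for illustration.

```python
# Toy model of a token-style capability device. Everything here is an
# illustrative assumption, not the actual Plan 9 kernel interface.
import hashlib
import hmac
import secrets


def _digest(from_user, to_user, key):
    message = f"{from_user}@{to_user}".encode()
    return hmac.new(key.encode(), message, hashlib.sha1).digest()


class ToyCapDevice:
    def __init__(self):
        self._hashes = set()

    def caphash(self, digest):
        # Only the trusted auth service is meant to register hashes.
        self._hashes.add(digest)

    def capuse(self, capability):
        # Any process may redeem a capability it was handed.
        from_user, to_user, key = capability.split("@", 2)
        digest = _digest(from_user, to_user, key)
        if digest not in self._hashes:
            raise PermissionError("unknown or already used capability")
        self._hashes.discard(digest)  # capabilities are single use
        return to_user  # the identity the caller now runs as


def issue_capability(device, from_user, to_user):
    # Played by the auth service after it has authenticated the user.
    key = secrets.token_hex(16)
    device.caphash(_digest(from_user, to_user, key))
    return f"{from_user}@{to_user}@{key}"


dev = ToyCapDevice()
cap = issue_capability(dev, "alice", "adm")
new_user = dev.capuse(cap)
print(new_user)
```

Nothing in this flow consults a file by path, so rearranging the namespace gives an attacker nothing to spoof.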

I think you're right that I don't understand the problem.

If you want to give actual real global root, I think you can do it by having a gifting process put the real global root process into the same process namespace as the giftee process.

And how does that interact with binding files as non-root, including /etc/sudoers?

(plan 9 has no suid, so this is not a problem there)

I said more about that in the thread already.

Pretty much everything is in conflict.

POSIX standardized and attempted to unify a dozen different incompatible systems that developed independently on top of the original unix from bell labs. Those systems were developed by building new functionality on top of what unix provided. In order to keep at least some sort of compatibility, the old and at times obsolete functionality was kept in the system.

Plan 9 on the other hand intentionally broke compatibility with its predecessors and had those same features that were glued recklessly on top of each other in various unices thoughtfully redesigned from scratch, often omitting stuff that didn't seem relevant enough to its authors.

Additionally it also took the role of being C's standard library that ISO C did not want to take upon themselves.

A good starting point:

http://9p.io/sys/doc/ape.html

Note that this - as almost everything else that is plan 9 related - is dated.

The number of special cases.

For example: consider how you'd write some generic code to forward all ioctls transparently across the network. Keep in mind that the data attached to the ioctls is machine dependent, driver dependent, and has no information about how it's formatted. Every ioctl for every driver is its own special case.

Meanwhile, faithfully forwarding all devices in a plan 9 system is trivial. Control messages aren't strictly formatted -- but they're done via reads and writes on file descriptors, so sending them to the devices that understand them, and relaying back the result, is trivial. It's just 9p: https://man.9front.org/5/0intro
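A small Python sketch of why this is easy: when control messages are just bytes written to a file, a relay can forward them without knowing what they mean. `ToyDevice` and `Relay` are hypothetical stand-ins, and the `baud 9600` message is an invented example rather than a real device's control syntax.

```python
# Toy model: control messages as plain writes, relayed without being
# interpreted. All names and messages here are hypothetical.
class ToyDevice:
    """Stands in for a device whose ctl file takes 'key value' text."""

    def __init__(self):
        self.state = {}

    def write(self, data):
        key, value = data.decode().split(maxsplit=1)
        self.state[key] = value
        return len(data)

    def read(self):
        pairs = sorted(self.state.items())
        return " ".join(f"{k}={v}" for k, v in pairs).encode()


class Relay:
    """Forwards reads and writes; has no per-message special cases."""

    def __init__(self, target):
        self.target = target

    def write(self, data):
        return self.target.write(data)

    def read(self):
        return self.target.read()


device = ToyDevice()
# Two hops, as if the bytes crossed a network connection twice.
remote = Relay(Relay(device))
remote.write(b"baud 9600")
print(remote.read())
```

An ioctl relay would instead need a case per request number, because only the driver knows the size and layout of each argument.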

Doing this fully, for all devices (except /srv, which is a bit magical) is implemented here, in a short shell script. This is the remote login program used by 9front, which gives you something resembling ssh or vnc, but with full access to the data and devices on your local machine, graphics, audio, mouse, keyboard, USB, network, and anything else, even if it hasn't been implemented yet. It does both client and server side:

https://git.9front.org/plan9front/plan9front/HEAD/rc/bin/rcp...

The client side sets itself as a file server, using exportfs. It exports everything in its namespace, including /dev, over to the server.

The server takes the client's namespace, and mounts it over /mnt/term. Then, it takes /mnt/term/dev/cons and binds that over /dev/cons and starts a shell. That means that every time a program is run, it opens /dev/cons to interact with the user, using the client's mouse, keyboard, and so on, forwarding all the operations transparently over the network.

The idea can go further: instead of using network translation layers, for example, a plan 9 machine would import a different machine's network stack and mount it over itself:

    # whats my ip?
    % cat /net/ipselftab
    192.168.1.11                                 01   4u
    % hget https://api.ipify.org
    74.{home.address}

    # ok, let's import another machine's network stack and use it.
    % rimport orib.dev /net/ /net

    # what's my ip now? look ma, I'm proxying!
    % cat /net/ipselftab
    144.202.1.203                                01   4u
    % hget https://api.ipify.org
    144.202.1.203

There are no special hooks in the network stack for this. It doesn't know. This happens for free because the network stack is accessible through the filesystem API.

This kind of thing happens everywhere, because everything goes through 9p, and everything can be namespaced. There isn't any other special case to consider: If you forward 9p, you forward all operations you can do with a device. Or any other file server.

If everything is in a namespace, you don't pull the devices other programs are using out from under them, so you can put one login in one sandbox with a remote mouse and keyboard, and a different one in a different sandbox with a different network stack.

This falls apart when you have the 53,719 special cases bundled with posix. If you need a special case for each operation you forward through the network, or interpose in userspace, you're in for a rough time.

Plan 9 works because it's relatively simple and uniform.