by uhura 1286 days ago
This kind of article develops a dangerous line of reasoning.

The failure of Plan9 is not about what went wrong or right. It is about how the market evolved and made decisions, which goes beyond the scope of this comment.

The first conclusion "...is not to try to fix things that are not broken..." is dangerous because what one person identifies as broken is not necessarily what someone else does. As a Plan9 user you may identify Unix deficiencies as problems, and a Unix user might disagree.

The second conclusion, "...is to try to identify if there is a market...", is contradicted by many things in life (maths, basic science, maybe the beginning of Unix itself...).

And the final conclusion is about backwards compatibility. This is sound from the author's perspective because of the first conclusion, but it does not hold up against good reasoning. Plan9 broke some compatibility because it needed to, and kept the rest because it was already good from the POV of the developers/researchers.

The Plan9 effect, to me, is far different. It is about the fact that, against all expectations, Plan9 is unable to die. 9P is here to stay, 9front gets releases every year, /proc is everywhere, same for UTF-8, and so on.

Good ideas stick, they are hard to let go and even harder to ignore. Plan9 failed in the commercial OS sense, just as many others failed. "You may never know it's broken until you fix it" (I'm sure I heard it somewhere).


> Good ideas stick, they are hard to let go and even harder to ignore.

Horrible ideas are even stickier unfortunately. Like Unix, Plan 9 is also designed around forking processes and even has asynchronous signals.

Can you please elaborate on why those two ideas are "horrible"? Forking processes seems like a rather elegant concept, to me.

Granted, it's sometimes weird that in order to have a process running image A create a process running image B, it must first have a mirror image of itself for a while, then replace it. But the symmetry and simplicity at the conceptual level is nice, and the ability to have code for both parent and child in the same image is neat.
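For anyone who has not seen the pattern written out, here is a minimal sketch of it in Python (my choice of language; `spawn` is a hypothetical helper name, and `os.fork`/`os.execvp` are thin wrappers over the system calls). The same function body contains the code for both parent and child, and the child is briefly a copy of the parent before it replaces its image.

```python
import os
import sys


def spawn(argv):
    """Run argv in a child process the classic Unix way: fork, then exec."""
    pid = os.fork()
    if pid == 0:
        # Child: for a moment this is a mirror image of the parent.
        try:
            os.execvp(argv[0], argv)  # replaces the image; returns only on failure
        except OSError:
            pass
        finally:
            os._exit(127)  # exec failed; never fall back into the parent's code
    # Parent: same source file, but here fork() returned the child's pid.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


# Launch a second Python interpreter that exits with status 7.
exit_code = spawn([sys.executable, "-c", "raise SystemExit(7)"])
```

Between the `fork` and the `exec` the child can adjust its own environment (close descriptors, change directory, drop privileges) using ordinary calls, which is the flexibility people usually mean when they call fork elegant.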

> Can you please elaborate on why those two ideas are "horrible"?

The problems with fork are thoroughly explained by this Microsoft paper:

https://www.microsoft.com/en-us/research/uploads/prod/2019/0...

As for asynchronous signals, they are a completely broken concept and implementation.

It's not safe to do pretty much anything in a signal handler other than setting a flag and returning. You cannot use any non-reentrant function which eliminates pretty much everything useful, including the vast majority of standard library functions.

The system drops events. Signals are essentially a pending delivery flag in the kernel, signaling a process just sets the flag to 1 so there's no difference between doing it once and 10000 times.
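A rough way to see the coalescing for yourself, sketched in Python and assuming a single-threaded process on a POSIX system: block SIGUSR1, send it many times, then unblock it and count how often the handler runs. The counter name `deliveries` is mine. Note that Python's own handlers run between bytecodes rather than inside the C-level handler, but the pending-flag behaviour shown here comes from the kernel.

```python
import os
import signal

deliveries = 0


def on_usr1(signum, frame):
    # Count how many times the handler is actually invoked.
    global deliveries
    deliveries += 1


signal.signal(signal.SIGUSR1, on_usr1)

# While the signal is blocked, the kernel only remembers "pending: yes/no".
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
for _ in range(10000):
    os.kill(os.getpid(), signal.SIGUSR1)

# Unblocking delivers the single pending instance; the other 9999 are gone.
signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
```

After this runs, `deliveries` is 1, not 10000. Real-time signals (SIGRTMIN and up) do queue, but the standard ones used for almost everything do not.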

Were you in a system call when your process was signaled? It will be interrupted or cancelled. It returns EINTR and your code needs robust retrying logic to handle such a case. I've seen code that couldn't handle it and crashed.
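The retry logic in question looks roughly like the sketch below. It is written in Python for consistency with the other examples here, although since Python 3.5 (PEP 475) the standard library already retries most interrupted calls for you, so treat it as a picture of the loop C code has to write around every blocking call. `retry_on_eintr` and `flaky_read` are hypothetical names; `flaky_read` merely simulates a call that is interrupted twice.

```python
import errno


def retry_on_eintr(call, *args):
    """Repeat call(*args) for as long as it fails with EINTR."""
    while True:
        try:
            return call(*args)
        except InterruptedError:  # the OSError subclass used for errno EINTR
            continue


# Hypothetical stand-in for a system call that gets interrupted twice.
attempts = []


def flaky_read():
    attempts.append(1)
    if len(attempts) < 3:
        raise InterruptedError(errno.EINTR, "Interrupted system call")
    return b"data"


result = retry_on_eintr(flaky_read)
```

The annoying part in real code is not the loop itself but remembering to put it around every call that can block, and deciding what to do with partial reads and writes.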

There are race conditions everywhere. Signals can arrive while you're handling other signals, and you need to block them if you don't want that. Every thread has its own signal handling behavior. Signals sent to a process are handled by "any" thread. I remember reading in some standard that using signals in multithreaded programs was undefined behavior. Who even knows what's going to happen?

The only borderline sane way to handle signals is with file descriptors: you block traditional signal handling on all threads and set up a signalfd that you can epoll along with everything else on a thread dedicated to the event loop. Even this is pretty bad:

https://ldpreload.com/blog/signalfd-is-useless

https://news.ycombinator.com/item?id=9564975
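As far as I know Python's standard library has no signalfd wrapper, so the sketch below swaps in `signal.sigwait`, which gives the same basic shape: block the signal so no handler ever runs asynchronously, then collect it synchronously at a point of your choosing. It assumes a single-threaded POSIX process; in a threaded program the mask has to be set before any thread is started, since new threads inherit it.

```python
import os
import signal

WATCHED = {signal.SIGUSR1}

# Block the signal so it is never delivered asynchronously to a handler.
signal.pthread_sigmask(signal.SIG_BLOCK, WATCHED)

# Stands in for some external process signalling us.
os.kill(os.getpid(), signal.SIGUSR1)

# The signal sits pending until we ask for it, in ordinary code
# where any function is safe to call.
received = signal.sigwait(WATCHED)

# Restore the mask for the rest of the program.
signal.pthread_sigmask(signal.SIG_UNBLOCK, WATCHED)
```

This does nothing about coalescing: if the signal had been sent ten times before the `sigwait`, there would still be only one to collect.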

Asynchronous processor interrupts have analogues to basically all these issues that signals do. Signals are certainly a pain to implement and get right, but people seem to think they're something that they aren't, or expect them to be used for something they can't do. Clearly signals aren't a message passing scheme, they're a notification system. And obviously taking an asynchronous interrupt needs to use reentrant code.

As for the fork paper, seems like a typical academic type of critique.

"Fork today is a convenient API for a single-threaded process with a small memory footprint and simple memory layout that requires fine-grained control over the execution environment of its children but does not need to be strongly isolated from them."

I.e., exactly what it is good for and used for. And it goes on

"Fork is incompatible with a single address space. Many modern contexts restrict execution to a single address space, including picoprocesses [42], unikernels [53], and enclaves [14]."

And talking about heterogeneous address spaces, and all other things academics love but nobody really uses.

They also make pretty outlandish claims "Fork infects an entire system." based on the Windows implementation where the Win32 API and kernel explicitly do not implement fork. And it's provocative language "infects" -- the Linux kernel fork implementation is maybe a thousand lines of C code. And that does not constrain it or prevent it offering several of the suggested alternatives.

K42 was also a failure of a research operating system, and another microkernel (surprise). Turns out (as usual) that designing a system around some latest craze or fad in technology no matter how good (RCU and other lock free algorithms) rather than designing it to deal with workloads that people actually use, is still the recipe for disaster and one of the main causes of second system syndrome.

> Asynchronous processor interrupts have analogues to basically all these issues that signals do.

I know. That doesn't really excuse the suckiness of asynchronous signals. It's quite simply stupid to have operating system facilities that are as limited as actual hardware interfaces. With signals, you're supposed to write code like your computer is a Super Nintendo or something.

> Clearly signals aren't a message passing scheme, they're a notification system.

Signals suck as a notification system too, simply because there is no queue. You're supposed to be notified by a signal when a child process exits but the system can actually lose that information forever in certain conditions which means it is literally impossible to build truly correct software, best you get is good enough. You also get a signal when you write to a pipe with no readers or when the terminal is resized, making it that much more of a pain to deal with that stuff.

People build literal user interfaces using signals. SIGINT and SIGTERM, everybody knows about that. SIGHUP to make some running process reload configuration or something. Even dd prints a progress report if you send SIGUSR1, how insane is that? I really don't want to think about signal insanity when doing important I/O operations.

> And talking about heterogeneous address spaces, and all other things academics love but nobody really uses.

The paper mentions several concrete examples in wide use right now even in consumer machines. Systems on a chip with accelerators, GPUs...

> I know. That doesn't really excuse the suckiness of asynchronous signals. It's quite simply stupid to have operating system facilities that are as limited as actual hardware interfaces. With signals, you're supposed to write code like your computer is a Super Nintendo or something.

This isn't really comprehensible. The issues with signals are presented as something that makes them "broken". Are you claiming that hardware interrupts are broken?

> Signals suck as a notification system too, simply because there is no queue.

You are conflating two different things. A queue is for messages. An interrupt is for notification. This is how hardware interrupts work. Some work arrives somewhere (in a queue, a register, some memory, whatever), so an interrupt is raised to notify that work is pending.

Likewise signals can be and are associated with queues or messages or pending events that can be interrogated after the notification that there is work.
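SIGCHLD is probably the clearest case of that split, so here is a minimal Python sketch of it (single-threaded POSIX process assumed; the variable names are mine). The signal only says "at least one child changed state"; the actual records are then pulled from the kernel with `waitpid` until it has nothing more to report.

```python
import os
import signal

# A do-nothing handler so SIGCHLD is not left in its default "ignore" disposition.
signal.signal(signal.SIGCHLD, lambda signum, frame: None)
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})

children = set()
for code in (1, 2, 3):
    pid = os.fork()
    if pid == 0:
        os._exit(code)  # child: exit immediately with a distinct status
    children.add(pid)

exit_codes = {}
notifications = 0
while len(exit_codes) < len(children):
    # The notification: "something exited", not "who" or "how many".
    signal.sigwait({signal.SIGCHLD})
    notifications += 1
    # The queue: drain every exit record the kernel is holding.
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children remain, but none has exited yet
        if pid in children:
            exit_codes[pid] = os.WEXITSTATUS(status)

signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGCHLD})
```

All three exit statuses are recovered even when `notifications` ends up smaller than three, because the drain loop does not assume one signal per child. Code that calls `waitpid` exactly once per SIGCHLD is the kind that loses children.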

> You're supposed to be notified by a signal when a child process exits but the system can actually lose that information forever in certain conditions which means it is literally impossible to build truly correct software, best you get is good enough. You also get a signal when you write to a pipe with no readers or when the terminal is resized, making it that much more of a pain to deal with that stuff.

I won't go into every type of signal. There are some that are not defined in a way that can be used by what people want to use them for. That's not a problem with "signals", it's a problem with a particular signal or lack of additional interfaces around that to provide what is required.

> The paper mentions several concrete examples in wide use right now even in consumer machines. Systems on a chip with accelerators, GPUs...

For the most part not heterogeneous. Masters that have access to host address translation services (like cache coherent GPUs or FPGAs on some busses, like nvlink or CXL with ATS) have equal ability to access the entire process memory space. And fork doesn't look really different from many other operations on an address space from the point of view of an MMU whether it's on the core or associated with an accelerator -- all it is is changing memory protections and taking page faults, the same kind of COW is done with private writable mappings, or page deduplication, for example. They really are just handwaving up things that nobody actually uses.

Good discussion both pro and against the paper here: https://lwn.net/Articles/785430/

Fork causes huge complications. I summarised some of the paper here https://news.ycombinator.com/item?id=31702952

Edit: I imagine forking and signal handlers don’t compose well, and I also would hate to have to think how forking and SCM_RIGHTS interfere with each other: https://googleprojectzero.blogspot.com/2022/08/the-quantum-s...

Fork is actually very fast. Too funny a paper coming from Microsoft about fork/exec speed -

https://www.bitsnbites.eu/benchmarking-os-primitives/

Linux absolutely destroys the "proper" API. Nearly 40x faster at launching a program with fork+exec than Windows' CreateProcess. Not to mention the fact that vfork has always been available which is even faster.

Fork is also pretty scalable, it requires no global locks. It is thread-safe, it has defined semantics in threaded programs and can be used to exec a process. And it isn't insecure, it does what is advertised, as securely as advertised.

And close on exec is hardly a huge complication, it's actually a detail of exec(), not fork. It applies independently of exec, and you could make an exec that closes fds by default unless they're marked with a persist-on-exec flag. Library or runtime code can do this anyway really without any "huge complication". I don't know what you mean about SCM_RIGHTS interfering with fork, do you have something in mind? The problem would really be at the exec boundary, fork does not purport to alter any security attributes of the child or parent, so it really doesn't make sense to call it insecure. It doesn't suddenly get new rights, or have any limits enforced.

I mean it is complicated stuff, but so is any process runtime environment that provides async notifications, threads, spawning, etc. Anybody who tells you they can make this simple and broadly usable is selling you snakeoil or a toy API. If people can't cope with reading documentation and thinking carefully about this stuff, they shouldn't use it anyway, they should use a higher level runtime or library to do process management. The handwringing about fork is a bit baffling. Reminds me of the handwringing about fsync, it seems that people just don't read documentation and make silly assumptions about how things should work, and then get embarrassed and blame the tools.

I mean fork retains file descriptors from the parent process. This is not some obscure undocumented behavior, it's like the second thing you read in the manual page. Same as execve. I don't like to make excuses for badly designed APIs and code, but honestly if a programmer isn't capable of thinking about what happens to file descriptors there they certainly should not be writing code that uses fork or exec, let alone something that's security sensitive. I don't think that's being unreasonable or elitist. You wouldn't want them writing security sensitive Windows code either, would you?
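To make the descriptor behaviour concrete, a small Python sketch (POSIX only; variable names are mine): a pipe created before the fork is usable in the child with no extra setup. Python happens to create descriptors with close-on-exec set (PEP 446), which, as I understand it, only matters at exec time and has no effect on fork.

```python
import os

read_end, write_end = os.pipe()

# Python marks new descriptors close-on-exec by default (PEP 446).
# That flag is consulted by exec, not by fork.
cloexec_by_default = not os.get_inheritable(write_end)

pid = os.fork()
if pid == 0:
    # Child: both pipe ends are already open here.
    os.close(read_end)
    os.write(write_end, b"hello from the child")
    os._exit(0)

os.close(write_end)  # parent keeps only the read end
message = os.read(read_end, 1024)
os.close(read_end)
os.waitpid(pid, 0)
```

If the child were to exec another program, `os.set_inheritable(fd, True)` is the explicit opt-in for keeping a descriptor open across that boundary.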

So M$ wrote this to convince people their CreateProcess is better than fork? Not smart.

Thinking signals suck just comes from people relying on them to do something they were not designed to do. You cannot complain that a pile of wood is not a table.

You got me. That is quite horrible.

Forking is one of those things that is a super elegant solution when things are simple, but breaks down when things become complicated.

Multithreaded app? Fork is now a liability, and is only useful if the only (more or less) thing you do in the child is exec. Might as well only have Windows-style CreateProcess at that point.

For single-threaded programs, sure, fork is fine and gives you more flexibility than a CreateProcess-type API.

Asynchronous signals have a lot of the same problems, but those problems are also present in single-threaded programs. Quite a few APIs have been added over time to try to make working with async signals easier and safer, but all of them add their own new gotchas.

> Asynchronous signals have a lot of the same problems, but those problems are also present in single-threaded programs. Quite a few APIs have been added over time to try to make working with async signals easier and safer, but all of them add their own new gotchas.

Isn't this referring to uses beyond what async signals are good for, or are you saying that async signals should just not exist in favor of something else? It's not like they're meant to be the only IPC mechanism, but they're good as a standard way to inform a process of certain things while having default handlers.

EDIT: Nevermind.

so... shitpost warning but...

Perhaps forking is the elegant refined mechanism and multi threading is the abomination that should have never been invented.

Multithreading is the concept that a process (an independent execution unit) can share memory space with another process. And it turns out you can, but only at the cost of making all your memory access methods extremely fragile and error prone. The concept should have never been invented.

Sharing an address space was the default until the MMU was invented. That said, I agree with you that multiple processes with some optional shared memory seems to be a much safer approach than share-by-default multithreading. I don't know why MT won.

> Forking processes seems like a rather elegant concept, to me.

Forking is good because it saves you a rich process manipulation API, and because generally speaking doing things to an execution environment always ends up more clunky than doing things in one. (Cleaning up after doing them is another matter.)

Forking is bad[1] because it (essentially if not literally) forces memory overcommit on you, at which point resource accounting becomes hopeless.

[1] https://lwn.net/Articles/785430/

It's called trolling. They're using the word "horrible" to bait a response on something that can be easily argued on.

It's called having an opinion and expressing it. You can accept my opinion, ask me to elaborate or convince me it's wrong. What you can't do is accuse me of trolling just because you disagree.

No, I thought you were being hyperbolic, but after you've explained your problems with signaling, I agree with it. Kind of depressed at the state of that, now...

You and me both. I've wasted way too many neurons trying to understand this legacy brain damage. Hyperbole doesn't quite do it justice, it's a design that deserves an epic rant like mpv's locale commit:

https://github.com/mpv-player/mpv/commit/1e70e82baa9193f6f02...

A more agreeable example would be Windows forbidding specific file and folder names to reserve for device names, something it apparently inherited from CP/M.

Thank you for expressing in English the discomfort that grew on me while reading this (otherwise nice) article.

Market fit is hindsight 20/20 IMO, but you phrased it better.

Overall, my discomfort is that it’s the type of reasoning that keeps you with both feet in the mud. Stuck.

I’m stealing “you don’t know it’s broken until you fix it”

Procfs appeared originally on UNIX, see USENIX paper.
And BSDs removed it, deeming it a security risk.

Personally, I don’t like that it’s more of parsing text files to get a number, when you could have functions returning structs (of variable length, to get extensibility, while preserving backwards compatibility).
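To illustrate the difference, a rough Python sketch. The sample text imitates the layout of Linux's /proc/<pid>/status, `parse_status` and `rss_kib` are hypothetical helper names, and `resource.getrusage` is used as one existing example of a call that hands back a struct instead of text (it reports peak rather than current RSS, so the two are not the same number).

```python
import resource

# Sample in the style of /proc/<pid>/status on Linux (tab after the colon).
SAMPLE_STATUS = "Name:\tcat\nState:\tR (running)\nVmRSS:\t    1234 kB\nThreads:\t1\n"


def parse_status(text):
    """Turn 'Key:<tab>value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    return fields


def rss_kib(fields):
    """Dig one number out of the text, checking the unit along the way."""
    value, unit = fields["VmRSS"].split()
    if unit != "kB":
        raise ValueError("unexpected unit: " + unit)
    return int(value)


parsed_rss = rss_kib(parse_status(SAMPLE_STATUS))

# The struct-returning style: a typed field, no parsing step.
# ru_maxrss is peak RSS, in KiB on Linux (bytes on macOS).
peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
```

On a Linux machine the first half would be fed with `open("/proc/self/status").read()` instead of the sample string.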

A fair assessment, but at the end of the day it is all just a stream of bits. Bits packed into a structure: how do you know the structure? How do you know how wide each part is? Or bits packed into an array of encoded bytes: what is the encoding? How do you parse it?

The array of encoded bytes, despite its complexity overhead, has an advantage in that it lies on the human-visible side of computing, the part of computers designed for the human to use. I have to admit that more often than not I prefer to eat the overhead and have an interface that I can see.

The BSD approach at the time was to require setuid access for programs like ps to be able to read the kernel memory space via /dev/kmem to produce a running process list.

That is infinitely more stupid than procfs.

The stupidity of that is orthogonal to whether to use procfs or another sane, but less plan9 approved design like a syscall interface or an ioctl.

I'm reminded of another thing that used to be file based but moved away, and towards syscalls: random number generation. One criticism of a /dev/random approach I've seen is that open(2) could have some fringe error case (descriptor table too big?), and you don't want your secure RNG to bail on you. In particular lazy initialization of a secure RNG where the caller may not be able to check for errors.
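A small Python sketch of the two styles, assuming Linux (`os.getrandom` wraps the getrandom(2) system call and is not available everywhere; the two function names are mine). The file-based version has to get through `open` before it can read anything, which is where the fringe failures live; the syscall version needs no descriptor at all.

```python
import os


def random_bytes_from_file(n):
    # open() can fail before any randomness is read: descriptor table full,
    # or no /dev at all inside a chroot or minimal container.
    with open("/dev/urandom", "rb", buffering=0) as source:
        return source.read(n)


def random_bytes_from_syscall(n):
    # No file descriptor involved. For small requests (up to 256 bytes)
    # getrandom(2) returns the full amount once the pool is initialized.
    return os.getrandom(n)


from_file = random_bytes_from_file(32)
from_syscall = random_bytes_from_syscall(32)
```

In practice Python code should just call `os.urandom`, which picks a suitable mechanism itself; the split is shown here only to make the failure surface visible.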

Overall it's a far more universal interface: your app "just" needs to parse relatively simple text files instead of making an API call per data type.

The text format is a problem on its own tho. "procfs but serialized using a single format" would IMO be the best middle ground between tying your app to what are essentially kernel headers and parsing a bunch of random text files.

Not just BSDs; no major operating system other than Linux uses procfs anymore.

Sure, but Linux is also on more devices than any other OS in the world.

Your statement would be more damning if Linux was a minority player or on the decline, but that's not the case.

Procfs seems... fine, really.

Being pedantic, isn't Minix more widespread because Intel embedded it in CPUs?
Also most Linux machines, like phones, also contain instances of things like SEL4, often a couple of them.

And Linux... well, it managed to accumulate a lot of historical baggage for something that young. Device numbers are another example.

Solaris has had procfs for years.

Solaris is history at this point.

Oracle and Fujitsu still sell and support it.

Really? Tell that to Oxide Computer... well, it's not Solaris but Illumos.

> Good ideas stick, they are hard to let go and even harder to ignore.

That's a noble wish, but that doesn't make it true.

If an idea does not stick, maybe it does not provide a benefit?

Then it's not really good, but merely good-looking.