by xenadu02 3449 days ago
Who said the Unix philosophy was the be-all, end-all? Unix doesn't even believe its own philosophy because it's too damn painful. Why does ls do sorting? Why does grep do -R recursive searching? How is that "Do one thing and do it well"?

Unix is just a collection of random decisions made by various people over the years. The Unix philosophy is really more like "I've always done it that way so don't you dare change it".

fork/exec is garbage. Signals are garbage. You basically can't make any API calls after fork or in a signal handler because threads came after. The interaction of fork and threads is bananas and many hacks have been required over the years to paper over the problems.

The layout of /usr/bin, /usr/sbin, /usr/local/bin, et al isn't a good design. It's dogshit but it was necessary because early Unix file systems couldn't span multiple volumes and early disk systems were small.

The C compilation model of separate header files is not a good design. People have retroactively determined some of the side-effects are not only good but The One True Way. In reality the design was a result of extremely limited RAM and slow CPUs. The preprocessor itself was never designed, just grafted on ad-hoc.

Unix file permissions are shit. Every unique combination of permissions requires a group. Owners are identified by simple integers. NFS legitimately gives people nightmares.

Let's not even get into everything is a file, except when it's not, and some files are more equal than others.

What about dependency hell? How's that "simple" model working out?

The "Unix philosophy" can piss right off.

4 comments

"Who said the Unix philosophy was the be-all, end-all?" I didn't and I don't know anyone that has. I was comparing systemd and SysV.

"Unix is just a collection of random decisions made by various people over the years."

"Linux has never been about quality. There are so many parts of the system that are just these cheap little hacks, and it happens to run." - Theo de Raadt

"The layout of /usr/bin, /usr/sbin, /usr/local/bin" I don't agree with you. Having base in one directory and user-installed bins in another makes sense. I don't understand why most modern Linux distros install on only one partition by default. You lose the ability to mount with flags, i.e. noexec, nosuid, nodev.

NFS is horrible. Unix/Linux is a multi-user operating system why would you not want to have groups?

KISS

> NFS is horrible. Unix/Linux is a multi-user operating system why would you not want to have groups?

Now that's backwards. He obviously means a better permission system. Do you seriously think groups are the best way?

Access Control Lists are a better way.

"The C compilation model of separate header files is not a good design"

This is probably just a lack of experience on your part: you haven't worked on a 50+ million LOC codebase that compiles for 12 hours, with nothing else available as a better option. There is a reason these things exist.

I've worked on a project that combined C++ and C# in approximately equal amounts (say 2 MLOC each). The C++ part took 20 minutes to compile and link; the C# part compiled and linked in under 2 minutes. Go figure.

I agree: the C and C++ compilation model is not a good design. It's a patch for not having a decent module system. Heck, even Borland Pascal compiled faster in the '90s than C++ does now on a machine that is orders of magnitude faster.

For as long as C has existed, other languages have provided alternatives where symbol information is extracted from the main source files automatically by the compiler and optionally cached, either for next time or for cases where you don't want to ship a library's source to its users. In other words: this has been solved in a better way since the '70s.

> There is a reason these things exist.

Enlighten me. And while you're at it, explain why this is better than, say, Rust's module system, where we don't need separate header files.

Aren't the limitations of the header file system the reason why the C++ module system is being developed?

Why wouldn't something like the C++ module system be of benefit in C also?

One of the reasons it seems.

It would, but by the look of it, C++ evolves faster these days.

??? Header files are the problem, not the solution.

Do you want to clarify that, because I think you may have missed a "not" in there.

Yes, yes, yes! Never has someone articulated what I believe so damn well. There is so much blind worship of tradition and heroes in the UNIX world, so glad to see someone else believes what I've always believed.

Most of this comment is incorrect.

I think xenadu02 raises some valid criticisms, but I think those criticisms would have been better received if they were expressed more politely.

I'd love to see a rebuttal of the specific points made as opposed to just "Most of this comment is incorrect".

Most of the statements aren't really conducive to rebuttals because they are lacking substance.

But I can imagine what xenadu02 might have meant, if you like, and provide some counter arguments.

Signals aren't "garbage" (whatever that means).

Signals can call APIs (the set of async-signal-safe APIs). They can't call non-async-signal-safe APIs not because of threads, but because signals can interrupt a routine at any point (necessary for asynchronous notification of certain events which must be handled before the normal instruction control flow can be resumed) and that interrupted routine may not have been written to be reentrant.

This is true even without threads in the picture.
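To make the async-signal-safe restriction concrete, here is a minimal sketch in C. The names `on_signal` and `demo_safe_handler` are made up for illustration; the pattern (a `volatile sig_atomic_t` flag plus `write()`, with `errno` saved and restored) is one common way to stay within the async-signal-safe set, not the only one.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Flag shared between the handler and normal code. sig_atomic_t can be
   read and written without being torn by an interrupting signal. */
static volatile sig_atomic_t got_signal = 0;

/* The handler limits itself to async-signal-safe work: it stores to the
   flag and calls write(2). It avoids printf() and malloc(), which might
   be the very routines that were interrupted. */
static void on_signal(int signo)
{
    static const char msg[] = "signal caught\n";
    int saved_errno = errno;        /* write() may clobber errno */
    ssize_t n;

    (void)signo;
    got_signal = 1;
    n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;
    errno = saved_errno;
}

/* Hypothetical helper: install the handler for signo, raise signo in
   this process, and return 1 if the handler ran, 0 if it did not, or
   -1 on error. */
int demo_safe_handler(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) != 0)
        return -1;

    got_signal = 0;
    if (raise(signo) != 0)
        return -1;
    return got_signal ? 1 : 0;
}
```

Calling `demo_safe_handler(SIGUSR1)` from a single-threaded program should run the handler before `raise()` returns, assuming SIGUSR1 is not blocked. Real programs usually check the flag from their main loop instead.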

The fork/exec model is not "garbage". It is actually a fairly nice alternative to the "provide one API to start a child process and give it a large number of parameters for all possible situations". And you can call plenty of APIs between fork and exec in the child safely, just like from signal handlers.
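As a rough illustration of doing setup between fork and exec, the sketch below redirects the child's stdout before replacing the process image. `run_redirected` is a hypothetical helper name, and the exit codes 126 and 127 are just a convention borrowed from shells. The child sticks to calls on the async-signal-safe list (open, dup2, close, execv, _exit).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at argv[0] (a full path) with its
   stdout redirected to out_path. Returns the child's exit status, or -1
   if the parent could not fork or wait, or the child died abnormally. */
int run_redirected(const char *out_path, char *const argv[])
{
    pid_t pid;
    int status;

    assert(out_path != NULL && argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: per-process setup happens here, before exec. */
        int fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0)
            _exit(126);
        if (fd != STDOUT_FILENO)
            close(fd);
        execv(argv[0], argv);
        _exit(127);                 /* reached only if execv failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing `{"/bin/sh", "-c", "echo hello", NULL}` with an output path of `/dev/null` runs the shell with its output discarded and returns the shell's exit status.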

I haven't dealt with dependency hell ever since shared libraries got sonames.

The rest of the comment doesn't list anything of substance. If you want rebuttals for "the file system layout is a bad design" or "the C compilation is a bad design" or anything else, provide some reasons why those are bad designs; some of those reasons may be valid criticism, and some may not be, but one can't just make vacuous statements like that and expect a reasonable discussion to follow.

Unix signals have been called garbage by some and "unfixable" by others [1]. The article [1], part of the series "Unfixable designs", explains the evolution of signal handling from sigvec() through sigaction() to signalfd() -- a rocky history fraught with problems.

> So while signal handlers are perfectly workable for some of the early use cases (e.g. SIGSEGV) it seems that they were pushed beyond their competence very early, thus producing a broken design for which there have been repeated attempts at repair. While it may now be possible to write code that handles signal delivery reliably, it is still very easy to get it wrong. The replacement that we find in signalfd() promises to make event handling significantly easier and so more reliable.

Another critic makes the case that "signalfd is [also] useless" [2]:

> "UNIX[] signals are probably one of the worst parts of the UNIX API, and that’s a relatively high bar."

Signals came up recently on HN when someone remarked that not even memset() is signal-safe! [3]

All in all, working with signals correctly requires mastering a tremendous degree of complexity. Other platforms have provided simpler APIs, such as Structured Exception Handling (SEH) [4].

[1] https://lwn.net/Articles/414618/

[2] article link from https://news.ycombinator.com/item?id=9564975

[3] https://news.ycombinator.com/item?id=13313563

[4] An HN comment describing how it's simpler: https://news.ycombinator.com/item?id=13323870

P.S. Please note that the views quoted above are not necessarily my views.

Like I said, there are some valid arguments on both sides. But a blanket "signals are garbage" is not useful or correct.

I'm not going to defend everything xenadu02 said, but I think there were some points that resonated with me even though I agree they could be expressed more constructively.

> Why does ls do sorting? Why does grep do -R recursive searching? How is that "Do one thing and do it well"?

I think these are valid examples of how Unix itself fails to follow the "Unix philosophy" of "Do One Thing and Do It Well".

> The fork/exec model is not "garbage". It is actually a fairly nice alternative to the "provide one API to start a child process and give it a large number of parameters for all possible situations". And you can call plenty of APIs between fork and exec in the child safely, just like from signal handlers.

fork-exec complicates the implementation of threads (see atfork handlers). Rather than "a large number of parameters for all possible situations", another alternative would be: (1) a call which, given an executable name and arguments, returns an opaque handle (or file descriptor) representing the process to be started; (2) a bunch of further calls to set attributes on that handle. New features could add new APIs acting on the handle, or an extensible API like ioctl could be used. If there is also a handle representing the current process, then you only need one API call per attribute, whether you are setting it for the current process or for a child to be started; (3) finally, a start call which turns the process-to-be-started handle into a running process handle.
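POSIX's posix_spawn() is not the handle-based API described above, but it has a similar "configure first, then start" shape: attribute and file-action objects are filled in by separate calls, and one final call starts the process. A minimal sketch, where `spawn_redirected` is a hypothetical helper and the choice of attributes (a new process group, stdout redirected) is arbitrary:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: start the program at argv[0] (a full path) in its
   own process group with stdout redirected to out_path, then wait for it.
   Returns the exit status, or -1 on any failure. */
int spawn_redirected(const char *out_path, char *const argv[])
{
    posix_spawn_file_actions_t actions;
    posix_spawnattr_t attr;
    pid_t pid;
    int status, rc;

    assert(out_path != NULL && argv != NULL && argv[0] != NULL);

    if (posix_spawn_file_actions_init(&actions) != 0)
        return -1;
    if (posix_spawnattr_init(&attr) != 0) {
        posix_spawn_file_actions_destroy(&actions);
        return -1;
    }

    /* Describe the child before it exists. */
    rc = posix_spawn_file_actions_addopen(&actions, STDOUT_FILENO, out_path,
                                          O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (rc == 0)
        rc = posix_spawnattr_setflags(&attr, POSIX_SPAWN_SETPGROUP);
    if (rc == 0)
        rc = posix_spawnattr_setpgroup(&attr, 0);

    /* Start it with a single call. */
    if (rc == 0)
        rc = posix_spawn(&pid, argv[0], &actions, &attr, argv, environ);

    posix_spawn_file_actions_destroy(&actions);
    posix_spawnattr_destroy(&attr);
    if (rc != 0)
        return -1;

    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The differences from the proposal are worth noting: there is no handle for the not-yet-started process, only side objects, and the set of attributes is fixed by the standard, not extensible per feature.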

> Unix file permissions are shit

The user-group-other model is arguably too limiting. ACLs are a better idea, but then should you use POSIX ACLs or NFSv4 ACLs?

The distinction between primary group ID and supplementary group IDs is silly.

Why must every file have both a UID and a GID? For files owned by a single user, you end up creating a dummy group like "staff" or so on just to obey the rule that every file must have a GID. For shared files, e.g. project files, files generally end up owned by their creator, even though in a business sense they really belong to the project not to whoever created them. It would make more sense if the owner could be either a user or a group, and then also have zero or more non-owning groups associated with it.

In most cases permissions should only exist on the directory, and then automatically apply to any files in the directory. (In most cases every file in the same directory should have the same permission; Unix bases its design on the exception rather than the rule.) Of course, hard links make this impossible, but I think hard links were a mistake.

The executable permission bits actually do double duty as a file type indicator. That's rather ugly. If Unix had explicit file types (rather than just a naming convention of file extensions), then certain file types could be declared to be executable. Executable permission would then mean "you are allowed to execute this if it is an executable" instead of "this is an executable". Stuff like the +x vs +X distinction in chmod would never have been necessary.

> Let's not even get into everything is a file

Unix would have been much better if everything were a file descriptor, rather than having stuff like pid_t. Linux at least is evolving in this direction. Plan9 does it better. Even the WindowsNT philosophy of "everything is a handle" is better than the traditional Unix approach.

Regarding ACLs, I'd say that there's little choice here: it has to be NFSv4.

The rationale for this is that POSIX ACLs are firstly too simple to model what we need. And they are also non-standard (POSIX .1e ACLs are a DRAFT specification which was never ratified).

NFSv4 ACLs are vastly more featureful, already implemented to support NFSv4 in kernel, though not available in userspace AFAICT. On FreeBSD and other platforms using ZFS, they are also used by ZFS and are directly exposed to userspace, making rich ACLs usable as the default permissions model system-wide when running on ZFS. Linux, unfortunately, doesn't yet do any of this, even when using ZFS.

The irony is that whilst the standards document was never ratified most people implemented it anyway. So actually, they are a standard. (-:

Programs have features because they are useful. Some features may not fit your view of what the philosophy should dictate, and that's OK. Having a recursive ls doesn't bother me for example.

Fork-and-exec isn't complicated by threads. Only fork-and-keep-executing is.

UNIX doesn't have a naming convention using file extensions.

Some of your points are valid opinions that are shared by others, but I don't know how much they have to do with the UNIX philosophy.

Some APIs can be improved, sure. And some are being improved. It takes time because of Unix's success and most systems' desire to remain backward compatible (especially in source form).

> Fork-and-exec isn't complicated by threads. Only fork-and-keep-executing is.

Another issue is that fork-and-exec doesn't work well with languages with complicated runtimes, e.g. multithreaded garbage collection. It forces you to use a lower level language (such as C) to write all the code between fork and exec. An API based on process handles with a separate "start" call to convert a not-yet-started handle into a running process wouldn't have that deficiency.

Another issue is that it is very hard to implement robust error handling without race conditions in the fork-exec model. What if the child process encounters an error between the fork and the exec? How does it notify the parent process of exactly what error it got (e.g. "setsid failed"?) You need some sort of IPC mechanism between the child and the parent. And such an IPC mechanism is prone to race conditions. By contrast, the process handle-based API I suggested doesn't have this problem since it doesn't introduce more concurrency into the system than is absolutely necessary.
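For reference, the kind of IPC plumbing this refers to is often done with a close-on-exec pipe: the child writes its errno to the pipe if anything fails, and a successful exec closes the pipe so the parent sees EOF. The sketch below is one such arrangement, with `spawn_report_errors` as a hypothetical name. It reports only the errno value, not which step failed, which would need a slightly richer message.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, call setsid() in the child, then exec the
   program at argv[0] (a full path). Returns 0 and stores the child's pid
   in *pid_out if exec succeeded; otherwise returns the errno value from
   the failing step (EIO if the report itself was lost). */
int spawn_report_errors(char *const argv[], pid_t *pid_out)
{
    int pfd[2];
    int child_err = 0;
    int saved;
    ssize_t n;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL && pid_out != NULL);

    if (pipe(pfd) != 0)
        return errno;
    /* The write end closes automatically on a successful exec. */
    if (fcntl(pfd[1], F_SETFD, FD_CLOEXEC) != 0) {
        saved = errno;
        close(pfd[0]);
        close(pfd[1]);
        return saved;
    }

    pid = fork();
    if (pid < 0) {
        saved = errno;
        close(pfd[0]);
        close(pfd[1]);
        return saved;
    }

    if (pid == 0) {
        close(pfd[0]);
        if (setsid() != (pid_t)-1)
            execv(argv[0], argv);
        /* Reached only if setsid() or execv() failed: report errno. */
        child_err = errno;
        n = write(pfd[1], &child_err, sizeof child_err);
        (void)n;
        _exit(127);
    }

    close(pfd[1]);
    do {
        n = read(pfd[0], &child_err, sizeof child_err);
    } while (n < 0 && errno == EINTR);
    close(pfd[0]);

    if (n == 0) {                   /* EOF: the exec went through */
        *pid_out = pid;
        return 0;
    }

    /* The child failed before or at exec; reap it before returning. */
    while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
        ;
    return n == (ssize_t)sizeof child_err ? child_err : EIO;
}
```

Even this small version shows the cost being described: an extra pipe, careful descriptor hygiene, and a parent that blocks until the child either execs or reports. In a multithreaded parent the pipe can also leak into children forked by other threads between `pipe()` and `fcntl()`.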

> UNIX doesn't have a naming convention using file extensions.

Yes it does. The average Unix system is full of file extensions like .c, .h, .so, .html, etc. Even in Unix V1 file extensions were used as a convention - http://minnie.tuhs.org/cgi-bin/utree.pl?file=V1

> Some of your points are valid opinions that are shared by others, but I don't know how much they have to do with the UNIX philosophy.

Is there a clear definition of what the "UNIX philosophy" is? Is any criticism of Unix systems as actually implemented a valid criticism of the "Unix philosophy"? Or do you want to define the "Unix philosophy" so vaguely as to put it beyond any possibility of criticism?

The Windows NT model built on operating system design thinking from the 1980s, thinking that took far too many years to trickle into the very operating systems whose designs it had been examining.

However, FreeBSD has had process descriptors since roughly 2010. They have the slightly odd semantics of terminating processes when all descriptors to them are closed. But they can be used as descriptors with kqueue() and the like.

«Foo is garbage» is not a valid criticism.