
by DonHopkins 880 days ago

Even PowerShell's flawed and limited execution definitively proves how terribly misguided and morally bankrupt the pin-headed one-dimensional "everything is a file" Unix philosophy is, especially with the ridiculously hard-to-parse ad-hoc non-standard mixed-bag syntax of /proc and /etc files, which would have been much better simply using xml or json, yet which don't even come close to achieving that low bar of machine readability and ease of scripting.

If everything is a file, then why the fuck is the insufferable abomination of ioctl so crucial to haphazardly papier-mâché and duct tape over all the uncanny gaps and gaping cracks?

Yet another strike against the mindless cargo cult nonsense and unquestioning sycophantic hero worship of TAOUP.

https://web.mit.edu/~simsong/www/ugh.pdf

13 The File System

Sure It Corrupts Your Files, But Look How Fast It Is!

Pretty daring of you to be storing important files on a Unix system. —Robert E. Seastrom

The traditional Unix file system is a grotesque hack that, over the years, has been enshrined as a “standard” by virtue of its widespread use. Indeed, after years of indoctrination and brainwashing, people now accept Unix’s flaws as desired features. It’s like a cancer victim’s immune system enshrining the carcinoma cell as ideal because the body is so good at making them.

Way back in the chapter “Welcome, New User” we started a list of what’s wrong with the Unix file systems. For users, we wrote, the most obvious failing is that the file systems don’t have version numbers and Unix doesn’t have an “undelete” capability—two faults that combine like sodium and water in the hands of most users.

But the real faults of Unix file systems run far deeper than these two missing features. The faults are not faults of execution, but of ideology. With Unix, we often are told that “everything is a file.” Thus, it’s not surprising that many of Unix’s fundamental faults lie with the file system as well.

https://news.ycombinator.com/item?id=26604833

Ask HN: What are the bad parts of Unix?

https://news.ycombinator.com/item?id=26613607

lapsed_lisper on March 28, 2021 | prev | next [–]

[...]

But I also think there's a deeper set of problems in the "genetics" of Unix, in that it supports a "reductive" form of problem solving, but doesn't help at all if you want to build abstractions. Let's say one of the core ideas in Unix is "everything is a file" (i.e., read/write/seek/etc. is the universal interface across devices, files, pipes, etc.). "Everything is a file" insulates a program from some (but not all!) irrelevant details of the mechanics of moving bytes into and out of RAM... by forcing all programs to contend with even more profoundly irrelevant details about how those bytes in RAM should be interpreted as data in the program! While it is sometimes useful to be able to peek or poke at bits in stray spots, most programs implicitly or explicitly traffic in data relevant to that program. While every such datum must be /realized/ as bytes somewhere, operating on some datum's realization /as bytes/ (or by convention, as text) is mostly a place to make mistakes.

Here's an example: consider the question "who uses bash as their login shell?" A classical "Unixy" approach to attacking such a problem is supposed to be to (a) figure out how to get a byte stream containing the information you want, and then (b) figure out how to apply some tools to extract and perhaps transform that stream into the desired stream. So maybe you know that /etc/passwd is one way to get that stream on your system, and you decide to use awk for this problem, and type

awk -F: '$6 ~ "bash$" { print $1 }' /etc/passwd

That's a nicely compact expression! Sadly, it's an incorrect one to apply to /etc/passwd to get the desired answer (at least on my hosts), because the login shell is in the 7th field, not the 6th. Now, this is just a trivial little error, but that's why I like it as an example. Even in the most trivial cases, reducing anything to a byte stream does mean you can apply any general purpose tool to a problem, but it also means that any such usage is going to reinvent the wheel in exact proportion to how directly it's using that byte stream; and that reinvention is a source of needless error.
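For comparison, here is a minimal sketch of the same query with the fields addressed by name instead of by position, so the "6th vs. 7th" slip has nowhere to hide. The helper name `bash_users` and the sample data are made up for illustration; the seven-field, colon-separated layout is the one documented in passwd(5).

```python
# Field names for the colon-separated passwd format, in order (see passwd(5)).
PASSWD_FIELDS = ("name", "passwd", "uid", "gid", "gecos", "home", "shell")


def bash_users(passwd_text):
    """Return the login names whose shell field ends in 'bash'."""
    users = []
    for line in passwd_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        parts = line.split(":")
        if len(parts) != len(PASSWD_FIELDS):
            continue  # skip lines that do not have exactly seven fields
        entry = dict(zip(PASSWD_FIELDS, parts))
        if entry["shell"].endswith("bash"):
            users.append(entry["name"])
    return users


# Hypothetical sample data, not taken from any real system.
SAMPLE = """\
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash
bob:x:1001:1001:Bob:/home/bob:/bin/zsh
"""

if __name__ == "__main__":
    print(bash_users(SAMPLE))  # prints ['root', 'alice']
```

Note that this still hand-parses the byte stream; it only moves the field numbering into one named table, which is exactly the kind of wheel reinvention the comment is describing.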

Of course the sensible thing to do in all but the most contrived cases is to perform your handling of byte-level representations with a dedicated library that provides at least some abstraction over the representation details; even thin and unsafe abstractions like C structs are better than nothing. (Anything less than a library is imperfect: if all you've got is a separate process on a pipe, you've just traded one byte stream problem for another. Maybe the one you get is easier than the one you started with, but still admits the same kinds of incorrect byte interpretation errors.) And so "everything is a file", which was supposed to be a great facility to help put things together, is usually just an utterly irrelevant implementation detail beneath libraries.
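As one possible illustration of the "dedicated library" route, Python's standard `pwd` module (Unix only) exposes password database entries as records with named attributes such as `pw_name` and `pw_shell`. The sketch below assumes that module is available; the function name `users_with_shell` and its optional `entries` parameter are hypothetical, added here only so the logic can be exercised without touching the real database.

```python
import pwd  # standard library; available on Unix-like systems only


def users_with_shell(suffix, entries=None):
    """Return login names whose shell ends with `suffix`.

    `entries` is a hypothetical hook for passing in records directly;
    by default the system password database is queried via pwd.getpwall().
    """
    if entries is None:
        entries = pwd.getpwall()
    # No field numbers or separators appear here: the record type
    # carries the layout, so there is no 6th-vs-7th field to get wrong.
    return [entry.pw_name for entry in entries if entry.pw_shell.endswith(suffix)]


if __name__ == "__main__":
    for name in users_with_shell("bash"):
        print(name)
```

The caller never sees colons or column positions, and never names /etc/passwd at all, which is the sense in which the file becomes an implementation detail beneath the library.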

[...]

https://en.wikipedia.org/wiki/Talk:Everything_is_a_file

Origin of the phrase "Everything is a File"

The writings of the original UNIX people are well-scrutinized yet there is no sign of them having written this phrase ever. I think the phrase is not authentic. The article is self-defeating in that it states "some IPC" is a file, meaning "not everything is a file" as some IPC is not. The phrase is used as a joke in at least one presentation.

"What UNIX Cost Us" - Benno Rice (LCA 2020):

https://www.youtube.com/watch?v=9-IWMbJXoLM#t=24m35s

>UNIX is a hell of a thing. From starting as a skunkworks project in Bell Labs to accidentally dominating the computer industry it's a huge part of the landscape that we work within. The thing is, was it the best thing we could have had? What could have been done better?

>Join me for a bit of meditation on what else existed then, what was gained, what was lost, and what could (and should) be re-learned.

>Complex problems have simple, easy to understand, wrong answers.

>Understand the past, but don't let it bind the future.

2 comments

JackSlateur 880 days ago

You are somewhat right, but you miss some points out of anger.

For instance, your awk example with /etc/passwd : the important thing is not that awk fails to be the perfect way to parse the file's content. The important thing is that you can parse the file with a generic tool like awk.

To me, this is the good idea behind all this : using a standard set of tools, you can process the data.

I cannot find a better alternative, and you failed to provide one too : standardize the files ? Their format, yes, not their content. Create a dedicated tool for each kind of file ? Good idea, people would need to learn hundreds of tools, one for each file.

Because most of those files are stored in ascii and can be handled as regular files, you can use just cat to get the content; you do not need to run a specific tool with an internal binary parser that makes syscalls or grabs data from shared memory or whatever (but you could).

Another thing : a filesystem with an "undelete" capability is a filesystem that does not support data deletion. That would be pretty useless for most use cases, so most filesystems (from the Linux gang to ntfs to ufs to refs to fat) do not implement this.

opan 880 days ago

>So maybe you know that /etc/passwd is one way to get that stream on your system, and you decide to use awk for this problem, and type

>awk -F: '$6 ~ "bash$" { print $1 }' /etc/passwd

I like to apply similar logic to when I use search engines and start broad, narrowing down as needed. If you jump straight to grep or awk you may entirely miss what you were looking for. I frequently search my local IRC logs, and my tool of choice is `less`. I would probably also use that here. You can comfortably get around, all the data is there, and you can search within it. You see adjacent lines which may be relevant to your interests.

You could maybe describe this difference in mindset as programmers vs users. I consider myself a user, but not a programmer. Just as in your example, I thought to look in /etc/passwd for the desired info, but I don't understand why you jumped to that gnarly awk one-liner. Maybe the missing context is you were writing some sort of script/program, but the way you framed it read to me like "I want to get this information on a unix-like machine", and as a human I would just open the file in less. Using cat is another option, but I hate when a file was longer than expected and I have to scroll way up after printing it out. Plus, it's easier to search the text with less than it is in my terminal emulator (though tmux scrollback search helps in a pinch).

I find that the unix approach to things often feels intuitive from a user-perspective where the cranky programmers who insist using python would be better than working with shell have some alternative solution I can't really grok. Maybe it's just people preferring what they're used to, though.