Hacker News
by gumby 3520 days ago
The best thing about POSIX at this point is backward compatibility, which is nothing to sneer at!

But it carries a lot of (in retrospect) bad habits and decisions from the 60s and 70s, as well as a tendency toward redundancy, because competing standards were unified and some backward compatibility had to be kept.

Now, not everyone agrees on what is good and what is bad, so some experimentation in this area is good for everyone. Examples of what bothers me include the ludicrousness of ioctl(), messed-up and redundant semaphore semantics, ditto for IPC, primitive memory management semantics, fork() (a great hack for its time, but since it's 99.9999% of the time followed by exec() it should be split into separate address space and thread management), outdated and simultaneously simplistic and baroque security model(s), and various IO issues too many to go into in an HN comment.

But my loathed feature is undoubtedly someone else's sacred cow. As I said, letting more flowers bloom is in everybody's interest.

3 comments

I think the main issues with POSIX are:

- (Correct) IO is ridiculously non-portable and painful in so many ways that it isn't even funny anymore

- Locks are ridiculously non-portable and painful to the point where you're better off just using "mkdir"

- POSIX is stuck in the "everything is bytes and we slap an encoding on it some of the time" era of thinking. This makes proper text handling painful and hard to implement in many instances, and it leads to a lot of bad behaviours.

- Memory management is IMHO lacking from a user space perspective. For example, it's practically impossible to implement a cooperative memory cache on this. To the best of my knowledge no OS has the necessary interfaces, though.

- ioctl as you mentioned

- SysV/POSIX IPC is so bad that no one ever bothered actually using it for anything

- Personally I think it's a misleading API, almost to the point of deceptiveness (conceptually; see the text handling point above for an example). It's very easy to write correct-looking programs that behave far from what was intended, especially in edge cases. IMHO code using it is practically unreviewable in all but the most trivial cases. Non-portability is practically guaranteed; you have to test every platform. Portable code usually turns out quite ugly due to platform deficiencies and minor API incompatibilities.
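
For readers who haven't seen the "mkdir" trick from the locks point above, here is a minimal sketch in Python. It relies on mkdir either creating the directory or failing because it already exists, so only one caller can win. The names `mkdir_lock` and `demo` are made up for illustration, and the sketch assumes a local file system; it does not wait, retry, or clean up stale locks left by a crashed process.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def mkdir_lock(path):
    """Hold a lock for the duration of the with-block by creating a directory.

    Hypothetical helper: mkdir either creates the directory or fails with
    EEXIST, so at most one caller holds the lock at a time.
    """
    os.mkdir(path)  # raises FileExistsError if someone else holds the lock
    try:
        yield
    finally:
        os.rmdir(path)  # release the lock


def demo():
    lock_path = os.path.join(tempfile.mkdtemp(), "demo.lock")
    with mkdir_lock(lock_path):
        try:
            # A second attempt while the lock is held must fail.
            with mkdir_lock(lock_path):
                second_acquired = True
        except FileExistsError:
            second_acquired = False
    released = not os.path.exists(lock_path)
    return second_acquired, released
```

A real implementation would also need a policy for contention (sleep and retry, or give up) and for locks orphaned by a crash, which is part of why this counts as a workaround rather than a proper locking API.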

> POSIX is stuck in the "everything is bytes and we slap an encoding on it some of the time" era thinking.

I actually view this as a feature. Encoding/decoding of data should be an application level thing, not an OS-level thing. As far as the OS is concerned, data should be bytes.

(Of course, it is true that, since POSIX defines a terminal spec, it has to at least specify how bytes are mapped to characters that print on the terminal. But I would rather see that removed altogether, so a terminal becomes just another application, than have an OS try to muck about with encodings.)

Applications have for the most part proven that they cannot be trusted to get text encoding and decoding right, especially not in any consistent way. Operating systems definitely should make it possible to deal with the raw byte streams, but the default and preferred method of text handling should be a standard higher-level interface.

> Applications have for the most part proven that they cannot be trusted to get text encoding and decoding right, especially not in any consistent way.

That's because text encoding and decoding is a mess. Operating systems doing it doesn't make it any less of a mess; it just inserts the mess deeper into everything. For example, look at all the quirks and edge cases in file name handling between different OSes, simply because nobody is willing to just admit that, to the OS, file names should be sequences of bytes, which are easy to share between machines running different OSes.
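
To make the "file names are bytes" view concrete, here is a small Python sketch. It assumes a Linux-style file system that stores names as opaque bytes (some other systems reject names that are not valid UTF-8), and `list_raw_names` is just an illustrative name.

```python
import os
import tempfile


def list_raw_names():
    # Assumption: the file system accepts arbitrary bytes in a name
    # (typical on Linux; not guaranteed elsewhere).
    raw_name = b"caf\xe9.txt"  # "café.txt" in Latin-1; not valid UTF-8
    directory = os.fsencode(tempfile.mkdtemp())
    with open(os.path.join(directory, raw_name), "wb"):
        pass  # create an empty file with that exact byte name
    # A bytes path in means bytes names out, with no decoding step.
    return raw_name, os.listdir(directory)
```

Passing a bytes path to `os.listdir` gets the names back exactly as stored. Asking for `str` names instead makes Python decode them, and it has to smuggle undecodable bytes through with the surrogateescape error handler, which is the kind of application-level decision being argued about here.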

The basic issue is that text encoding and decoding exists because bytes have meanings. But unless/until we invent artificial intelligence, computers can't deal with meanings (because the meanings are not simple computable functions of the bytes). And OSes, particularly, should not even try. Applications might have to try, but the cost if they get it wrong is much less.

Regardless of whether operating systems get involved in tasks like re-encoding text, they really should at least carry along the metadata about encodings whenever they're handling bytes that represent strings. Completely ignoring the problem and leaving it up to applications further up the stack just ensures that there will be incompatible competing standards for how to tell applications how to decode the string data they get from the OS. You don't want some apps trying to write filenames in UTF-8 while others use UTF-16, but allowing it to happen silently is even worse.

> Completely ignoring the problem and leaving it up to applications further up the stack just ensures that there will be incompatible competing standards for how to tell applications how to decode the string data they get from the OS.

I think it's naive to think operating systems aren't going to fragment in order to offer "features" (and lock-in), and then papering over all that fragmentation has to happen in the application anyway.

Unless there's a standard, and if there's a standard the application itself can deal with it.

> Regardless of whether operating systems get involved in tasks like re-encoding text, they really should at least carry along the metadata about encodings whenever they're handling bytes that represent strings.

I have no problem with this as long as the metadata itself is just additional bytes. But if the metadata needs to be decoded in order to figure out how to decode it, we have a problem... :-)

> As far as the OS is concerned, data should be bytes.

If the OS knows the type of those bytes, it can do things like implement global garbage collection, intelligent caching, intelligent snapshotting &c. It can also enforce invariants across all user code.

To me this means you have an application, not an OS--or perhaps an application that also happens to be an OS. (Emacs comes to mind...)

Well, I think that OSes could do a lot more (and kernels a lot less … but that's a different story). Why _shouldn't_ an operating _system_ do an awful lot to ensure user safety, resource utilisation &c.?

> have an OS try to muck about with encodings

Hardware encode/decode often requires DMA capabilities. There are many optimizations that kernel mode can bring.

x86 has string instructions that do not require DMA. Nobody needs to hardware accelerate string decoding and encoding though...

> Hardware encode/decode

Of text?

but is it webscale?

I'll also disagree on the "everything is bytes and we slap an encoding on it some of the time" point, but I'd add to the list:

- User level security, instead of app level or task level, or whatever.

- Aged IPC primitives that assume so much that modern hardware has to be built around them (and is slower because of that).

- Added later, non-core network support, leading to bad integration.

- Added later, non-core, or sometimes never added encryption support, leading to bad integration.

Well, I'm not going to disagree except to say that there are posix_spawn and vfork. I don't think anyone thinks the IPC solutions on offer are great but perhaps we should lower our expectations on that.
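
As a rough illustration of the posix_spawn route, here is a minimal sketch using Python's `os.posix_spawn` wrapper (available on Unix in Python 3.8 and later). The helper name `spawn_and_wait` is made up for this example, and it assumes `argv[0]` is a full path to the executable.

```python
import os
import sys


def spawn_and_wait(argv):
    """Start argv[0] as a new process and return its exit status.

    Hypothetical helper: one posix_spawn call takes the place of the usual
    fork() followed immediately by exec().
    """
    pid = os.posix_spawn(argv[0], argv, dict(os.environ))
    _, status = os.waitpid(pid, 0)  # block until the child exits
    return os.WEXITSTATUS(status)   # assumes the child exited normally
```

For example, `spawn_and_wait([sys.executable, "-c", "raise SystemExit(7)"])` runs a child interpreter and hands back its exit status. The sketch ignores children killed by a signal and the optional file actions and attributes that posix_spawn supports.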
Is there any interest in developing a newer, better standard or is this something we're going to be stuck with forever?