| > However if you were creating UNIX today you'd definitely want to look at OOP and inheritance as a core pillar. I doubt it. Imposing structure and types on IPC would make it very difficult to compose separate programs. Suppose I created UNIX-2, where the only difference between UNIX and UNIX-2 in principle is the fact that UNIX-2 programs all pipe serialized Objects to and from each other, instead of byte streams. Now, the ls program obviously outputs more information than just a list of strings--it also outputs types of files, permissions, owners, inode numbers, sizes, timestamps, etc. I might be inclined to have ls output an lsOutputObject (derived from Object) that encapsulated all this information. Suppose I wanted to pipe the output of ls into wc. How does wc handle an lsOutputObject? Either wc is programmed to know how to handle lsOutputObject, or it is not. Since we want an object-oriented environment, we'll assume the former case, so wc can call the appropriate object-specific methods and access the appropriate object-specific fields. But, now wc is tightly coupled to ls. This problem generalizes. For a given program P in a set of N programs, wc will need to know how to access P's output-object-specific fields and methods. So, wc needs O(N) different subroutines to interact with the N other programs. This does not scale--each additional program I add to UNIX-2 will require me to write O(N) additional IPC handlers--one for each program. The only way to avoid this IPC-handler-explosion in the design is to define the set of IPC objects a priori and mandate all programs know how to handle them. Then, there are O(1) IPC handlers per program, and adding a new program does not require me to couple its implementation to any other programs. This is effectively what UNIX does: there is one IPC object--a string of bytes. 
In UNIX-2 I could have more types of objects, but the fact that they're defined independently of the programs means that I will still be "hoping the receiving process understands" when I give it data from arbitrary programs. Suppose I relax the a priori object mandate above to allow programs to extend the base IPC object types. But then, programs that do so will only compose with programs that implement IPC handlers for their extended object types. In this scenario, I can expect there to be disjoint sets of programs that compose with one another, but not with others. This is effectively what happens in SOA/microservice architectures: you get a set of programs that are composable only with programs that speak their (arbitrarily-structured) messages (the set of which is much smaller than the global set of SOA/microservice programs). My point is, trying to enforce OOP on IPC will take away universal program composability, which is the killer (if not defining) feature of UNIX. |
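To make the coupling argument concrete, here is a minimal Python sketch. Everything in it is hypothetical: `LsOutputObject` is the name used above, while `PsOutputObject`, `DfOutputObject`, `wc_typed`, `wc_bytes` and `as_lines` are invented purely for illustration and do not correspond to any real system. It only shows that a typed consumer needs one handler per producer type, while a byte-stream consumer needs none.

```python
from dataclasses import dataclass


# Hypothetical typed IPC objects; none of these names come from a real system.
@dataclass
class LsOutputObject:
    name: str
    size: int


@dataclass
class PsOutputObject:
    pid: int
    command: str


@dataclass
class DfOutputObject:  # a producer added after wc_typed was written
    mount: str
    free_blocks: int


def wc_typed(objects):
    """Count records, but only for producer types wc was taught about."""
    handlers = {
        LsOutputObject: lambda o: o.name,
        PsOutputObject: lambda o: o.command,
    }
    count = 0
    for obj in objects:
        if type(obj) not in handlers:
            raise TypeError(f"wc has no handler for {type(obj).__name__}")
        handlers[type(obj)](obj)  # touch the producer-specific field
        count += 1
    return count


def wc_bytes(stream: bytes) -> int:
    """Count newline-terminated lines from any producer, like `wc -l`."""
    return stream.count(b"\n")


def as_lines(objects) -> bytes:
    """Every producer flattens its records to one line of text each."""
    return "".join(f"{obj}\n" for obj in objects).encode()


known = [LsOutputObject("notes.txt", 120), PsOutputObject(1, "init")]
newcomer = [DfOutputObject("/", 4096)]

print(wc_typed(known))               # the two types wc was coupled to
print(wc_bytes(as_lines(newcomer)))  # works without touching wc
try:
    wc_typed(newcomer)
except TypeError as err:
    print(err)
```

Adding `DfOutputObject` forces an edit to `wc_typed` (and, by the same logic, to every other typed consumer), whereas `wc_bytes` keeps working unchanged. The price is that `wc_bytes` knows nothing about what the lines mean.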
Or take the output of 'uniq -c': you can then 'sort -g' to order lines by the number of occurrences, but if you want to take the top 5 lines and discard the counts with 'cut', you need whitespace transformations in between. (AWK would be an alternative to cut that performs the whitespace conversions on its own, but AWK is a full-blown programming language, so one may as well have something that deals with dictionaries, arrays, etc., as input types anyway.)
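The whitespace problem can be simulated in a few lines of Python. This is only a sketch under stated assumptions: `uniq_c` imitates the right-aligned, space-padded count column that GNU `uniq -c` prints (the width of 7 is an assumption about that one implementation), and `cut_field` and `awk_field` are hypothetical helpers that imitate how `cut -d' '` and AWK's default field splitting treat runs of spaces.

```python
from collections import Counter


def uniq_c(lines):
    """Imitate `sort | uniq -c`: padded count, one space, then the line."""
    counts = Counter(lines)
    return [f"{n:7d} {text}" for text, n in sorted(counts.items())]


def cut_field(line, field, delim=" "):
    """Imitate `cut -d' ' -fN`: every single delimiter starts a new field."""
    parts = line.split(delim)
    return parts[field - 1] if field <= len(parts) else ""


def awk_field(line, field):
    """Imitate AWK's default splitting: a run of whitespace is one separator."""
    parts = line.split()
    return parts[field - 1] if field <= len(parts) else ""


words = ["pear", "apple", "pear", "fig", "pear", "apple"]
counted = uniq_c(words)

# Order by count, highest first, and keep the top two lines.
top = sorted(counted, key=lambda l: int(l.split()[0]), reverse=True)[:2]

naive = [cut_field(l, 2) for l in top]                     # padding wins
squeezed = [cut_field(" ".join(l.split()), 2) for l in top]  # normalize first
awk_like = [awk_field(l, 2) for l in top]                  # no extra step

print(naive)
print(squeezed)
print(awk_like)
```

The `naive` result is empty strings because the leading padding produces empty fields, which is why a whitespace-squeezing step is needed between the count and `cut`, while the AWK-style split gets the words directly.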
All this is to say that the untyped bytestream relies on conventions in Unix to make the bytestream usable between composable programs. These conventions are adequate for current uses, but show some weaknesses, and suggest that perhaps there are additional conventions that, if sufficiently simple, could be used to build composable programs that don't need to understand a format that is particular to just one program.
I totally agree that objects (meaning data + associated data-specific code) are probably overkill, though optional object interfaces may be nice if the programmer is willing to pay the computational cost, like one does when using AWK over cut. And I definitely think that inheritance is an idea best avoided for data.