Hacker News
by kbenson 3948 days ago
It has its pros and cons, but in the end it's fundamentally different. PowerShell wouldn't work nearly as well on a UNIX system, where much of the tooling is already just text, provided by many different sources, and uses a much more diverse set of utilities. Text makes a good lowest common denominator in that situation.
2 comments

"Text" only makes a good lowest common denominator on UNIX because the designers of UNIX decided that ASCII would be the lowest common denominator of the system. The designers of modern Windows decided that MSIL objects would be their lowest common denominator.

If the UNIX designers had chosen some other lowest common denominator, all of these diverse utilities would be communicating over that protocol instead of ASCII. The choice to use "text" came first and the ecosystem came afterward.

> "Text" only makes a good lowest common denominator on UNIX because the designers of UNIX decided that ASCII would be the lowest common denominator of the system. The designers of modern Windows decided that MSIL objects would be their lowest common denominator.

"Text" makes a good lowest common denominator because it's low. Everything, including humans looking at it, can deal with text. Essentially all programming languages on all platforms provide basic primitives analogous to getline(), find_first_of(), substr(), etc. You can take text from Unix and easily do something useful with it on any arbitrary platform.

MSIL assumes an entire infrastructure that isn't universally available. You can't take MSIL from Windows and easily do something useful with it on any arbitrary platform. It requires you to have a solid .NET virtual machine for your operating system, an API for dealing with MSIL, bindings for that API in every language you want to use, etc.

Users of System/360 computers, which used EBCDIC instead of ASCII, might not agree with you about the inherent portability of ASCII-based text interchange. Not saying .NET objects are as portable, but it's a continuum and a trade-off.

ASCII and EBCDIC are just different ways of encoding the same information. It's like saying comma-delimited files aren't portable because there are also tab-delimited files and some programs support one or the other. It doesn't matter because there are trivial, widely available utilities to convert between them.

There is nothing analogous to convert MSIL to. EBCDIC and ASCII both have a representation for 'A'. Only .NET has "Microsoft.Win32.Registry.CurrentUser.CreateSubKey()".
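To make the "trivial conversion" point concrete, here is a minimal Python sketch. It assumes the `cp037` codec (one common EBCDIC code page) as a stand-in for whatever EBCDIC variant a given mainframe actually used; the function name `ebcdic_to_ascii` is made up for illustration.

```python
# Assumption: "cp037" (EBCDIC, US/Canada) stands in for the mainframe encoding.
text = "HELLO, WORLD"

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")

# The byte values differ between the two encodings...
assert ascii_bytes != ebcdic_bytes


def ebcdic_to_ascii(data: bytes) -> bytes:
    """Hypothetical helper: re-encode EBCDIC (cp037) bytes as ASCII."""
    # ...but converting is just a decode followed by an encode.
    return data.decode("cp037").encode("ascii")


round_tripped = ebcdic_to_ascii(ebcdic_bytes)
print(round_tripped)
```

Both encodings carry the same characters, so the mapping is a table lookup per character. That is the sense in which the comment calls them two encodings of the same information.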

Could you explain the difference a bit, if you don't mind? (I don't know much about Windows.)

In Linux/Unix I can also pipe audio to /dev/audio for example. And image processing through a sequence of steps by pipes is quite often done. Are these objects coming with default metaparameters to make piping easier? Or are there other things implemented that might be cool?

I have never used PowerShell, so I don't know a lot about the details. But PowerShell pipes .NET objects from one command to the next, which are not just bytestreams but are objects, with methods and such. It's fundamentally different.

In UNIX, you have bytes, and the underlying system doesn't know what they represent. Most tools assume they're ASCII or UTF-8 (or whatever) encoded text, but that's all they agree on. In practice, you're almost always working with structured data - tables or lists or mappings - but you have to remember that for the output of this program, you want to cut on '=' and take -f4, and first you need to pipe it through head and tail to clean up some junk output[1], and this other thing for that program, and whatever.

[1] This is a way to get the `time` field from the output of `ping`.
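For anyone who hasn't done this by hand, the footnoted recipe looks roughly like the following when written out in Python. The sample `ping` output is an assumption (the format varies between platforms and versions), and `cut` is a hypothetical helper imitating `cut -d= -f4`.

```python
def cut(line: str, delimiter: str, field: int) -> str:
    """Return the 1-indexed field of a line, in the spirit of `cut -d -f`."""
    return line.split(delimiter)[field - 1]


# Assumed sample output; real ping output differs across systems.
SAMPLE_PING_OUTPUT = """\
PING 192.0.2.1 (192.0.2.1) 56(84) bytes of data.
64 bytes from 192.0.2.1: icmp_seq=1 ttl=64 time=0.045 ms

--- 192.0.2.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.045/0.045/0.045/0.000 ms
"""

lines = SAMPLE_PING_OUTPUT.splitlines()

# Equivalent of `head -n 2 | tail -n 1`: keep only the reply line.
reply_line = lines[1]

# Equivalent of `cut -d= -f4`: the fourth '='-separated field.
time_field = cut(reply_line, "=", 4)
print(time_field)
```

The consumer has to know the line position and the field position in advance. That per-program knowledge is what the comment is complaining about.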

> It's fundamentally different.

That's because the object pipeline is functionally analogous to function chaining within a single runtime like in Python or Ruby, not streaming data between arbitrary executables like in Unix pipelines.

This would be a lot clearer if PowerShell advocates compared it to things that it's actually functionally analogous to, but they are trying to position it as a Unix shell competitor.

> That's because the object pipeline is functionally analogous to function chaining within a single runtime like in Python or Ruby

No it is not. See my response below.

PowerShell's object pipeline only exists within the PowerShell/.NET runtime and is functionally analogous to method/function chaining in languages like JavaScript and Ruby. The object pipeline does not exist outside of the PowerShell/.NET runtime.

Pipelines in Unix shells use OS-level functionality to connect standard streams of data, often encoded as text, between arbitrary executables written in any language.
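As a rough illustration of what "OS-level" means here, the sketch below wires two separate processes together with a pipe from Python. Both child processes happen to be Python interpreters only to keep the example self-contained; they could be any executables. The two inline programs are invented for the example.

```python
import subprocess
import sys

# Producer: an independent process that writes three lines to its stdout.
producer_code = "print('ok'); print('error: disk full'); print('ok')"

# Consumer: another independent process that filters its stdin.
consumer_code = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if 'error' in line:\n"
    "        sys.stdout.write(line)\n"
)

producer = subprocess.Popen(
    [sys.executable, "-c", producer_code],
    stdout=subprocess.PIPE,
)
consumer = subprocess.Popen(
    [sys.executable, "-c", consumer_code],
    stdin=producer.stdout,  # the consumer reads directly from the producer's pipe
    stdout=subprocess.PIPE,
    text=True,
)
# Close our copy of the pipe so the producer sees a broken pipe if the consumer exits early.
producer.stdout.close()

output, _ = consumer.communicate()
producer.wait()
print(output, end="")
```

Only bytes cross the process boundary. Neither side knows or cares what language the other was written in.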

The reason this is confusing is because one of the main tactics used in PowerShell promotion is to disingenuously compare PowerShell's "object pipeline" to Unix pipelines in order to try to make it appear to be more novel than it is, when they should be comparing it to what it's functionally equivalent to, which is function chaining within a single runtime.

> which is function chaining within a single runtime

If PowerShell's pipelines are merely equivalent to function chaining, then you should have no problem replicating the following very simple pipeline with function chaining in Ruby, Python or JavaScript:

    cat log.txt -wait | sls "error"

In case you need an explanation for what it is doing: It continuously monitors the log.txt file for new lines, and selects those that contain the substring "error".
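For readers unfamiliar with the two commands, here is a rough Python sketch of the behaviour just described, built from generators. `follow` and `select_string` are hypothetical names, the `max_idle_polls` parameter is an escape hatch added only so the demo terminates, and the plain substring match is a simplification of what `sls` does. Whether this counts as "function chaining" is exactly what the thread is arguing about.

```python
import os
import tempfile
import time
from typing import Iterable, Iterator, Optional


def follow(path: str, poll_interval: float = 0.5,
           max_idle_polls: Optional[int] = None) -> Iterator[str]:
    """Yield existing lines, then keep yielding lines as they are appended.

    Simplification: a line that has been only partially written may be
    yielded in pieces. With max_idle_polls=None this never returns.
    """
    idle_polls = 0
    with open(path, "r") as handle:
        while True:
            line = handle.readline()
            if line:
                idle_polls = 0
                yield line.rstrip("\n")
                continue
            idle_polls += 1
            if max_idle_polls is not None and idle_polls >= max_idle_polls:
                return
            time.sleep(poll_interval)


def select_string(lines: Iterable[str], pattern: str) -> Iterator[str]:
    """Lazily keep only the lines containing the substring."""
    return (line for line in lines if pattern in line)


# Demo on a temporary file so the sketch terminates on its own.
with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = os.path.join(tmp_dir, "log.txt")
    with open(log_path, "w") as log_file:
        log_file.write("started\nerror: disk full\nstopped\n")
    hits = list(select_string(
        follow(log_path, poll_interval=0.01, max_idle_polls=2), "error"))

print(hits)
```

Without `max_idle_polls`, iterating over `select_string(follow("log.txt"), "error")` would block and print matches as they arrive, since each stage hands over one line at a time.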

But I am curious: Why is it important whether the pipelines could be implemented using function chaining or not? PowerShell is a shell where I interact with the system through commands. Yes - those commands are not the typical Unix executables - but they act as commands within the shell.

Is your problem that it is not a Unix shell? You could equally well say that Unix shell pipelines are just processes connected by file descriptors. That is factually correct but does not represent the true utility of pipelines.

PowerShell pipelines are not file descriptors. PowerShell commands do not execute in separate processes. But PowerShell commands are versatile and can be combined in a way analogous to Unix shell pipelines where the output of one command is consumed and acted upon by the next command.

The object of an operating system command line shell is to expose the operating system features to the user of the shell. Why does it matter whether it uses Unix file descriptors? Even if PowerShell pipelines were equivalent to "just" chained functions, why does it matter?

I get the point that you need the .NET runtime to run PowerShell, because it is implemented using .NET. At what point do we consider a runtime part of the operating system? When it is intrinsically distributed with the operating system and cannot be uninstalled?

To underscore how the idea of a UNIX shell working on file descriptors between executables isn't really important, it's worth noting that bash allows piping between its built-in commands/keywords. Not only can you pipe the aggregate output from a for, if or while construct, you can pipe the output of bash builtins such as bg, fg and disown.

"objects" means data structures. It means you can have some structure with fields, or list or array or whatever as the output of one command, and input of the next one, so the next one doesn't have to do character-level parsing.

Of course, this kind of piping is present in nice languages that precede PowerShell.

A one-page Lisp macro will give you a left-to-right syntactic sugar for filtering. Clojure has a threading operator, Ruby has cascades of dots: object.{ blah }.foo().bar() ... and so on.
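A small Python sketch of the "no character-level parsing" idea: each stage receives records with named fields rather than lines of text. The `ProcessInfo` type, the stage names, and the sample data are all invented for illustration.

```python
from typing import Iterable, Iterator, List, NamedTuple


class ProcessInfo(NamedTuple):
    """Hypothetical record type passed between stages."""
    name: str
    pid: int
    memory_mb: float


def list_processes() -> Iterator[ProcessInfo]:
    # Hard-coded sample data standing in for a real process listing.
    yield ProcessInfo("init", 1, 4.0)
    yield ProcessInfo("browser", 4242, 812.5)
    yield ProcessInfo("editor", 5150, 230.0)


def where_memory_above(records: Iterable[ProcessInfo],
                       threshold_mb: float) -> Iterator[ProcessInfo]:
    # Filters on a named field; no splitting on whitespace or '='.
    return (record for record in records if record.memory_mb > threshold_mb)


def names(records: Iterable[ProcessInfo]) -> List[str]:
    return [record.name for record in records]


heavy = names(where_memory_above(list_processes(), 100.0))
print(heavy)
```

The downstream stage refers to `memory_mb` by name, so it keeps working if the producer adds or reorders fields.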

This is not what "objects" means (objects have methods), and .NET objects are a much higher level abstraction than structured data.

man locale(1)

Non-ASCII has been supported for a long time.

This is obviously not what I was talking about. I was talking about data representations heterogeneous to the idea of "text".

I used "ASCII" instead of "text" to stress that "text" is just a protocol like anything else. The UNIX I'm running is set to use UTF-8.

I don't think it is fundamentally different. When I think of a shell I think of pipes and interactivity. What you pass through the pipe is secondary. The fundamental nature of the shell is those two things and everything else is just sugar on top. PowerShell has that and more so I'd say it is not fundamentally different and in fact it is objectively better.

The word "shell" in this context already means something. A text shell is a command line interpreter (https://en.wikipedia.org/wiki/Shell_(computing)#Text_.28CLI....). There are many shells that do not use the pipeline convention (e.g., irb), so your attempt to redefine it to promote PowerShell fails from the start.

PowerShell's object pipeline is functionally analogous to function chaining in languages like Ruby, Python, and JavaScript, not Unix pipelines. PowerShell's object pipeline does not exist outside of the .NET runtime and you can not stream objects between arbitrary executables outside of the .NET runtime.

You can't compare an apple to an orange and say one is "objectively better," no matter how hard PowerShell advertisers try to pretend their apple is an orange.

> PowerShell's object pipeline is functionally analogous to function chaining in languages like Ruby, Python, and JavaScript

No, in those languages function results are bound upon returning from the function. PowerShell's pipeline would be more like function chaining where each function is a co-routine that can yield partial results, which would make it considerably more complicated.
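The distinction being drawn here can be seen in a few lines of Python. The sketch compares ordinary functions that return complete lists with generators that yield partial results, and records the order in which the stages run. All names are made up for the example. Note that Python generators are pull-based, whereas the pipeline described in this thread pushes objects downstream, so this only illustrates the interleaving.

```python
from typing import Iterable, Iterator, List


def produce_eager(log: List[str]) -> List[int]:
    result = []
    for i in range(3):
        log.append(f"produce {i}")
        result.append(i)
    return result  # nothing is visible to the caller until here


def double_eager(items: Iterable[int], log: List[str]) -> List[int]:
    result = []
    for item in items:
        log.append(f"double {item}")
        result.append(item * 2)
    return result


def produce_lazy(log: List[str]) -> Iterator[int]:
    for i in range(3):
        log.append(f"produce {i}")
        yield i  # hands over one partial result at a time


def double_lazy(items: Iterable[int], log: List[str]) -> Iterator[int]:
    for item in items:
        log.append(f"double {item}")
        yield item * 2


eager_log: List[str] = []
eager_result = double_eager(produce_eager(eager_log), eager_log)

lazy_log: List[str] = []
lazy_result = list(double_lazy(produce_lazy(lazy_log), lazy_log))

print(eager_log)
print(lazy_log)
```

Both chains compute the same values. The eager version finishes the first stage before the second starts, while the generator version alternates between the two stages item by item.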

So your attempt at trivializing PowerShell fails from the start.

For your information, here is roughly how a PowerShell pipeline is set up, processed and torn down:

1. BeginProcess is invoked for each command in the pipeline, starting with the last command of the pipeline and working towards the first. Once a command is initialized this way, it can start receiving objects from the preceding command on the pipeline. Each command may start producing objects already during this phase, as all subsequent commands are ready to process them. If it produces objects, ProcessRecord of the subsequent command is invoked immediately.

2. ProcessRecord is invoked on the first command of the pipeline. If it yields objects, ProcessRecord of the subsequent command is invoked for each object, thus "pushing" objects through the pipeline.

3. EndProcess is invoked for the first command of the pipeline. If it yields objects, ProcessRecord of the subsequent command is invoked for each object. When EndProcess completes for the first command of the pipeline, EndProcess is invoked for the subsequent command and so forth until EndProcess has been invoked for all commands of the pipeline.

As you can see, each cmdlet has 3 phases: initialization, processing and completion. All cmdlets (functions?) become active at the same time, with the control flow passing to each command (function?) multiple times during each phase. This can be implemented in any Turing-complete language, but function chaining is a lame attempt at trivializing it. Throw in reactive sequences and it may get closer.

Consider how Sort-Object (alias sort) is implemented: It does nothing during BeginProcess, during ProcessRecord it adds to an internal collection but does not yield any objects (yet), during EndProcess it sorts and yields all of the collected records. This is clearly analogous to how sort on a byte/text stream is implemented: When the process starts it does nothing but initialize and start listening on stdin, for each "line" on stdin it adds to an internal collection, and when stdin is closed it sorts the collection and writes the "lines" to stdout. 3 phases.
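The three phases and the sort example can be modelled in a short Python sketch. This is a toy model of the description above, not PowerShell's actual implementation: the class names, `run_pipeline`, and the method names `begin`, `process_record`, and `end` are all invented here (the corresponding .NET cmdlet methods are named BeginProcessing, ProcessRecord, and EndProcessing).

```python
from typing import Any, Callable, Iterable, List, Optional


class Command:
    """Hypothetical base class with initialization, processing, completion."""

    def __init__(self) -> None:
        self.downstream: Optional["Command"] = None

    def emit(self, obj: Any) -> None:
        # Push an object straight into the next command's processing phase.
        if self.downstream is not None:
            self.downstream.process_record(obj)

    def begin(self) -> None:
        pass

    def process_record(self, obj: Any) -> None:
        pass

    def end(self) -> None:
        pass


class Source(Command):
    """First command: produces objects when its processing phase runs."""

    def __init__(self, items: Iterable[Any]) -> None:
        super().__init__()
        self.items = list(items)

    def process_record(self, obj: Any = None) -> None:
        for item in self.items:
            self.emit(item)


class Where(Command):
    def __init__(self, predicate: Callable[[Any], bool]) -> None:
        super().__init__()
        self.predicate = predicate

    def process_record(self, obj: Any) -> None:
        if self.predicate(obj):
            self.emit(obj)  # passes matching objects on immediately


class Sort(Command):
    def begin(self) -> None:
        self.buffer: List[Any] = []

    def process_record(self, obj: Any) -> None:
        self.buffer.append(obj)  # collects, yields nothing yet

    def end(self) -> None:
        for obj in sorted(self.buffer):
            self.emit(obj)  # yields everything during completion


class Collect(Command):
    def __init__(self) -> None:
        super().__init__()
        self.results: List[Any] = []

    def process_record(self, obj: Any) -> None:
        self.results.append(obj)


def run_pipeline(commands: List[Command]) -> None:
    for upstream, downstream in zip(commands, commands[1:]):
        upstream.downstream = downstream
    for command in reversed(commands):  # initialize from last to first
        command.begin()
    commands[0].process_record(None)    # the first command drives processing
    for command in commands:            # complete from first to last
        command.end()


sink = Collect()
run_pipeline([Source([3, 1, 2, 5, 4]), Where(lambda n: n != 5), Sort(), sink])
print(sink.results)
```

`Where` forwards each object as soon as it arrives, while `Sort` has to wait for the completion phase. That is the same constraint a text-based sort faces when it waits for stdin to close.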

Even so, it doesn't really matter how it is implemented, a shell is inherently about how you interact with the shell.

> PowerShell's object pipeline does not exist outside of the .NET runtime and you can not stream objects between arbitrary executables outside of the .NET runtime.

You are locked in the Unix view of the world and are unable to consider that long-standing assumptions about "the way to do it" may need a refresh. If you want to leverage objects for system management then yes, you need a system-wide object model. Unix was not built with an object model. Windows was sort of built with an object model (handles) but it is not versatile enough as a system-wide management model.

However, the good thing about objects is that they are abstractions which can easily wrap existing operating system concepts, such as files, processes, access control lists, drivers, users, groups, network connections etc. .NET is a mature object model with a lot of tooling. Building PowerShell so that the abstractions over network adapters, users etc. are represented by .NET classes enables a lot of tooling and infrastructure right from the beginning.

useerup, virtually everything in your comment is irrelevant handwaving and obfuscation, including a bunch of irrelevant comments about features that are internal to the PowerShell runtime.

No matter how much you want to try to talk around it, PowerShell is a .NET CLI, the object pipeline exists within its .NET runtime, and the commands are instances of .NET classes. Unix shells work primarily with arbitrary executables outside of the shell runtime using OS system calls to create pipelines with standard streams.

Here's the cold hard fact stated yet again: PowerShell's object pipeline is functionally analogous to function chaining in languages like Ruby, Python, and JavaScript, not Unix pipelines.

> Here's the cold hard fact stated yet again: PowerShell's object pipeline is functionally analogous to function chaining in languages like Ruby, Python, and JavaScript, not Unix pipelines.

Ok, what then is the functionally analogous function chaining of Ruby, Python or JavaScript to the following:

    cat log.txt -wait | sls "error"

A simple function chain will do.