by aidos
4034 days ago
Your comment was useful, so not totally in vain :-) It just incited me to have a little look at PowerShell (read through [0], a useful intro). It looks nice; I can definitely see the utility in having a simple object model for transferring information between processes. In nix land you get pretty good at extracting data from simple text forms, though sometimes it's harder than it should be. One thing that jumped out at me there is the overhead of the commands:
430ms: ls | where {$_.Name -like "*.exe"}
140ms: ls *.exe
27ms: ls -Filter "*.exe"
Not so much the absolute numbers but the fact that there are 3 different ways of doing it and the more flexible choice is over an order of magnitude slower. What happens when you add another command to the pipeline? Do they buffer the streams like in Linux? I guess the situation will improve over time, but how complete is the ecosystem at the moment? One area where nixes will always shine is total ubiquity: everything can be done over commands and everything works with text.
[0] https://developer.rackspace.com/blog/powershell-101-from-a-l...
|
It's not uncommon for the most flexible option to be the slowest, though. In my own tests my results were 18 ms, 115 ms and 140 ms for doing those commands in $Env:Windir\system32, so the difference wasn't as big as in your case. For a quick command on the command line I feel performance is adequate in either case, unless you're doing things with very large directories. If you handle a large volume of data, regardless of whether it's files, lines, or other objects, you probably want to filter as much as you can as close to the source as you can – generally speaking.
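If you want to reproduce the comparison yourself, something like the following should do. It is only a rough sketch: Measure-ExeListing is a made-up helper name, the default path is an assumption, and the numbers will vary a lot with the directory, caching and PowerShell version.

```powershell
# Sketch: time the three listing styles from the thread against one directory.
# Measure-ExeListing is a hypothetical helper, not a built-in cmdlet.
function Measure-ExeListing {
    param([string]$Path = (Get-Location).Path)

    $pattern = '*.exe'
    [ordered]@{
        # Most flexible: list everything, then filter objects in the pipeline.
        'Where-Object'  = (Measure-Command {
            Get-ChildItem -Path $Path | Where-Object { $_.Name -like $pattern }
        }).TotalMilliseconds
        # Wildcard in the path, expanded by PowerShell.
        'Wildcard path' = (Measure-Command {
            Get-ChildItem -Path (Join-Path $Path $pattern)
        }).TotalMilliseconds
        # Filter handed to the provider, i.e. as close to the source as possible.
        '-Filter'       = (Measure-Command {
            Get-ChildItem -Path $Path -Filter $pattern
        }).TotalMilliseconds
    }
}

# On Windows, e.g.: Measure-ExeListing -Path "$Env:Windir\system32"
Measure-ExeListing
```

Run it a few times; the first run often pays for a cold file system cache, so single measurements are not very meaningful.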
As for buffering ... I'm not aware of any, unless the cmdlet needs to have the complete output of the previous one to do its work. Every result from a pipeline is passed individually from one cmdlet to the next by default. Some cmdlets do buffer, though, e.g. Get-Content has a -ReadCount parameter that controls buffering in the cmdlet (man gc -param readcount). Sort-Object and Group-Object are the most common (for me at least) that always need the complete output of the preceding stage before returning anything, for quite obvious reasons.
However, even though I did some work on Pash, the open-source reimplementation of PowerShell, I'm not terribly well-versed in its internal workings, so take the buffering part with a grain of salt.
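A small experiment makes the difference between streaming and buffering stages visible. This is just a sketch with made-up variable names ($streamLog, $sortLog); it records the order in which items are produced and consumed, once with a plain pipeline and once with Sort-Object in between.

```powershell
# Streaming: each object is handed to the next stage as soon as it is produced.
$streamLog = New-Object 'System.Collections.Generic.List[string]'
1..3 |
    ForEach-Object { $streamLog.Add("produce $_"); $_ } |
    ForEach-Object { $streamLog.Add("consume $_") }

# Buffering: Sort-Object has to see all input before it can emit anything.
$sortLog = New-Object 'System.Collections.Generic.List[string]'
1..3 |
    ForEach-Object { $sortLog.Add("produce $_"); $_ } |
    Sort-Object -Descending |
    ForEach-Object { $sortLog.Add("consume $_") }

$streamLog -join ', '
# produce 1, consume 1, produce 2, consume 2, produce 3, consume 3
$sortLog -join ', '
# produce 1, produce 2, produce 3, consume 3, consume 2, consume 1
```

In the first pipeline production and consumption interleave; in the second, nothing is consumed until the source is exhausted.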
As for completeness, well, the Unix ecosystem has an enormous edge here, simply by having been there for decades and amassing tools and utilities. Since PowerShell was intended for system administrators you can expect nearly everything needed there to have PowerShell-native support. This includes files, processes, services, event logs, Active Directory, and various other things I know little to nothing about. Get-Command -Verb Get gives you a list of things that are supported directly that way. It seems like even configuration of things like network and disks is supported by now. At Microsoft there's a rule, I think, that every new configuration GUI in Windows Server has to be built on PowerShell. Which means, everything you can do in the GUI, you can do in PowerShell, and I think you can in some cases even access the script to do the changes you just made in the GUI – e.g. for doing the same change on a few hundred machines at once, or whatever.
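To get a feeling for the coverage on a given machine, you can group the Get-* cmdlets by the module they come from. A quick sketch ($byModule is just an illustrative name); what you see depends entirely on the installed modules and the PowerShell version.

```powershell
# Count the Get-* cmdlets per module to see which areas have native support.
$byModule = Get-Command -Verb Get -CommandType Cmdlet |
    Group-Object -Property ModuleName |
    Sort-Object -Property Count -Descending |
    Select-Object -Property Count, Name

$byModule
```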
Of course, you can just work with any collection of .NET objects by virtue of the common cmdlets working with objects (gcm -noun object). For me, whenever there is no native support, .NET is often a good escape hatch, that in many cases isn't terribly inconvenient to use. You also have more control over what exactly happens at that level, because you're one abstraction level lower. As a last resort, it's still a shell. It can run any program and get its output. Output from native programs is returned as string[], line by line, and in many cases that's not worse than with cmd or any Unix shell.
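As an illustration of the .NET escape hatch: the generic *-Object cmdlets don't care where the objects come from, so the result of a plain .NET call can be sorted and trimmed like any cmdlet output. A minimal sketch, with $top as an arbitrary variable name:

```powershell
# List the generic cmdlets that work on arbitrary objects.
Get-Command -Noun Object | Select-Object -Property Name

# Call .NET directly and feed the resulting objects into those cmdlets:
# the three processes with the largest working set.
$top = [System.Diagnostics.Process]::GetProcesses() |
    Sort-Object -Property WorkingSet64 -Descending |
    Select-Object -First 3 -Property ProcessName, WorkingSet64

$top
```

Of course Get-Process would give you the same data here; the point is only that the pipeline works identically for objects that have no cmdlet of their own.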
_____
¹ Keep in mind, the file system is just one provider and there are others, e.g. registry, cert store, functions, variables, aliases, environment variables that work with exactly the same commands. That's why ls is an alias for Get-ChildItem and there is no Get-File, because those commands are agnostic of the underlying provider.
² So much for do one thing – but understandable, because ls' output is not rich enough to filter for certain things further down in the pipeline.
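To illustrate footnote ¹: the same Get-ChildItem call works against several providers. A short sketch using only the providers that exist on every platform (registry and cert store are Windows-only); the variable names are just for illustration. Get-ChildItem is spelled out because ls is only an alias for it on Windows.

```powershell
# Show which providers and drives are available in this session.
Get-PSProvider | Select-Object -Property Name, Drives

# The same command lists items from different providers.
$envItems = Get-ChildItem -Path Env:      | Select-Object -First 3
$varItems = Get-ChildItem -Path Variable: | Select-Object -First 3
$fnItems  = Get-ChildItem -Path Function: | Select-Object -First 3

$envItems
$varItems
$fnItems
```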