Hacker News
by jiggawatts 641 days ago
Every time I see a “good” bash script it reminds me of how incredibly primitive every shell is other than PowerShell.

Validating parameters - a built in declarative feature! E.g.: ValidateNotNullOrEmpty.

Showing progress — also built in, and doesn’t pollute the output stream so you can process returned text AND see progress at the same time. (Write-Progress)

Error handling — Try { } Catch { } Finally { } works just like with proper programming languages.

Platform specific — PowerShell doesn’t rely on a huge collection of non-standard CLI tools for essential functionality. It has built-in portable commands for sorting, filtering, format conversions, and many more. Works the same on Linux and Windows.

Etc…

PS: Another super power that bash users aren’t even aware they’re missing out on is that PowerShell can be embedded into a process as a library (not an external process!!) and used to build an entire GUI that just wraps the CLI commands. This works because the inputs and outputs are strongly typed objects so you can bind UI controls to them trivially. It can also define custom virtual file systems with arbitrary capabilities so you can bind tree navigation controls to your services or whatever. You can “cd” into IIS, Exchange, and SQL and navigate them like they’re a drive. Try that with bash!

8 comments

I also hate bash scripting, and as far as Unix shells go, bash is among the best. So many footguns... Dealing with filenames with spaces is a pain, and files that start with a '-', "rm -rf" in a script is a disaster waiting to happen unless you triple check everything (empty strings, are you in the correct directory, etc...), globs that don't match anything, etc...
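
To make those footguns concrete, here is a minimal bash sketch of the usual defensive habits. The function name clean_tmp and the *.tmp pattern are made up for illustration; it is one possible way to guard against the problems listed above, not a complete recipe.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete every *.tmp file inside a scratch directory.
clean_tmp() {
  local dir=${1-}

  # Refuse an empty or missing directory, so the glob below can never
  # silently become "/*.tmp".
  if [[ -z "$dir" || ! -d "$dir" ]]; then
    echo "clean_tmp: bad directory: '$dir'" >&2
    return 1
  fi

  local f
  for f in "$dir"/*.tmp; do
    # A glob that matches nothing stays as the literal pattern; skip it.
    [[ -e "$f" || -L "$f" ]] || continue
    # Quotes keep names with spaces in one piece; "--" ends option parsing.
    rm -f -- "$f"
  done
}
```

Called as clean_tmp "$scratch_dir", it removes matching files even when their names contain spaces, and does nothing at all when the argument is empty or nothing matches.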

But interactively, I much prefer Unix shells over PowerShell. When you don't have edge cases and user input validation to deal with, these quirks become much more manageable. Maybe I am lacking experience, but I find PowerShell uncomfortable to use, and I don't know if it has all these fancy interactive features many Unix shells have nowadays.

What you are saying essentially is that PowerShell is a better programming language than bash, quite a low bar actually. But then you have to compare it to real programming languages, like Perl or Python.

Perl has many shell-like features, the best regex support of any language, which is useful when everything is text, many powerful features, and an extensive ecosystem.

Python is less shell-like but is one of the most popular languages today, with a huge ecosystem, clean code, and pretty good two-way integration, which means you can not only run Python from your executable, but Python can call it back.

If what you are after is portability and built-in commands, then the competition is Busybox, a ~1MB self-contained executable providing the most common Unix commands and a shell, very popular for embedded systems.

> What you are saying essentially is that PowerShell is a better programming language than bash

In some sense, yes, but there is no distinct boundary. Or at least, there ought not to be one!

A criticism a lot of people (including me) had of Windows in the NT4 and 2000 days was that there was an enormous gap between click-ops and heavyweight automation using C++ and COM objects (or even VBScript or VB6 for that matter). There wasn't an interactive shell that smoothly bridged these worlds.

That's why many Linux users just assumed that Windows has no automation capability at all: They started with click-ops, never got past the gaping chasm, and just weren't aware that there was anything on the other side. There was, it just wasn't discoverable unless you were already an experienced developer.

PowerShell bridges that gap, extending quite a bit in both directions.

For example, I can use C# to write a PowerShell module that has the full power of a "proper" programming language, IDE with debug, etc... but still inherits the PS pipeline scaffolding so I don't have to reinvent the wheel for parameter parsing, tab-complete, output formatting, etc...

Windows still has horrendous automation support. PowerShell falls short and loses its USP as soon as you need anything that is not a builtin, and a series of band-aids like DSC didn't even ameliorate the situation. The UX is bad even when working with nothing but MS products like MSSQL.

The biggest leap for automation on Windows has been WSL, aka shipping Linux.

> Dealing with filenames with spaces is a pain, and files that start with a '-',

Wait! The fact that arguments with a leading hyphen are interpreted as options is not bash's fault. It's ingrained in the convention of UNIX tools and there's nothing bash can do to mitigate it. You would have the same problem if you got rid of any shell and directly invoked commands from Python or C.

Indeed, it is not the fault of bash but of the Unix command line in general. Made worse by the fact that different tools may have different conventions. Often, "--" will save you, but not always. And by the way, it took me years to become aware of "--", which is exactly the reason why I hate shell scripting: a non-obvious problem, with a non-obvious solution that doesn't always work.
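
A small sketch of what "--" does and where it stops helping, assuming GNU or BSD touch and rm and bash's builtin echo. The function name demo_dash_file is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical demo: create and remove a file literally named "-rf"
# inside a throwaway directory.
demo_dash_file() {
  local dir
  dir=$(mktemp -d) || return 1
  (
    cd "$dir" || exit 1
    touch -- -rf        # "--" ends option parsing, so -rf is a filename
    rm -- -rf           # same trick for rm
    touch ./-rf         # alternative: a path prefix hides the leading dash
    rm ./-rf
  )
  local status=$?
  rmdir "$dir" && return "$status"
}

# bash's builtin echo is one of the exceptions: it does not treat "--"
# as an end-of-options marker and prints it literally.
echo -- -n              # prints: -- -n
printf '%s\n' -n        # prints: -n
```

The last two lines are the "but not always" part: with echo the marker ends up in the output, so printf with an explicit format string is the safer way to print arbitrary text.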

One of GP's arguments in favor of PowerShell is that most commands are builtin, so this problem can be solved by the shell itself, and furthermore, it is based on strongly typed objects, which should make it clear what is a file and what is a command line option. And I think he has a point. Regular command line parsing is a mess on Windows though.

In "real" programming languages, library APIs are usually favored over command lines, and they are usually designed in such a way that options and file arguments are distinct. You may still need to run commands at some point, but you are not reliant on them for every detail, which, in traditional shell scripting includes trivial things like "echo", "true", "false", "test", etc... Now usually builtin.

As for bash "doing something about it", it would greatly benefit from a linter. I know they exist, but I don't know if it is standard practice to use them.

> Wait! The fact that arguments with a leading hyphen are interpreted as options is not bash's fault. It's ingrained in the convention of UNIX tools and there's nothing bash can do to mitigate it. You would have the same problem if you got rid of any shell and directly invoked commands from Python or C.

A better system shell could make it easy to define shims for the existing programs. Also it could make their calling easier, e.g. with named arguments. So when you wanted to delete your file called -rf, you would say

  rm(file="-rf")

or something like that, with your preferred syntax. It would be much safer than just passing big strings as arguments, where spaces separate the different arguments, but spaces can also appear inside arguments, and arguments can be empty. Bash or POSIX sh is not very good at safely invoking other programs, or at handling files.

What you're suggesting is that the shell should have every possible command builtin and not call external programs.

Let's analyze your example with 'rm': it works as long as 'rm' is an internal routine. If it's an external program, independently of the syntax you use to specify the arguments, sooner or later the shell will need to actually call the 'rm' executable, and to pass '-rf' to it as argument number 1. The 'rm' executable will then examine its arguments, see that the first one begins with a hyphen and interpret it as an option.

As I said, the only way to avoid all this would be to replace 'rm' with an internal routine. Then you would replace 'cp' and 'ln', and what else? Of course 'echo' and 'printf', 'cat', 'ls', 'cd' maybe, why not 'find' and 'grep'? What about 'head', 'tail', 'cut'? Don't forget 'sed' and 'awk'... the list is getting longer and longer. Where do you draw the line?

Seriously, the only mitigation would be to define a function to 'sanitize' an argument to make it appear as a file if used as an argument to an external program. Something like:

  force_file() {
    case "$1" in
      -*) echo "./$1" ;;
      *)  echo "$1" ;;
    esac
  }

This doesn't work with 'echo' though.

Bash also has a built-in to validate parameters; it’s called test, and is usually called with [], or [[]] for some bash-specifics.
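
For comparison with the declarative style, here is a minimal sketch of hand-rolled validation with [[ ]]. The function resize_volume and its two parameters are invented for the example; nothing here is generated or enforced by bash itself.

```shell
#!/usr/bin/env bash
# Hypothetical function: validate arguments by hand before doing any work.
resize_volume() {
  local name=${1-} size_gb=${2-}

  # Rough equivalent of a "not null or empty" check.
  if [[ -z "$name" ]]; then
    echo "resize_volume: name must not be empty" >&2
    return 2
  fi

  # [[ =~ ]] is bash-specific; accept positive integers only.
  if [[ ! "$size_gb" =~ ^[1-9][0-9]*$ ]]; then
    echo "resize_volume: size must be a positive integer, got '$size_gb'" >&2
    return 2
  fi

  echo "would resize $name to ${size_gb}G"
}
```

Each check is explicit code at the top of the function, which is the practical difference from an attribute on the parameter declaration.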

Re: non-standard tools, if you’re referring to timeout, that’s part of GNU coreutils. It’s pretty standard for Linux. BSDs also have it from what I can tell, so it’s probably a Mac-ism. In any case, you could just pipe through sleep to achieve the same thing.

> …inputs and outputs are strongly typed objects

And herein is the difference. *nix-land has everything as a file. It’s the universal communication standard, and it’s extremely unlikely to change. I have zero desire to navigate a DB as though it were a mount point, and I’m unsure why you would ever want to. Surely SQL Server has a CLI tool like MySQL and Postgres.

The CLI tool is PowerShell.

You just said everything “is a file” and then dismissed out of hand a system that takes that abstraction even further!

PowerShell is more UNIX than UNIX!

What? How are typed objects files?

What I’m saying is that in *nix tooling, things are typically designed to do one thing well. So no, I don’t want my shell to also have to talk MySQL, Postgres, SQL Server, DB2, HTTP, FTP, SMTP…

> "So no, I don’t want my shell to also have to talk..."

What's the point of the shell, if not to manage your databases, your REST APIs, files, and mail? Is it something you use for playing games on, or just for fun?

> designed to do one thing well.

Eeexcept that this is not actually true in practice, because the abstraction was set at a level that's too low. Shoving everything into a character (or byte) stream turned out to be a mistake. It means every "one thing" command is actually one thing plus a parser and an encoder. It means that "ps" has a built-in sort command, as do most other UNIX standard utilities, but they all do it differently. This also means that you just "need to know" how to convince each and every command to output machine-readable formats that other tools on the pipeline can pick up safely.

I'll tell you a real troubleshooting story, maybe that'll help paint a picture:

I got called out to assist with an issue with a load balancer appliance used in front of a bunch of Linux servers. It was mostly working according to the customer, but their reporting tool was showing that it was sending traffic to the "wrong" services on each server.

The monitoring tool used 'netstat' to track TCP connections, which had a bug in that version of RedHat where it would truncate the last decimal digit of the port number if the address:port combo had the maximum possible number of digits, e.g.: 123.123.123.123:54321 was shown as 123.123.123.123:5432 instead.

Their tool was just ingesting that pretty printed table intended for humans with "aligned" columns, throwing away the whitespace, and putting that into a database!

This gives me the icks, but is apparently Just The Way Things Are Done in the UNIX world.

In PowerShell, Get-NetTCPConnection outputs objects, so this kind of error is basically impossible. Downstream tools aren't parsing a text representation of a table or "splitting it into columns", they receive the data pre-parsed with native types and everything.

So for example, this "just works":

    Get-NetTCPConnection | 
        Where-Object State -EQ 'Bound' | 
        Group-Object LocalPort -NoElement | 
        Sort-Object Count -Descending -Top 10

Please show me the equivalent using netstat. In case the above was not readable for you, it shows the top ten TCP ports by how many bound connections they have.

This kind of thing is a challenge with UNIX tools, and then is fragile forever. Any change to the output format of netstat breaks scripts in fun and creative ways. Silently. In production.

I hope you never have to deal with IPv6.

For fun, I took a crack at your example and came up with this craziness (with the caveat it's late and I didn't spend much time on it), which is made a bit more awkward because grep doesn't do capturing groups:

  netstat -aln \
  | grep ESTABLISHED \
  | awk '{print $4}' \
  | grep -Po '\:\d+$' \
  | grep -Po '\d+' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -n 10

Changing the awk field to 5 instead of 4 should get you remote ports instead of local. But yeah, that will be fragile if netstat's output ever changes. That said, even if you're piping objects around, if the output of the thing putting out objects changes, your tool is always at risk of breaking. Yes, objects breaking because field order changed is less likely, but what happens if `Get-NetTCPConnection` stops including a `State` field? I guess `Where-Object` might validate it found such a field, but I could also see it reasonably silently ignoring input that doesn't have the field. Depends on whether it defaults to strict or lenient parsing behaviors.

I know this sounds like nit-picking but bear with me. It's the point I'm trying to make:

1. Your script outputs an error when run, because 'bash' itself doesn't have netstat as a built-in. That's an external command. In my WSL2, I had to install it. You can't declaratively require this up-front, your script has to have an explicit check... or it'll just fail half-way through. Or do nothing. Or who knows!?

PowerShell has up-front required prerequisites that you can declare: https://learn.microsoft.com/en-us/powershell/module/microsof...

Not that that's needed, because Get-NetTcpConnection is a built-in command.

3. Your script is very bravely trying to parse output that includes many different protocols, including: tcp, tcp6, udp, udp6, and unix domain sockets. I'm seeing random junk like 'ACC' turn up after the first awk step.

4. Speaking of which, the task was to get tcp connections, not udp, but I'll let this one slide because it's an easy fix.

5. Now imagine putting your script side-by-side with the PowerShell script, and giving it to people to read.

What are the chances that some random person could figure out what each one does?

Would they be able to modify the functionality successfully?

Note that you had to use 'awk', which is a parser, and then three uses of 'grep' -- a regular expression language, which is also a kind of parsing.

The PowerShell version has no parsing at all. That's why it's just 4 pipeline expressions instead of 9 in your bash example.

Literally in every discussion about PowerShell there's some Linux person who's only ever used bash complaining that PS syntax is "weird" or "hard to read". What are they talking about!? It's half the complexity for the same functionality, reads like English, and doesn't need write-only hieroglyphics for parameters.
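
For reference, the explicit up-front check that point 1 says a bash script needs could look roughly like this. It is a sketch: the helper name require_commands is made up, and it only verifies that each command can be found, not which version is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fail early if any required external command is missing.
require_commands() {
  local missing=() cmd
  for cmd in "$@"; do
    # command -v succeeds for builtins, functions, and programs on PATH.
    command -v "$cmd" >/dev/null 2>&1 || missing+=("$cmd")
  done
  if (( ${#missing[@]} > 0 )); then
    echo "missing required commands: ${missing[*]}" >&2
    return 127
  fi
}

# Typical use at the top of a script:
#   require_commands netstat awk grep sort uniq head || exit 1
```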

put the pipe character at the end of the line and you don't need the backslashes
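
A short sketch of that layout, using a made-up helper top_counts that reads values on stdin so it can be tried without netstat installed:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the most frequent values read from stdin.
# A line that ends in "|" continues on the next line, so no backslashes.
top_counts() {
  sort |
    uniq -c |
    sort -rn |
    head -n "${1:-10}"
}
```

For example, printf '%s\n' 443 80 443 22 443 80 | top_counts 2 lists 443 with a count of 3, followed by 80 with a count of 2.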

> What's the point of the shell, if not to manage your databases, your REST APIs, files, and mail? Is it something you use for playing games on, or just for fun?

It's for communicating with the operating system, launching commands and viewing their output. And some scripting for repetitive workflows. If I'd want a full programming environment, I'd take a Lisp machine or Smalltalk (a programmable programming environment).

Any other systems that want to be interactive should have their own REPL.

> This kind of thing is a challenge with UNIX tools, and then is fragile forever. Any change to the output format of netstat breaks scripts in fun and creative ways. Silently. In production.

The thing is, if you're using these kinds of scripts in production and then not testing them after updating the system, that's on you. In your story, they'd be better off writing a proper program. IMO, scripts are for automating (human-guided) workflows, not for fire-and-forget processes. Bash and the others deal in text because that's all we can see and write. Objects are for programming languages.

> In your story, they'd be better off writing a proper program.

Sure, on Linux, where your only common options are bash or "software".

On Windows, with PowerShell, I don't have to write a software program. I can write a script that reads like a hypothetical C# Shell would, but oriented towards interactive shells.

(Note that there is a CS-Script, but it's a different thing intended for different use-cases.)

I'm kind of with the OP that it would be nice if linux shells started expanding a bit. I think the addition of the `/dev/tcp` virtual networking files was an improvement, even if it now means my shell has to talk TCP and UDP instead of relying on nc to do that
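
As a rough illustration of that feature: /dev/tcp/HOST/PORT is not a real file but a path that bash interprets itself inside redirections, and only if bash was built with network redirections enabled. The helper name port_open below is made up, and there is no timeout, so an unreachable host may block for a while.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
port_open() {
  local host=$1 port=$2
  # The subshell confines file descriptor 3; the connection closes when the
  # subshell exits. A failed connect makes the subshell exit non-zero.
  (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null
}
```

Usage would be something like: port_open localhost 22 && echo "ssh is listening".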

> What's the point of the shell, if not to manage your databases, your REST APIs, files, and mail? Is it something you use for playing games on, or just for fun?

To call other programs to do those things. Why on earth would I want my shell to directly manage any of those things?

I think you're forgetting something: *nix tools are built by a community, PowerShell is built by a company. Much like Apple, Microsoft can insist on and guarantee that their internal API is consistent. *nix tooling cannot (nor would it ever try to) do the same.

> It means that "ps" has a built-in sort command, as do most other UNIX standard utilities, but they all do it differently.

I haven't done an exhaustive search, but I doubt that most *nix tooling has a built-in sort. Generally speaking, they're built on the assumption that you'll pipe output as necessary to other tools.

> This also means that you just "need to know" how to convince each and every command to output machine-readable formats that other tools on the pipeline can pick up safely.

No, you don't, because plaintext output is the lingua franca of *nix tooling. If you build a tool intended for public consumption and it _doesn't_ output in plaintext by default, you're doing it wrong.

Here's a one-liner with GNU awk; you can elide the first `printf` if you don't want headers. Similarly, you can change the output formatting however you want. Or, you could skip that altogether, and pipe the output to `column -t` to let it handle alignment.

    netstat -nA inet | gawk -F':' 'NR > 2 { split($2, a, / /); pc[a[1]]++ } END { printf "%-5s     %s\n", "PORT", "COUNT"; PROCINFO["sorted_in"]="@val_num_desc"; c=0; for(i in pc) if (c++ < 10) { printf "%-5s     %-5s\n", i, pc[i] } }'

Example output:

    PORT      COUNT
    6808      16
    3300      8
    6800      6
    6802      2
    6804      2
    6806      2
    60190     1
    34362     1
    34872     1
    38716     1

Obviously this is not as immediately straight-forward for the specific task, though if you already know awk, it kind of is:

    Set the field separator to `:`
    Skip the first two lines (because they're informational headers)
    Split the 2nd column on space to skip the foreign IP
    Store that result in variable `a`
    Create and increment array `pc` keyed on the port
    When done, do the following
    Print a header
    Sort numerically, descending
    Initialize a counter at 0
    For every element in the pc array, until count hits 10, print the key (the port) and its count

You can also chain together various `grep`, `sort`, and `uniq` calls as a sibling comment did. And if your distro doesn't include GNU awk, then you probably _would_ have to do this.

You may look at this and scoff, but really, what is the difference? With yours, I have to learn a bunch of commands, predicates, options, and syntax. With mine, I have to... learn a bunch of commands, predicates, options, and syntax (or just awk ;-)).

> This kind of thing is a challenge with UNIX tools

It's only a challenge if you don't know how to use the tools.

> Any change to the output format of netstat breaks scripts in fun and creative ways

The last release of `netstat` was in 2014. *nix tools aren't like JavaScript land; they tend to be extremely stable. Even if they _do_ get releases, if you're using a safe distro in prod (i.e. Debian, RedHat), you're not going to get a surprise update. Finally, the authors and maintainers of such tools are painfully aware that tons of scripts around the world depend on them being consistent, and as such, are highly unlikely to break that.

> Silently. In production.

If you aren't thoroughly testing and validating changes in prod, that's not the fault of the tooling.

And for anyone who might be open to trying powershell, the cross platform version is pwsh.

Pythonistas who are used to __dir__ and help() would find themselves comfortable with `gm` (get-member) and get-help to introspect commands.

You will also find Python-style dynamic typing, except with PHP syntax. $a=1; $b=2; $a + $b works in a sane manner (try that with bash). There is still some funny business with type coercion: $a=1; $b="2"; $a+$b (3); $b+$a ("21").
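
For anyone who does try that with bash, a minimal sketch of what happens: variables hold text, so addition only works inside an explicit arithmetic context.

```shell
#!/usr/bin/env bash
a=1
b=2
# Plain expansion only pastes the strings together.
echo "$a + $b"        # prints: 1 + 2
echo "$a$b"           # prints: 12
# Arithmetic needs an explicit arithmetic context.
echo "$(( a + b ))"   # prints: 3
# In that context the text "2" is read as a number in either operand order.
b="2"
echo "$(( b + a ))"   # prints: 3
```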

I also found "get-command" very helpful with locating related commands. For instance "get-command -noun file" returns all the "verb-noun" commands that have the noun "file". (It gives "out-file" and "unblock-file")

Another nice thing about powershell is you can retain all your printf debugging when you are done. Using "Write-Verbose" and "Write-Debug" etc allows you to write at different log levels.

Once you are used to basic powershell, there are a bunch of standard patterns like how to do Dry-Runs, and Confirmation levels. Powershell also supports closures, so people create `make` style build systems and unit test suites with them.

The big problem with trying to move on from BASH is that it's everywhere and is excellent at tying together other unix tools and navigating the filesystem - it's at just the right abstraction level to be the duct tape of languages. Moving to other languages provides a lot more safety and power, but then you can't rely on the correct version being necessarily installed on some machine you haven't touched in 10 years.

I'm not a fan of powershell myself as the only time I've tried it (I don't do much with Windows), I hit a problem with it (or the object I was using) not being able to handle more than 256 characters for a directory and file. That meant that I just installed cygwin and used a BASH script instead.

I am a Microsoft hater. I cannot stand Windows and only use Linux.

PowerShell blows bash out of the water. I love it.

Except for the fact that it is slower than hell and the syntax is nuts. I don't really understand the comparison, bash is basically just command glue for composing pipelines and pwsh is definitely more of a full-fledged language... but to me, I use bash because it's quick and dirty and it fits well with the Unix system.

If I wanted the features that pwsh brings I would much rather just pick a language like Golang or Python where the experience is better and those things will work on any system imaginable. Whereas pwsh is really good on windows for specifically administrative tasks.

The fact that it is "basically just command glue for composing pipelines" makes it even more regrettable that it takes more knowledge and mental focus to avoid shooting my foot off in bash than it does in any other programming language I use.

If you're trying to write a full fledged program in it, it's going to be a pain as there are only strings (and arrays, I think). Bash is for scripting. If you have complex logic to be done, use another programming language like perl, ruby, python, $YOUR_PREFERRED_ONE,...
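
On the "(and arrays, I think)" part: bash has indexed arrays, and bash 4 or later also has associative arrays, although every element is still a string. A minimal sketch with made-up data:

```shell
#!/usr/bin/env bash
# Indexed array: an ordered list of strings.
ports=(443 80 443 22)

# Associative array (bash 4 or later): string keys mapped to string values.
declare -A count
for p in "${ports[@]}"; do
  count[$p]=$(( ${count[$p]:-0} + 1 ))
done

echo "443 seen ${count[443]} times"   # prints: 443 seen 2 times
echo "${#ports[@]} entries"           # prints: 4 entries
```

There are no nested structures or records, which is roughly where the advice to switch languages for complex logic kicks in.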

You’re arguing that the power of PowerShell is pointless because you’ve resorted to alternatives to bash… because it’s not good enough for common scenarios.

This is Stockholm Syndrome.

You’ve internalised your limitations and have grown to like them.

No. bash as a shell is for interactive use or for automating said interactions. I want the computer to do stuff. The “everything is a file” and text oriented perspective in the unix world is just one model and bash is very suitable for it. Powershell is another model, just like lisp and smalltalk. I’m aware of the limitations of bash, but at the end of the day, it gets the job done and easily at that.

I’m curious. How useful is Powershell outside of a windows environment? I use it on Windows since much of the admin side of things requires it.

My issue with powershell is that it’s a niche language with a niche “stdlib” which cannot be used as general purpose. The same issue I have with AHK. These two are languages that you use for a few hours and then forget completely in three weeks.

Both of them should be simply python and typescript compatible dlls.

> You can “cd” into IIS, Exchange, and SQL and navigate them like they’re a drive. Try that with bash!

This exists.

> PowerShell can be embedded into a process as a library... and used to build an entire GUI that just wraps the CLI commands.

Sounds pretty interesting. Can you tell me what search terms I'd use to learn more about the GUI controls? Are they portable to Linux?

It doesn’t have GUI capabilities per se. Instead, it is designed to be easy to use as the foundation of an admin GUI.

The .NET library for this is System.Management.Automation.

You can call a PowerShell pipeline with one line of code: https://learn.microsoft.com/en-us/dotnet/api/system.manageme...

Unlike invoking bash (or whatever) as a process, this is much lighter weight and returns a sequence of objects with properties. You can trivially bind those to UI controls such as data tables.

Similarly the virtual file system providers expose metadata programmatically such as “available operations”, all of which adhere to uniform interfaces. You can write a generic UI once for copy, paste, expand folder, etc and turn them on or off as needed to show only what’s available at each hierarchy level.

As an example, the Citrix management consoles all work like this. Anything you can do in the GUI you can do in the CLI by definition because the GUI is just some widgets driving the same CLI code.

Bash is crap and powershell an abomination with a few good ideas.

fish, Python, and oilshell (ysh) are ultimately on better footing.

Or just the old Perl. Any Bash/AWK/Sed user can be competent with it in days.