When it comes to infra, someone's preferred language may not always be available. Infra and devops are more about plumbing, and plumbing gets messy.
It can go the other way as well. Tcl was originally conceived as an embeddable scripting language that can drive GUI applications, and has been very successful in that use case. It has a few flaws, but I would have preferred it to have been embedded in the browser instead of making up JavaScript back in the day.
I encountered bugs and had to fix glaring problems in shell scripts literally everywhere I worked.
Most devs are not deeply steeped in Unix, and don't really know what they're doing with this as it's not their main job. They kinda cobble something together that "kinda works" – right up to the point it doesn't.
There is no shame in that; I do that with some stuff too – we all do – because there is no point learning something in-depth if you use it once a year. This is not a "zomg devs are stoopid" rant, it's just an observation of fact.
I've written tons of shell scripts and I love how the shell lets you do something useful very quickly in very little code. Last week I hacked together a "ncurses"-type terminal music player in less than 200 lines that actually works fairly well, which I thought wasn't too bad for an evening of work.
But in general I distrust other people's shell scripts, and I found that 9 times out of 10 that's warranted. I tend to push for zsh as it fixes the most egregious footguns and limitations, but there's still plenty of things to do wrong, and the syntax can be unfamiliar.
Something that “kinda works” actually does work. Fixing the edge case when it is actually a problem may actually save time compared to spending extra time up front on things that might never be a problem.
I should have been clearer about that, but I'm not talking about theoretical edge cases. I'm talking about "I tried to run it and it doesn't work", or worse (e.g. wrong behaviour).
Shell scripts are often the peak of "works on my machine".
The problem mostly stems from the fact that macOS folks don't have Unix utils compatible with what the Linux folks have, and the majority of developers are on macOS but everything deploys on Linux. The easiest way to fix this is to install GNU utils on Mac so you're using the same bash and system tools everywhere.
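To make the incompatibility concrete: `sed -i` is a classic case, since GNU sed takes an optional suffix attached to the flag while BSD sed (what macOS ships) requires a separate suffix argument, so the same line behaves differently on each. If installing GNU utils everywhere isn't an option, one alternative is to avoid the divergent flag entirely. A rough sketch, where `replace_in_file` is a made-up helper name and not something from any standard tool:

```shell
# Hypothetical helper: edit a file "in place" without relying on `sed -i`,
# whose argument handling differs between GNU sed and BSD sed.
# usage: replace_in_file SED_EXPRESSION FILE
replace_in_file() {
    local expr="$1" file="$2" tmp
    # Explicit template, since older mktemp versions require one.
    tmp="$(mktemp "${TMPDIR:-/tmp}/replace.XXXXXX")" || return 1
    if sed -e "$expr" -- "$file" > "$tmp"; then
        # Overwrite the contents rather than mv, so the original file
        # keeps its permissions. Note this is not an atomic update.
        cat -- "$tmp" > "$file"
        rm -f -- "$tmp"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

Something like `replace_in_file 's/foo/bar/' config.txt` should then behave the same on both platforms, at the cost of a temp file.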
That is one problem, but there are others: assuming things like the current working directory, assuming certain tools are installed and failing badly when they're not (e.g. doing nothing without an error, or doing the wrong thing without a warning), and things like that.
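For what it's worth, both of those can be guarded against in a few lines at the top of a script. This is only a sketch of the idea: `require_tools` is a name I made up, and `ls cat` is a placeholder for whatever the script really depends on.

```shell
#!/usr/bin/env bash
set -eu

# Resolve the directory this script lives in, so relative paths don't
# depend on where the caller ran it from. CDPATH is cleared for the cd
# so that it cannot redirect us or print an unexpected path.
script_dir="$(CDPATH= cd -- "$(dirname -- "$0")" && pwd)"

# Hypothetical helper: report every missing tool up front, loudly,
# instead of misbehaving halfway through.
require_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "error: required tool '$tool' not found in PATH" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Placeholder tool list; replace with the script's real dependencies.
require_tools ls cat || exit 1

# From here on, relative paths are relative to the script itself.
cd "$script_dir"
```

This assumes the script is executed rather than sourced; when sourced, `$0` refers to the calling shell and the directory lookup would be wrong.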
I’d agree: every critical shell script should work in all the places it currently needs to run. But, I think it’s reasonable to iterate here towards perfection: every time a new employee shows up or new environment comes up is a chance to uncover these problems and fix them.
There’s a balance here between solving the problem you actually have (YAGNI) and responsible engineering. Obviously you should test that your code works as expected in all the current environments but, sometimes, ensuring that all the edge cases are handled is just over-engineering.
People who say stuff like this must not have worked on enough Linux environments. Bash scripts aren't fully portable, and there are absolutely differences in environments which prevent portability of bash scripts. Since using Nix, I've realized the number of bash scripts which are only superficially portable is drastically higher than the number which are actually portable. As one trivial example, how many bash scripts use the more portable shebang of `#!/usr/bin/env bash`? Extremely few. And everyone who hard-codes `#!/bin/bash` has created a less portable script, most likely without ever having realized it. Guess they are all muppets though, right?
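If you want to see how common this is in your own repo, a quick check is easy to sketch. `find_hardcoded_bash` is a hypothetical helper name, and it only looks at the first line of each file it is given:

```shell
# Hypothetical helper: print the files whose shebang hard-codes /bin/bash
# (with or without trailing options) instead of using /usr/bin/env bash.
# usage: find_hardcoded_bash FILE...
find_hardcoded_bash() {
    local f first
    for f in "$@"; do
        # Skip directories and anything that isn't a regular file.
        [ -f "$f" ] || continue
        first="$(head -n 1 -- "$f")"
        case "$first" in
            '#!/bin/bash'|'#!/bin/bash '*) printf '%s\n' "$f" ;;
        esac
    done
}
```

Running something like `find_hardcoded_bash scripts/*.sh` lists the offenders. The fix is usually just swapping the first line for `#!/usr/bin/env bash`, with the caveat that `env` does not portably pass extra options such as `-e` on the shebang line, so those are better moved into a `set` call inside the script.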
If you standardize the Unix utils you use and the bash version you use, it's really not hard to avoid a lot of the issues you're talking about. To get all developers in an org working on compatible Ubuntu-like systems, just mandate use of WSL on Windows and GNU Unix commands on Mac so everything is the same for everyone.
That being said, if you and your team just don't have a lot of bash/unix experience, that is probably a good argument for writing your utility scripts in whatever you can actually write without a ton of bugs.
Would you complain about node bugs if your org wasn't standardizing the node version people used and some people were still on node 12??
Not true. Bash is filled with footguns. In fact even HN has articles hitting the front page every other month about another bash footgun and how to avoid it. Any Muppet can write a bash script to do the thing, but only somebody with a lot of bash and Unix experience can make sure it's dodging the hundreds of common and obvious but incorrect ways of doing things.
But I worked with an extremely bash-skilled guy at IBM many years ago who went out of his way to write his scripts in the most arcane way possible. On purpose.
It was his form of job security, going from rough memory of what he said when I asked wtf? ;)
I wasn't project lead on that one, so we just had to put up with it... :(