by mjd 329 days ago
I've been writing scripts in Bourne shell since the 1980s and in Bash since whenever it came along, and I feel like the most important thing I've learned about it is: don't. Sure, it can be done, and even done well, but why? There are better languages.

Every time I write a shell script that grows to more than about 20 lines I curse myself for not having written it in Python. The longer I have waited before throwing it away and redoing it, the more I curse.

This article says nothing to change my mind. I could build logging and stack traces in Bash. I admire the author's ingenuity. But again, why?


As someone who has been writing shell scripts for a few decades (though not as long as you), I’d instead recommend “learn what your tools are appropriate for and use them that way”. There are plenty of cases where shell scripts are the right tool for the job.

I can’t even tell how many times I’ve seen multi-line Python scripts which could instead have been a shell one-liner. Shorter and faster.

I have also written shell scripts with hundreds of lines, used by thousands of people, which work just fine and would be more complicated and slower in other languages.

I firmly disagree with the all too pervasive blanket statement of “there are better languages”. It depends. It always does.

I'll put a point on the "it depends" bit.

If you have a standard-ish environment, you'll have an array of Unix tools to compose together, which is what a shell is best at. Even a minimal image like busybox will have enough to do serious work. Golfing in shell can be a pipeline of tools: lately "curl | jq | awk" does a lot of lifting for me in a one-liner.

As soon as you say "switch to (favorite decent scripting environment)", you're committing to (a) many megs of its base install, (b) its package management system, (c) whatever domain packages you need for $work, and (d) all the attendant dependency hells that brings along. Golfing in a scripting environment is composing a bunch of builtin operations.

> Golfing in shell can be a pipeline of tools: lately "curl | jq | awk" does a lot of lifting for me in a one-liner.

> As soon as you say "switch to (favorite decent scripting environment)", you're committing to (a) many megs of its base install, (b) its package management system, (c) whatever domain packages you need for $work, and (d) all the attendant dependency hells that brings along.

OK, but isn't jq just an example of a favorite scripting environment with a multi-meg install and a dependency system? What are you doing that's different from what you're advising everyone else not to do?

> Golfing in a scripting environment is composing a bunch of builtin operations.

Neither curl nor jq is a builtin operation.

> OK, but isn't jq just an example of a favorite scripting environment with a multi-meg install and a dependency system?

No? jq is a single binary a little over half a MB with no runtime dependencies. You can simply download it and use it. And you only need that if it’s not already included in whatever system you’re using, which it likely is. It even comes with macOS these days, which is more than what you can say for Python.

https://news.ycombinator.com/item?id=44408432

> Neither curl nor jq is a builtin operation.

Pretty sure they meant it as in builtin with the system, not the language. As per their first paragraph:

> Even a minimal image like busybox will have enough to do serious work.

> No? jq is a single binary a little over half a MB with no runtime dependencies.

I just downloaded it to see what size it was, and it's 2200 KB.

> Pretty sure they meant it as in builtin with the system, not the language. As per their first paragraph:

>> Even a minimal image like busybox will have enough to do serious work.

Busybox doesn't include curl or jq. They aren't builtins by any standard.

This becomes obvious every time you set up a new machine and try to curl something, and then realize you have to install curl.

> I just downloaded it to see what size it was, and it's 2200 KB.

We both saw different versions. You looked at the Linux one, but I looked at the macOS one. The version which ships with macOS is smaller than the one on the website, but even so the website version is under one MB.

I’m intrigued by what causes the large difference.

> Busybox doesn't include curl or jq.

Thank you for the correction. In that case I don’t know what the other user meant. Perhaps they’ll come back and can clarify.

Fair enough point but for many years I wasn't aware of what bash COULD do. I mean one should get to learn more about [[]] and how it does regexps and while read loops:

ls *.txt | { while read FILENAME; do <something> to $FILENAME; done; }

and so on. Once you know, you can get a lot done on e.g. a docker image, without having to install lots of other things first.
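As a rough sketch of the two features mentioned above (a `while read` loop and `[[ ]]` regex matching), something like the following could work. The function name `txt_stems` and the sample file names are made up for illustration. It reads names from stdin rather than parsing `ls` output, since names containing newlines or other unusual characters can break the `ls` approach.

```bash
#!/usr/bin/env bash
# Hypothetical helper: for every line on stdin that names a .txt file,
# print the name without its extension.
txt_stems() {
    local name
    # Keeping the regex in a variable avoids quoting surprises inside [[ ]].
    local re='^(.+)\.txt$'
    # IFS= and -r keep leading whitespace and backslashes intact.
    while IFS= read -r name; do
        if [[ $name =~ $re ]]; then
            # BASH_REMATCH[1] holds the first capture group of the match.
            printf '%s\n' "${BASH_REMATCH[1]}"
        fi
    done
}

# Prints "notes" and "my report", one per line; image.png is skipped.
printf '%s\n' 'notes.txt' 'my report.txt' 'image.png' | txt_stems
```

Both `[[ ... =~ ... ]]` and `BASH_REMATCH` are bash features, so this sketch assumes bash rather than a plain POSIX `sh`.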

When it comes to shell scripting, I personally avoid golf at all costs. For operations work, I'll take an extra-verbose script that is easy for a human to parse any day of the week.

Yes it's a tradeoff. Every line of code is a liability. Powershell or python are probably "slower" which in my use case is negligible and almost never relevant. On the other hand, I can't help but view the often esoteric and obscurely clever bash mechanisms as debt.

I’m not talking about code golf. Verbosity and clarity are not directly correlated. The examples I’m talking about are also often easier to read as shell scripts.

For example, let’s take a file as input, filter for every "mypattern" line, then output them sorted.

Python example:

  import sys
  print(*sorted(line for line in open(sys.argv[1]) if 'mypattern' in line), sep='')

Shell example:

  grep 'mypattern' "${1}" | sort

The shell version is shorter, easier to read, easier to understand, easier to search for, and an order of magnitude faster. You can certainly make the Python version more verbose, yet it’ll never reach the same level of clarity and immediacy as the shell version.

It won't be surprising since I wrote [1], but I mostly write bash when I want to create complicated pipelines with fzf, and I don't want to write Go code to do the same thing.

[1]: https://andrew-quinn.me/fzf/

This is an excellent point about pipes. There seems to be no other language which lets you stitch together pipes like Bash does. It's incredibly powerful, and worth putting up with all of Bash's warts.

Thanks for fzf, by the way. Always one of the first things I install in a new environment.

They are not the author of fzf.

Tbh the sentence + the link URL also tricked me into thinking that initially.

Argh, not my intention at all. fzf was written by https://github.com/junegunn , I merely wrote a tutorial on it that got unexpectedly popular on here some years back.

I'm sorry Junegunn! I would never dream of stealing that kind of valor. I'll remember to flag [1] as a tutorial I wrote explicitly in the future.

I didn't mean it was intentional, but the URL and the context ("I wrote this", where "this" is not clear) made me think you were the author of fzf.

https://www.nushell.sh/ is next level when it comes to pipes:

    ls | where size > 10mb | sort-by modified

You have my admiration for fzf; it helps me dozens of times every day. And I do understand that authors of such prominent tools will want to have tamed integrations with people's shells; that makes perfect sense.

That being said, as a guy who does not have big prominent OSS tools under his belt, I am slowly but surely migrating away from shell scripts and changing them to short Golang programs. Already saved my sanity a few times.

Nothing against the first cohort of people who had to make computers work; they are heroes. But at one point the old things only impede and slow everyone else and it's time to move on.

Sorry, I was accidentally unclear in my writing. fzf was written by https://github.com/junegunn , I merely wrote a tutorial on it that got unexpectedly popular on here some years back.

I'm sorry Junegunn! I would never dream of stealing that kind of valor. I'll remember to flag [1] as a tutorial I wrote explicitly in the future.

Seems we both screwed up. :D

Thanks for the clarification.

> the most important thing I've learned about [bash] is: don't. Sure, it can be done, and even done well, but why? There are better languages.

This. Bash gives you all the tools to dig a hole and none to climb out. It's quick and easy to copy commands from your terminal to a file, and it beats not saving them at all.

Support for digging: once you have a shell script, adding one more line conditioned on some env var is more pragmatic than rewriting the script in another language. Apply mathematical induction to grow the script to 1000 lines. Split into multiple files when one becomes too large and repeat.

Missing support for climbing out: janky functions, no modules, user types, or tests; no debugger and no standard library. I've successfully refactored messy python code in the past, but with bash I've had no idea where to even start.

There is hope that LLMs can be used to convert shell scripts to other languages, because they can make the jump that experienced devs have learned to avoid: rewriting from scratch. What else do you do when refactoring in small steps is not feasible?

I wrote a powershell script to run an ffmpeg workflow. I'm confident that this was a better idea than either of the other two approaches that you seem to be advocating for:

(a) instead of writing a shell script to operate a shell-operated tool, write a python script with a bunch of os.system('shell out') commands.

(b) instead of just invoking ffmpeg to do the things you want done, install an ffmpeg development library, and call the functions that ffmpeg itself calls to do those things.

What would be the argument for either of those?

> There is hope that LLMs can be used to convert shell scripts to other languages, because they can make the jump that experienced devs have learned to avoid: rewriting from scratch. What else do you do when refactoring in small steps is not feasible?

There were some languages shown on HN that compile to sh/bash (like oilshell[0]). I would think that's also a viable vector of attack, but I'm not sure how viable it actually is, i.e. the maintainers might have moved on for various reasons.

[0] https://github.com/oils-for-unix/oils

> no modules

Ish. You can source whatever files you want, so if you split up your functions into logical directories / files, you can get modules (-ish).
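A minimal sketch of that sourcing approach, assuming a made-up library file `strings.sh` with one hypothetical function `str_upper`. The file is written to a temporary directory here only so the example is self-contained; in practice it would live alongside the script.

```bash
#!/usr/bin/env bash
# lib_dir, strings.sh and str_upper are hypothetical names for illustration.
lib_dir=$(mktemp -d)

cat > "$lib_dir/strings.sh" <<'EOF'
# strings.sh: meant to be sourced, not executed.
str_upper() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}
EOF

# `.` is the POSIX spelling; bash also accepts `source`.
. "$lib_dir/strings.sh"

str_upper 'hello'    # prints HELLO

# The function stays defined in this shell after the file is removed.
rm -rf "$lib_dir"
```

The "-ish" part is that everything sourced lands in one global namespace, so a name prefix such as `str_` is a common convention for avoiding collisions between files.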

> no tests

BATS [0].

[0]: https://github.com/bats-core/bats-core

> I've successfully refactored messy python code in the past, but with bash I've had no idea where to even start.

I say this with all kindness: you probably need to know more bash before you can safely refactor it. It is a very pointy and unforgiving language.

Because it's better at the task than Python is.

That's just the problem! It is better at the task. Until it isn't, and "isn't" comes much too soon.

I've seen this sentiment a lot here. "Once shell is >n lines, port to python". My experience has been different. Maybe half of the scripts I write are better off in python while the other half are exponentially longer in python than bash.

For example, anything to do with json can be done in 1 line of readable jq, while it could be 1, 5, or 20 lines in python depending on the problem.

I'd just like to put that out there because half of the time, the >n metric does not work for me at all. My shell scripts range from ~5-150 lines, while my Python scripts run 100+ lines.

Same. It’s mostly because if I have a shell script, while I’ll add some comments for tricky bits if needed, and maybe a `-h` function, that’s about it. In Python, though, I use the language’s features to make it as readable and safe as possible. Types, docstrings, argparse, etc. My thinking is that if I’m going to take the time to use a “proper” language, then I should make it bulletproof, otherwise I’d just stick with shell.

My personal decision matrix for when to switch to Python usually involves the relative comfort of the rest of my team in both languages, the likelihood that future maintenance or development of the script will be necessary, and whether or not I’m dealing with inputs that may change (e.g. API responses, since sadly versioning isn’t always a guarantee).

> For example, anything to do with json can be done in 1 line of readable jq, while it could be 1, 5, or 20 lines in python depending on the problem.

I don't agree that there exists such a thing as "readable jq" to start with. It's very arcane and difficult to follow unless you live and breathe the thing (which I don't). Furthermore, jq may or may not be present on the system, whereas the json package is always there in Python. Finally, I don't think having more lines is bad. The question is, what do you get for the extra lines? Python might have 5 lines where bash has 1, but those 5 lines will be far easier to read and understand in the future. That's a very worthwhile trade-off in my opinion.

> It's very arcane and difficult to follow unless you live and breathe the thing

I used to think this before I actually read how it worked. If you know shell, jq is extremely easy to pick up. It acts the exact same way, but pipes JSON entities instead of bytes ("text") like shell does.

Like the Unix philosophy, every filter does exactly one thing very well. Like shell, you write it incrementally, one filter at a time.

Genuinely, I do not blame you for thinking it's complex. I have never seen a concise, correct explanation of how jq works that builds an intuitive understanding. I have a near-complete one, and it's on my todo list to eventually publish it.

Anyway, I don't mean to say more lines is always worse, but that it is worse about half the time. Python is certainly more readable, but I'd rather spend 60 seconds making a long pipeline than 10 minutes making it work in python.

Want to count lines in a file? wc -l. Compress a directory? tar -zcf. Send that compressed file somewhere? Pipe it to ssh. Each of those is an ordeal in python and it's around 10 keystrokes in shell.
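For what it's worth, those three operations might look roughly like this in a script. The directory and file names (`work_dir`, `notes`, `notes.tar.gz`) are placeholders, and the ssh step is left as a comment because `user@host` is not a real destination.

```bash
# Sketch with hypothetical names; builds a small directory to operate on.
work_dir=$(mktemp -d)
mkdir "$work_dir/notes"
printf '%s\n' one two three > "$work_dir/notes/notes.txt"

# Count lines. Reading from stdin keeps the file name out of wc's output.
line_count=$(wc -l < "$work_dir/notes/notes.txt")
# The arithmetic expansion strips the padding some wc versions add.
echo "lines: $((line_count))"    # prints "lines: 3"

# Compress the directory; -C makes the paths inside the archive relative.
tar -zcf "$work_dir/notes.tar.gz" -C "$work_dir" notes

# List the archive contents to confirm what was packed.
archive_listing=$(tar -ztf "$work_dir/notes.tar.gz")
printf '%s\n' "$archive_listing"

# Sending it somewhere would be a pipe into ssh (user@host is a placeholder):
#   tar -zcf - -C "$work_dir" notes | ssh user@host 'cat > notes.tar.gz'

rm -rf "$work_dir"
```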

The only thing bash is better at than Python is very short scripts, like 10ish lines. Everything else it sucks at, due to the horrible syntax and various bash footguns.

If you have to call out to many external programs, might as well use Bash. I use Bash in such cases.

An excellent use case for AI coders today is, "change this shell script that's gotten too big into python". :)

i once wrote a whole data processing library in bash because i didn’t want people at my then workplace to extend and continue developing it. it was needed for a narrow purpose which it served well (details lost). ultimately people ported it to python and kept developing it anyway.