by angarg12 3175 days ago
At what point does it pay off to switch to python/node instead of continuing to grow a bash script?
7 comments

At what point should one switch from an interpreted to a compiled language?

There probably aren't any hard and fast answers. Take Dracut [1], for example. It's a successful utility completely written as a collection of bash scripts. The answer probably depends on the specifics of your needs and specific benchmarks.

Bash gets a lot of hate, but if you take the time to really learn it as a language and use good coding practices---like linting, verbose warnings, and unit testing---then it's not too difficult to write long bash scripts well. I don't think it's really much trickier or dirtier than JavaScript.

One thing that helps is putting the spiritual equivalent of "use strict" at the top of your Bash script:

    shopt -s -o errexit pipefail nounset noclobber

Take a look at `help set` for info on what those settings do. Here's a list of resources that have helped me feel confident when writing bash:

    * Bash Hacker's Wiki [2]
    * ShellCheck [3]
    * Debugging Bash Scripts [4]

Also, this blog post gives some advice that I found useful:

    * Shell Scripts Matter [5]

[1] https://dracut.wiki.kernel.org/index.php/Main_Page

[2] http://wiki.bash-hackers.org/

[3] https://www.shellcheck.net/

[4] http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_02_03.htm...

[5] https://dev.to/thiht/shell-scripts-matter
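For anyone wondering what the settings in that one-liner actually change, here is a minimal sketch. The function name `strict_mode_demo` and the output labels are made up for illustration, and the body runs in a subshell so the options don't leak into the calling shell. `errexit` is left out because it is ignored inside `if` conditions, which makes it awkward to show this way.

```shell
#!/usr/bin/env bash

# Hypothetical demo: the parentheses make the function body a subshell,
# so the options set here do not affect the calling shell.
strict_mode_demo() (
    set -o errexit -o pipefail -o nounset -o noclobber

    # pipefail: the pipeline fails because its first stage fails.
    if false | true; then
        echo "pipefail=succeeded"
    else
        echo "pipefail=failed"
    fi

    # nounset: expanding an unset variable aborts the nested subshell.
    if (echo "${surely_unset_variable}") >/dev/null 2>&1; then
        echo "nounset=allowed"
    else
        echo "nounset=rejected"
    fi

    # noclobber: `>` refuses to overwrite a file that already exists.
    scratch_file="$(mktemp)"
    if (echo "new contents" > "${scratch_file}") 2>/dev/null; then
        echo "noclobber=allowed"
    else
        echo "noclobber=rejected"
    fi
    rm -f "${scratch_file}"
)

strict_mode_demo
```

With the options on, this should print `pipefail=failed`, `nounset=rejected` and `noclobber=rejected`, one per line. Remove the `set` line and all three checks go the other way.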

The beauty of bash scripts is that anything with u+x in your PATH, plus aliases and user-defined functions, all become first-class citizens with the same interface. The usual plumbing with pipes and redirection -- and sometimes some more advanced stuff like process substitution -- is often all you need.
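A rough sketch of what that looks like in practice, assuming bash. The function names `upper`, `shout_sorted` and `only_in_first` are invented for the example; the point is only that a function drops into a pipeline exactly like `sort` or `tr` does, and that `<(...)` lets a pipeline stand in for a file.

```shell
#!/usr/bin/env bash

# A user-defined function reads stdin and writes stdout like any command.
upper() {
    tr '[:lower:]' '[:upper:]'
}

# Functions and external programs compose through ordinary pipes.
shout_sorted() {
    printf '%s\n' "$@" | sort | upper
}

# Process substitution: each <(...) is handed to comm as if it were a file,
# so two sorted streams can be compared without temporary files.
# Prints the words that appear only in the first space-separated list.
only_in_first() {
    comm -23 <(tr ' ' '\n' <<<"$1" | sort) <(tr ' ' '\n' <<<"$2" | sort)
}
```

For example, `shout_sorted pear apple` prints `APPLE` then `PEAR`, and `only_in_first "a b c" "b c d"` prints `a`.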

Moving to Python, say, makes the control flow and the "software engineering" easier, for sure. However, don't underestimate the power of grep, sed and especially awk. I wouldn't want to reimplement a half-arsed, presumably non-bug-free version of awk in Python when I could have just used awk in the first place.
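To make the awk point concrete, here is a small sketch of the kind of job it handles in one line: totalling a numeric column per key. The two-column input layout and the name `sum_by_key` are assumptions for the example, not anything from the thread.

```shell
#!/usr/bin/env bash

# Hypothetical helper: input lines look like "<key> <number>".
# awk accumulates a total per key in a single pass; the trailing sort is
# there because awk's `for (key in array)` order is unspecified.
sum_by_key() {
    awk '{ total[$1] += $2 } END { for (key in total) print key, total[key] }' | sort
}
```

Fed `printf 'get 120\npost 300\nget 80\n'`, it prints `get 200` and `post 300`. Doing the same in Python means a loop, a dict and some string splitting, which is fine but is more code than the one-liner.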

There's no reason you can't orchestrate awk and other tools from python though. (I doubt I'm the only one who's done that before.)

I wrote a tool to do that:

https://github.com/ianmiell/shutit

Among many other things I use it to test Kubernetes/OpenShift clusters in Vagrant for Chef recipes:

https://github.com/ianmiell/shutit-openshift-cluster/blob/ma...

and here are some others in a more 'native' python format:

https://github.com/ianmiell/shutit-scripts/

https://github.com/ianmiell/shutit-scripts/blob/master/logme...

https://github.com/ianmiell/shutit-scripts/blob/master/gnupl...

> crazy talk about pushing awk through python

> oh! I wrote a tool to do that!

sniffles

That might just be my favorite HN exchange ever. Y'all remind me of emacs users. Between this and the unusual number of Perl mentions on HN today, my cockles are suitably heated.

That is a compliment. Nary a day passes where HN doesn't give me a reason to keep clicking links and learning.

I started off giggling and now I've moved from my tablet to a laptop so that I can investigate your shutit - as it looks like a handy tool for learning. The scales look the most interesting.

I have been here for a while, but only recently (past couple of months) decided to comment. I lurked for like 12 months, just to see if I'd fit in. Why? HN continually has commentary about things I haven't yet learned.

In short, this is my awkward attempt to thank you. I'll be spending the afternoon trying to enjoy your shutit scales.

Great, thanks!

You can find more info here on my blog:

https://zwischenzugs.wordpress.com/

do ping me with your experience.

No, of course not, but it’s way easier in Bash because of the first-class-ness of commands. In Python, you’d have to mess about with the subprocess module, etc.

I guess it comes down to which school of wizardry you went to. I'd rather muck around in python, if only because then I wouldn't dread rereading/editing the whole thing a couple months later.

In my experience it tends to be at around 50 to 100 lines. At this point you usually encounter at least one "gotcha" or difficulty that bash can't handle in an elegant way and realize that it would probably be a lot faster if you just used another language instead.

As soon as I start adding ifs and battling to remember the exact syntax I give up and move to Python and argparse. Recently I've started using the Begins[1] library to make useful command line tools and scripts even more quickly. I appreciate it's less and less portable at this stage but so much more productive. Most of the time these scripts are very specific anyway so it's not a concern.

The other thing I've noticed amongst peers is that I'm usually the only one (or one of a limited few) who remotely understands the shell script too. So I'd serve the team better writing it with Python. YMMV with that of course.

[1] http://begins.readthedocs.io/en/latest/

There is a lot of historical arcana baked into the command line and bash, but I've found it worth learning like any other language.

Anyway, just for future reference you can use bash's `help` command to get documentation about builtins. In the case of conditionals, we want to look at `test`:

    $ help test

I find this a lot easier than wading through the bash man page.
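As a quick illustration of the operators `help test` lists, here is a sketch that uses only `[ ... ]` with `-z`, `-d`, `-f` and `-r`. The function name `describe_path` and its output strings are made up for the example.

```shell
#!/usr/bin/env bash

# Hypothetical helper: classify a path using the test builtin.
describe_path() {
    local path="$1"
    if [ -z "${path}" ]; then                      # -z: string is empty
        echo "empty"
    elif [ -d "${path}" ]; then                    # -d: exists and is a directory
        echo "directory"
    elif [ -f "${path}" ] && [ -r "${path}" ]; then  # regular file and readable
        echo "readable file"
    else
        echo "other"
    fi
}
```

So `describe_path /` prints `directory` and `describe_path ""` prints `empty`. Quoting `"${path}"` everywhere is what keeps the tests from breaking on empty strings or names with spaces.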
Holy crap, xelxebar. Thank you! This is a game-changer. I've been wading through the bash man pages like a fool.

Thanks for pointing to Begins!

I suspect some of this is just a difference in background or culture; shell conditionals look completely obvious and intuitive to me, and I'm pretty sure my team can handle shell better than python.

Begins looks awesome, thanks for sharing! :)

Quite a bit after the point where it pays to convert sed and awk one-liners into Perl one-liners and build a program around it.

Perl may be as dead as sed and awk are now, but that was the initial use case for Perl before CGI (as in Common Gateway Interface - the first primitive interface for web apps) came along.

Long after this particular case. I'd be surprised if it could be done in fewer than twice the lines of code the author used in any language other than Perl (which might make the case for using Perl at around this point, if you like Perl). But, Python and Node would surely be a lot longer.

Shell with awk/sed/grep is very powerful and very composable for small parsing tasks like this. (It could have been done with even less code than this, but I don't blame the author for not figuring out all the necessary incantations to do so, as these tools can be fiddly if you don't use them every day.)

My thresholds: a) over 50 lines b) when you want to fork a pipe (get two different results in one pass over the data)
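For what it's worth, forking a pipe is doable in bash, just fiddly, which may be the point. Below is one possible sketch using `tee` and a named pipe. The helper name `count_and_extract` and the `ERROR` pattern are assumptions for the example. A named pipe is used instead of `tee >(...)` so the function can wait for the second branch to finish before returning.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads stdin once, prints the total line count on
# stdout, and writes the lines containing "ERROR" to the file named by $1.
count_and_extract() {
    local errors_file="$1"
    local fifo_dir grep_pid
    fifo_dir="$(mktemp -d)"
    mkfifo "${fifo_dir}/branch"

    # Second branch: a background grep reading from the named pipe.
    grep 'ERROR' < "${fifo_dir}/branch" > "${errors_file}" &
    grep_pid=$!

    # First branch: tee copies stdin into the named pipe and on to wc.
    # tr strips the padding some wc implementations add around the count.
    tee "${fifo_dir}/branch" | wc -l | tr -d ' '

    # grep exits non-zero when nothing matched, which is not an error here.
    wait "${grep_pid}" || true
    rm -r "${fifo_dir}"
}
```

Usage would be something like `some_command | count_and_extract errors.txt`. It works, but the amount of ceremony for "two results from one pass" is a fair argument for the threshold above.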
My rule is one line. I don't want to write any bash script to a file; one-liners are as far as it goes for me.