by AndyKelley 3934 days ago
The problem with this project is that somebody might use it. I argue that if you need to do something in bash that involves more than a sequence of commands, including a simple if statement, then you should switch from bash to a proper scripting language (python, perl, ruby, nodejs, take your pick.)

I disagree. Shell scripts are great for automating system tasks — for sysadminy type stuff, including tasks with high complexity with nested loops and conditionals.

A scripting language like those you mentioned is great if you need to run your script on systems you don't control, or you need it to be cross platform, or you'd describe what you're doing as creating software rather than automating tasks.

No scripting language can ever integrate with the host system quite as well as shell scripts can. Sure, shell scripts make it easy to shoot yourself in the foot, but then can't you say the same of at least perl and ruby?

I don't really see many good use cases for this library though. By the time you'd use it you should probably move on to something else.

> No scripting language can ever integrate with the host system quite as well as shell scripts can.

What is the basis for this? Shell script is a scripting language (more precisely, a set of scripting languages with similar features.) The difference between shell scripting languages and other scripting languages is that the former are optimized around the need to scale down to a convenient line-by-line way to work with the system in a REPL; while the others may support work in a REPL it is not what they are optimized for.

There's no real reason why other scripting languages can't integrate with the system as well as shell languages.

>> No scripting language can ever integrate with the host system quite as well as shell scripts can.

> What is the basis for this? Shell script is a scripting language (more precisely, a set of scripting languages with similar features.) The difference between shell scripting languages and other scripting languages is that the former are optimized around the need to scale down to a convenient line-by-line way to work with the system in a REPL; while the others may support work in a REPL it is not what they are optimized for.

Disagree. The defining characteristic of shell scripting languages is that they are a shell that can be scripted. What's a shell? It's a program designed to be a layer around the OS, exposing all of the capabilities of the OS to the user in a convenient form.

So the only way a non-shell scripting language can become that powerful is to become a shell scripting language.

>What's a shell? It's a program designed to be a layer around the OS, exposing all of the capabilities of the OS to the user in a convenient form.

I think that's what they were getting at when they said:

>optimized around the need to scale down to a convenient line-by-line way to work with the system in a REPL

The only real difference is optimization for the REPL. Many of the things you might consider part of the experience aren't even shell built-ins, they're utilities maintained separately from your shell. PowerShell is a good example of a shell scripting language that has a lot more going on than your traditional shell: native object pipelines, .NET libraries, etc. But the most important thing about it is that it's optimized for the REPL.

In addition to AnimalMuppet's point, shell scripts make no attempt to be cross platform, so you don't get any awkward leaky abstractions when trying to interact with low-level system facilities.

The primitives in shell scripts are the primitives of the operating system. The fundamental building blocks of your language are strings and files and processes, which makes it really convenient to work with strings and files and processes, AKA the operating system.

You can easily invoke platform-specific binaries from within a scripting language. You can even invoke them in a shell if you want. Check out this API: https://docs.python.org/3/library/subprocess.html#subprocess...
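
For anyone who hasn't used the linked API, here is a minimal sketch of its two modes: running a binary directly, and handing a whole command line to the shell. The helper names `run_direct` and `run_in_shell` are made up for illustration, and the shell variant assumes a POSIX `/bin/sh` with `echo` and `wc` on the PATH.

```python
import subprocess
import sys


def run_direct(argv):
    """Run a program directly (no shell involved) and return its stdout as text."""
    result = subprocess.run(argv, stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout


def run_in_shell(command):
    """Hand a command line to the platform shell so pipes and globs work as typed."""
    result = subprocess.run(command, shell=True, stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Direct invocation: arguments are passed as a list, nothing is re-parsed.
    print(run_direct([sys.executable, "-c", "print('hello from a child process')"]))
    # Shell invocation: the pipe is interpreted by the shell, not by Python.
    print(run_in_shell("echo one two three | wc -w"))
```

The list form avoids quoting problems because no shell ever parses the arguments; the `shell=True` form is closer to what you would type at a prompt, at the cost of the usual shell quoting rules.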

If most of your scripting language code is calls to other programs and to the shell, there's a simpler way...

Well of course you can. Scripting languages aren't useless stunted toys. The point is that if that's what most of your program is doing, you'll have a lot more success using a language where that functionality is first class.

Can you give a shell script example of something that perl/python/ruby/nodejs can't do?

No such example; after all, you can spawn a shell subprocess in any of those.

There are just certain types of tasks where you can more clearly express your intent in a shell script. If, for example, you need to spawn a ton of subprocesses, you can do that in any scripting language, but shell is designed from the ground up to launch subprocesses; its most basic purpose is to launch programs.

And so while I can't give an example of something that can only be done in a shell script, here's a shell script that I wrote a few weeks ago that would have been a pain in the neck to express in any other language:

    #!/bin/sh
    cd "$(dirname "$0")"
    for test in *.in; do
        test="$(basename "$test" .in)"
        infile="$test.in"
        outfile="$test.out"
        output="$(./whofrom "$infile" 2>&1)"
        expected="$(cat "$outfile")"
        if [ "$output" != "$expected" ]; then
            echo "Failed test $test"
            echo "Expected: $expected"
            echo "Actual:   $output"
            echo
        fi
        if ! valgrind --error-exitcode=1 --leak-check=full ./whofrom "$infile" 2>/dev/null >/dev/null; then
            echo "Failed test $test"
            valgrind -q --leak-check=full ./whofrom "$infile"
            echo
        fi
    done

I don't think it's a big pain in python. I imagine ruby wouldn't be terrible either:

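    #!/usr/bin/env python3

    import sys, os, subprocess, glob

    # abspath keeps chdir working when the script is run as "python script.py"
    os.chdir(os.path.dirname(os.path.abspath(sys.argv[0])))
    for test in glob.glob("*.in"):
        test = os.path.splitext(test)[0]
        infile = test + ".in"
        outfile = test + ".out"
        # Capture stdout and stderr together as text, like 2>&1 inside $(...);
        # run() does not raise on a non-zero exit status, matching the shell version
        output = subprocess.run(["./whofrom", infile], stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, universal_newlines=True).stdout
        expected = open(outfile, 'r').read()
        # $(...) strips trailing newlines, so strip them here before comparing
        if output.rstrip("\n") != expected.rstrip("\n"):
            print("Failed test", test)
        try:
            devnull = open(os.devnull, 'w')
            subprocess.check_call(["valgrind", "--error-exitcode=1", "--leak-check=full", "./whofrom", infile], stderr=devnull, stdout=devnull)
        except subprocess.CalledProcessError:
            print("Failed test", test)
            # call() rather than check_call(): a failing rerun should not abort the loop
            subprocess.call(["valgrind", "-q", "--leak-check=full", "./whofrom", infile])

(since it's a quick script, I'm ignoring stuff like closing files - they'll get GCd on each iteration anyway)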

Apart from the header, the whole script is pretty much the same when comparing line-by-line.

One issue is the problem of indirect documentation, and it comes down to this one line: "import sys, os, subprocess, glob".

How do you find which module you need? Search on the web may be the best answer. In shell at least you have "man -k".

Once I know the module, how do I get its documentation? Here at least there is an answer: import glob; help(glob);
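
For what it's worth, the same lookup can be done without the interactive prompt. The sketch below pulls the text that `help()` would show into a plain string; `docstring_of` and `full_help_text` are hypothetical helper names, not part of any library. From the command line, `python -m pydoc -k <keyword>` is the closest thing to `man -k`: it searches the synopsis lines of the modules it can find.

```python
import glob
import inspect
import pydoc


def docstring_of(obj):
    """Return the cleaned-up docstring for obj, or an empty string if it has none."""
    return inspect.getdoc(obj) or ""


def full_help_text(obj):
    """Return the page help() would display, as a plain string instead of a pager."""
    return pydoc.render_doc(obj, renderer=pydoc.plaintext)


if __name__ == "__main__":
    print(docstring_of(glob.glob))
    print(full_help_text(glob))
```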

But how good is this documentation? If I do it I get:

    glob(pathname)
    Return a list of paths matching a pathname pattern.
    
    The pattern may contain simple shell-style wildcards a la fnmatch.

Already python is telling me that the "man" documentation is going to be better :-).

> How do you find which module you need? Search on the web may be the best answer.

Experience, stdlib reference, web searches. Same as with bash.

> In shell at least you have "man -k".

I really disagree here. Since you took `glob` as an example, how do you get to the explanation of `for test in *.in`? Go on, try that with "man -k".

> Already python is telling me that the "man" documentation is going to be better

I'm not trying to say python is good here (well, the stdlib documentation on the web is actually pretty good, it's just not easily available from the console). But the idea that bash/man is more discoverable is just wrong... You can find glob under the "Pattern Matching" section of the man page, which is all right, but you need to understand most of the shell's expansion mechanism to know it applies to "for". Then again, "for" itself has a definition that belongs more in CS course material than in a usage guide.
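
One wrinkle of that expansion mechanism is easy to miss: in a POSIX shell, a pattern that matches nothing is left in place as a literal string, so `for test in *.in` still runs once with `test` set to `*.in`. A rough sketch of the difference follows, assuming an `sh` on the PATH; `python_matches` and `shell_loop_items` are illustrative names, and the pattern is interpolated into the shell command unquoted on purpose, so it should only be used with trusted input.

```python
import glob
import os
import subprocess
import tempfile


def python_matches(directory, pattern="*.in"):
    """Return what glob.glob finds for pattern inside directory."""
    return glob.glob(os.path.join(directory, pattern))


def shell_loop_items(directory, pattern="*.in"):
    """Run `for f in PATTERN` under sh inside directory; return the values f takes."""
    # The pattern is deliberately left unquoted so the shell expands it.
    script = 'for f in {}; do printf "%s\\n" "$f"; done'.format(pattern)
    result = subprocess.run(["sh", "-c", script], cwd=directory,
                            stdout=subprocess.PIPE, universal_newlines=True,
                            check=True)
    return result.stdout.splitlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as empty_dir:
        print(python_matches(empty_dir))
        print(shell_loop_items(empty_dir))
```

In an empty directory the Python call returns an empty list, while the shell loop body runs once with the unexpanded pattern.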

I'd argue that that's significantly worse at conveying the intent than the shell script version.

The shell script started as a workflow that was executed repeatedly, manually, from the shell, and then automated. Naturally, the shell script resembles very closely the commands typed in at the shell, with some added code to encode the part of the routine that was executed in the user's head.

The python script doesn't look anything like the routine that was previously entered at the shell. That makes it harder to tell at a glance that this script does the same thing.

Is it horrible? No, absolutely not. If you're working on a team where everyone knows python but not everyone is comfortable with shell scripts, then writing the script in python is the clear best option. But if you're working on a team where everyone is comfortable with both python and shell scripts, the shell script probably wins out.

> Can you give a shell script example of something that perl/python/ruby/nodejs can't do?

Install perl/python/ruby and nodejs -- requiring just the shell to be installed?

Snark aside (it's not only snark - pretty much every system will have something like a posix shell), proper posix portable shell is hard - it's an old gnarly language -- but it's what we've got. And with a bit of discipline and good practices -- it doesn't have to be bad. That said, a lot of real-world shell scripts are bad.

I tend to agree with the overall sentiment; shell isn't a great language. As soon as you start to mix awk (which awk is that, do you need GNU awk?), sed and perhaps a bit of egrep (or grep -E -- are both available? Does it accept only BSD-style parameters?) -- one should consider moving "up".

And for eg: setting up a python package/program -- I'd generally prefer a python script -- hopefully one that handles different file-paths (eg: / vs \ ) and other cross-platform stuff. If you already depend on python, why add dependency on shell?
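
As a small illustration of the file-path point, here is a sketch using the standard `pathlib` module, which picks the separator for you. The helper `render_path` and the `share/myapp/settings.ini` layout are made up for the example.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath


def render_path(path_class, *parts):
    """Join parts with the separator that path_class uses and return a string."""
    return str(path_class(*parts))


if __name__ == "__main__":
    parts = ("share", "myapp", "settings.ini")  # hypothetical install layout
    print(render_path(PurePosixPath, *parts))    # share/myapp/settings.ini
    print(render_path(PureWindowsPath, *parts))  # share\myapp\settings.ini
    print(render_path(Path, *parts))             # whichever the current platform uses
```

The two "pure" classes only manipulate strings, so they can be used to reason about either platform from any machine; `Path` resolves to the flavour of the system the script is running on.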

> Install perl/python/ruby and nodejs -- requiring just the shell to be installed?

I just meant using any one of them, not all of them. And you can pretty much always rely on python and perl being installed.

But my point is that if you're doing something other than a one-off thing, then it belongs to some project and you're probably going to commit that script to that project's codebase. That means it has to be maintained. Do future you and whoever else has to maintain the project a favor and use one of the popular scripting languages that has reasonable syntax and semantics.

Many minimal distributions have only the shell installed - which makes it (still) relevant for provisioning/bootstrap etc.

Personally I'd much rather maintain a shell script than a perl script - but that's just because I know shell better. Maybe shell is the first language people would program in without learning it (js being the second)?

>Install perl/python/ruby and nodejs -- requiring just the shell to be installed?

You can run several of those languages (python for sure, but also perl IIRC) as a shell.

As a system shell? In theory, perhaps. But migrating a typical Linux/BSD distribution away from having any dependency on the shell would be a major undertaking, while some distributions already ship with only shell/busybox.

Set environment variables in the calling shell, because you cannot run Python etc. via "source".
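
A quick sketch of why: a child process gets a copy of the environment, so whatever it sets disappears when it exits. The usual workaround is for the script to print `export` lines that the calling shell then evaluates, e.g. `eval "$(python3 script.py)"`. The function names and the `HN_DEMO_VAR` variable below are invented for the example.

```python
import os
import shlex
import subprocess
import sys


def child_sets_variable(name, value):
    """Let a child Python process set an env var; return what the parent then sees."""
    code = "import os; os.environ[{!r}] = {!r}".format(name, value)
    subprocess.run([sys.executable, "-c", code], check=True)
    # The child changed only its own copy of the environment.
    return os.environ.get(name)


def export_lines(variables):
    """Render variables as shell `export` statements for the caller to eval."""
    return "\n".join("export {}={}".format(key, shlex.quote(value))
                     for key, value in sorted(variables.items()))


if __name__ == "__main__":
    print(child_sets_variable("HN_DEMO_VAR", "1"))  # None unless already set in the parent
    print(export_lines({"EDITOR": "vim", "GREETING": "hello world"}))
```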