by twooster 1863 days ago
You should almost always be running Bash in `-e` (exit-on-error) mode. This necessitates precisely the construct mentioned in this article.

For example:

  set -e

  var="$( false )"
  if [ $? -eq 0 ] ; then
    echo Ok: "$var"
  else
    echo Not ok: $?
  fi
If you run this program, neither "Ok" nor "Not ok" will be echoed, because the program will exit with an error on the `var=` line. (Not to mention that the $? on the not-ok line wouldn't work anyway: it would be the exit code of the `[` test command in the conditional, not the exit code of the captured subshell command.)

Instead the following will work:

  set -e

  if var="$( false )" ; then
    echo Ok: "$var"
  else
    echo Not ok: $?
  fi
Note that this will _not_ work:

  if ! var="$( false )"; then
    echo Not ok: $?
  fi
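Your output will be "Not ok: 0". This is because the `!` negation replaces the exit code of the command it precedes: by the time the branch runs, $? holds the negated status (0), not the status of the command substitution.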
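If you need the real exit code in the failure path, one way (a minimal sketch, not the only option) is to record it through `||` before testing anything. The names `rc` and `msg` are just placeholders for illustration.

```shell
#!/usr/bin/env bash
set -e

# Record the status through "||" so that neither errexit nor "!" can clobber it.
rc=0
var="$(false)" || rc=$?

if [ "$rc" -eq 0 ]; then
  msg="Ok: $var"
else
  msg="Not ok: $rc"
fi
echo "$msg"
```

Because the assignment sits on the left of `||`, errexit does not fire there, and `rc` still holds the status of the command substitution when the `if` runs.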

No that is dangerous, consider this:

    set -e
    myfunc() {
      date %x  # syntax error; returns 1, should be +%x
      echo 'should not get here'
    }

    if var=$(myfunc); then
      echo $var   # erroneously prints 'should not get here'
    else
      echo failed
    fi
Then you will ignore failure, which is bad.

This is a variant of the issue that the sibling comment brought up -- error handling is disabled inside "if" conditions.
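For plain bash, a workaround that is sometimes used (a sketch under my own assumptions, not something proposed in this thread) is to keep the call out of the `if` condition entirely: suspend errexit in the parent, re-enable it inside the command substitution, and test the saved status afterwards. `false` stands in for the failing `date %x` call, and `status` and `result` are placeholder names.

```shell
#!/usr/bin/env bash
set -e

myfunc() {
  false                        # stand-in for the failing `date %x` call
  echo 'should not get here'
}

# Suspend errexit in the parent only, then re-enable it inside the
# command substitution, which is not part of an "if" condition here.
set +e
var="$(set -e; myfunc)"
status=$?
set -e

if [ "$status" -eq 0 ]; then
  result="ok: $var"
else
  result="failed with status $status"
fi
echo "$result"
```

It works, but it is exactly the kind of ceremony that is easy to forget, which is the point being made above.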

In Oil the whole construct is unconditionally disabled by strict_errexit. It's too subtle.

Oil has 2 ways of capturing the exit code, including

    run --assign :status -- mycommand  # exit 0 but assign the status to a var
and

    shopt --unset errexit {  # explicitly disable error handling if you want
      mycmd
      var status = $?
    }

I'm looking for feedback to make sure that Oil has indeed fixed all of this stuff: https://www.oilshell.org/blog/2020/10/osh-features.html

Basically the situation is "damned if you do and damned if you don't" in Bourne shell, so you need language/interpreter changes to really fix it. The rules are too tricky to remember even for shell experts -- there are persistent arguments on POSIX behavior that is over 20 years old, simply because it's so confusing.

https://github.com/oilshell/oil/wiki/Where-To-Send-Feedback

> The rules are too tricky to remember even for shell experts -- there are persistent arguments on POSIX behavior that is over 20 years old, simply because it's so confusing.

I don't know, I find it easier to not use set -e. I find it significantly easier to just explicitly handle all my errors. Having my script exit at some arbitrary point is almost never desirable.

I find chaining && and || pretty intuitive.

  var=$(myfunc) &&
    echo OK ||
    {
      echo Not OK
      exit 1
    }
This is pretty contrived. I'd probably put the error handling in a function and then only handle the failure scenario:

  var=$(myfunc) || die 'Not OK'
  echo OK
I never run into problems, this always works as expected, I don't need any language or interpreter changes to fix it. Once you realize `if` is just syntactic sugar and [ is just `test` then the world gets pretty simple.
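For completeness, `die` is not a builtin. A minimal sketch of such a helper could look like the following; the message format is an assumption, and `myfunc` is only a stand-in for a failing command. The pattern is run inside a command substitution here purely so the demonstration does not terminate the surrounding script.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not defined in the thread: report and exit non-zero.
die() {
  printf 'error: %s\n' "$*" >&2
  exit 1
}

myfunc() { return 3; }         # stand-in for a failing command

# Run the pattern inside a command substitution so that die's exit ends
# only that subshell, not this demonstration script.
status=0
out="$( { var=$(myfunc) || die 'Not OK'; echo OK; } 2>&1 )" || status=$?
echo "$out (status $status)"
```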
The rules without -e are definitely less hairy than the rules with -e.

Are you regularly writing shell scripts that check all their errors? I'd be curious to take a look if any of them are public.

It's not impossible to do this -- git's shell scripts seem to do a decent job. However I think the vast majority of shell scripts don't check errors, so "set -e" is "closer to right", if not right. (And I claim it's nearly impossible to be "right" with state of the art with -e -- fixes in the shell itself are needed.)

I'll also note that Alpine Linux's apk package manager switched to "set -e" a few years ago. They are shell experts and even they found it difficult to check all their errors without it. apk is more than 1000 lines of shell IIRC.

I'm not the author of the parent comment, but I don't like to use -e for similar reasons, and I do have public bash scripts written with explicit error handling [1].

I've tried using -e on multiple occasions, but always had to disable it again because it led to even worse classes of errors. It hurt me more than it helped, but I also consider my explicit error handling quite pedantic and probably not the norm.

I feel that instead of fixing the problem it just shifts it around, because an exit code != 0 does not generally indicate an error:

* `((i++))` must now be written as `((i++)) || true` or it will probably kill your script, and short hand conditions become foot-guns:

* Statements such as `[[ $i -gt 3 ]] && continue` must now be written in long form in their own ifs, or `|| true` must be appended.

* Want to save the return code? Guess you have to use `command && ret=$? || ret=$?` now.

* Subshells with errors may exit independently and won't notify you of the error at all, because your parent is still alive afterwards.
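A small sketch of the first and third points, assuming `bash` is on the PATH. Each case runs in a child bash so that an errexit abort cannot end the outer script; the variable names are placeholders.

```shell
#!/usr/bin/env bash
# Each case runs in a child bash so that an errexit abort cannot end this script.

# i++ yields the old value 0, so (( )) returns status 1 and errexit fires.
unguarded="$(bash -c 'set -e; i=0; ((i++)); echo "reached i=$i"' || true)"

# The guard swallows the non-zero status and the script continues.
guarded="$(bash -c 'set -e; i=0; ((i++)) || true; echo "reached i=$i"')"

# Saving a return code with the && / || idiom from the list above.
saved="$(bash -c 'set -e; false && ret=$? || ret=$?; echo "ret=$ret"')"

printf '%s\n' "unguarded='$unguarded'" "guarded='$guarded'" "saved='$saved'"
```

The unguarded variant produces no output at all, because the child shell exits before it reaches the `echo`.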

The list goes on and on. Lots of strange edge cases appear. I recommend reading http://mywiki.wooledge.org/BashFAQ/105 which highlights some of them.

In the end, set -e requires you to think just as much as not using it to not randomly crash your program. It just shifts the problem to other statements. I already learned to handle errors in normal bash, and the only thing set -e requires me to do is to change my error handling style in those cases, which to me generally seem more obscure.

If I still have to use set -e, I always use something like the following snippet, so that I at least get a message:

  set -e
  function eerr() {
    echo "error: $0:$1: command failed but status was never checked!"
  }
  trap 'eerr "$LINENO"' ERR

[1] https://github.com/oddlama/gentoo-install/blob/develop/scrip...

Yeah I'm not surprised by that and don't disagree with it: as mentioned I think neither situation is ideal!
That is a pretty subtle and nasty sharp edge! I consider myself quite proficient with bash and best practices, and it still took me a moment to think through this and understand where things went 'wrong' in your example.

Thanks for your work on Oil shell, I don't yet use it regularly but hope it becomes mainstream. I'm definitely rooting for you, chubot!

It's worse than that actually, `set -e`/"errexit" is disabled for function calls in ifs. Meaning this:

    set -e
    fn() {
        false
        echo "didn't exit"
    }
    if fn; then
        echo "fn() succeeded"
    fi
will output

    didn't exit
    fn() succeeded
Yep, likewise it’s also disabled in function calls and subshells that are invoked in an && or || list (i.e., in the above case, if you change the if statement to “fn && echo ...” you’ll see the same behavior).

Even worse, you can add the line “set -e” inside the function explicitly re-enabling it and it still won’t change the outcome because errexit wasn’t technically unset!
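A quick sketch of both points together. The `log` variable is only there so the result can be inspected; it is not part of the original example.

```shell
#!/usr/bin/env bash
set -e

log=""
fn() {
  set -e                       # no effect: errexit is already set, only ignored here
  false
  log="${log}didn't exit;"
}

# fn runs on the left of &&, so errexit is ignored for everything inside it.
fn && log="${log}fn() succeeded"
echo "$log"
```

Both messages end up in `log`, even though `false` fails with errexit nominally on and `set -e` is repeated inside the function.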

Yes, Oil fixes this with strict_errexit:

sibling comment: https://news.ycombinator.com/item?id=27166719

blog post: https://www.oilshell.org/blog/2020/10/osh-features.html

It needs some official documentation, but if you download Oil and see any other problems I'm interested! I think I fixed all of them.

https://github.com/oilshell/oil/wiki/Where-To-Send-Feedback

`set -e` has a couple of surprising corner cases and in the details is pretty hard to understand. The documentation in `man bash` for the `set -e` flag is 28 lines in my terminal, and the other flags are 2 or 3 lines.

One such corner case is in pipelines. Only the last command in a pipeline can cause the script to terminate.

Another corner case is `foo && bar`, often used as an abbreviated `if`, will not exit when `foo` fails.

It is not a significant task to just add `|| exit $?` after any command whose failure should cause an abort.

My scripts tend to start with

  set -euo pipefail

This addresses the pipeline case you mention and also catches use of uninitialized variables.
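To illustrate what each flag changes, here is a minimal sketch, assuming `bash` is on the PATH. Every case runs in a child bash so an abort cannot end the outer script; `plain`, `strict`, `unset_ref` and `never_set` are placeholder names.

```shell
#!/usr/bin/env bash
# Each case runs in a child bash so that an abort cannot end this script.

# errexit alone: only the last command of the pipeline counts, so this survives.
plain="$(bash -c 'set -e; false | true; echo survived')"

# With pipefail the failing first stage makes the pipeline fail, and errexit aborts.
strict="$(bash -c 'set -euo pipefail; false | true; echo survived' || true)"

# nounset (-u) aborts on a reference to a variable that was never set.
unset_ref="$(bash -c 'set -u; echo "$never_set"; echo survived' 2>/dev/null || true)"

printf '%s\n' "plain='$plain'" "strict='$strict'" "unset_ref='$unset_ref'"
```

Only the first case reaches its `echo survived`; the other two child shells exit early and leave their variables empty.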

Unfortunately Ubuntu's new bash replacement "dash" doesn't support "-o pipefail" and will error if it's present.
dash is not a bash "replacement". dash is an implementation of the Bourne/POSIX shell.

bash, when called as sh, also implements the Bourne shell (but badly, because it leaks bash-isms); when bash is called as bash it is a superset of Bourne.

* https://en.wikipedia.org/wiki/Almquist_shell#dash

* https://en.wikipedia.org/wiki/Bourne_shell

* https://en.wikipedia.org/wiki/Comparison_of_command_shells

If you want to use super-set functionality adjust your shebang accordingly.
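As a rough sketch of what that can look like in practice: request bash in the shebang, and optionally guard against the file being run by some other shell anyway (for example via `sh script`). The `BASH_VERSION` check and the `mode` variable are my own illustration, not something from the linked pages.

```shell
#!/usr/bin/env bash
# The shebang above requests bash explicitly instead of /bin/sh.
# If the file is still run by another shell (for example `sh script`),
# fall back to the options a plain POSIX shell is expected to accept.
if [ -n "${BASH_VERSION:-}" ]; then
  set -euo pipefail
  mode="bash: errexit, nounset, pipefail"
else
  set -eu
  mode="sh: errexit, nounset"
fi
echo "$mode"
```

Whether a silent fallback or a hard failure is better depends on the script; failing loudly is arguably safer when later code relies on bash-only features.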

It replaced what they were using previously, which was bash. I am not saying that it is a superset of bash's functionality or that it should be expected to support bash features.

EDIT: I see what you are saying now. Bash is still installed by default despite it not being aliased to /bin/sh. So it's still possible to rely on bash features if you use it explicitly. For some reason I was under the impression bash had also been aliased to dash in the default installation. Thanks for the information.

To be more precise, dash replaced /bin/sh. Ubuntu still includes bash in the default installation, IIRC.
For simple cases, I'll do that, but for longer commands, I've taken to doing

  rc=0
  long command || rc=$?
  if (( rc == 0 ))...