Hacker News
by stouset 729 days ago
I love this example, because it highlights how absolutely cursed shell is if you ever want to do anything correctly or robustly.

In your example, newlines and spaces in your filenames will ruin things. Better is

    find … -print0 | while read -r -d $'\0'; do …; done
This works in most cases, but it can still run into problems. Let's say you want to modify a variable inside the loop (this is a toy example, please don't nit that there are easier ways of doing this specific task).

    declare -a list=()

    find … -print0 | while read -r -d $'\0' filename; do
        list+=("${filename}")
    done
The variable `list` isn't updated at the end of the loop, because the loop is done in a subshell and the subshell doesn't propagate its environment changes back into the outer shell. So we have to avoid the subshell by reading in from process substitution instead.

    declare -a list=()

    while read -r -d $'\0' filename; do
        list+=("${filename}")
    done < <(find … -print0)
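
To make the difference concrete, here is a minimal sketch that runs both loops side by side. The `emit` helper is hypothetical and only stands in for `find … -print0`, so the example runs without touching the filesystem; it assumes bash with default options (no `lastpipe`).

```shell
#!/usr/bin/env bash
# emit is a hypothetical stand-in for `find … -print0`:
# two NUL-terminated names, one with a space and one with a newline.
emit() { printf 'a b\0c\nd\0'; }

declare -a piped=() substituted=()

# Pipeline: the loop runs in a subshell, so its appends are lost.
emit | while IFS= read -r -d '' filename; do
    piped+=("${filename}")
done

# Process substitution: the loop runs in the current shell.
while IFS= read -r -d '' filename; do
    substituted+=("${filename}")
done < <(emit)

echo "piped=${#piped[@]} substituted=${#substituted[@]}"
# prints: piped=0 substituted=2
```
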
Even this isn't perfect. If the command inside the process substitution exits with an error, that error will be swallowed and your script won't exit even with `set -o errexit` or `shopt -s inherit_errexit` (both of which you should always use). The script will continue on as if the command inside the subshell succeeded, just with no output. What you have to do is read it into a variable first, and then use that variable as standard input.

    files="$(find … -print0)"
    declare -a list=()

    while read -r -d $'\0' filename; do
        list+=("${filename}")
    done <<< "${files}"
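
One catch with the variable version: bash variables can't hold NUL bytes, so the command substitution drops the delimiters and the names arrive fused together. The sketch below shows that, then tries a temporary file as one possible workaround, which keeps the NULs and still lets `errexit` see a failing command. The `emit` helper is hypothetical and only stands in for `find … -print0`; the temp-file approach is my own substitution, not something from the examples above.

```shell
#!/usr/bin/env bash
set -o errexit
# emit is a hypothetical stand-in for `find … -print0`.
emit() { printf 'a b\0c\nd\0'; }

# Command substitution drops NUL bytes (newer bash warns on stderr, silenced here).
{ files="$(emit)"; } 2>/dev/null
printf 'captured: %q\n' "${files}"

# With no NUL left, read hits EOF before a delimiter and the body never runs.
declare -a broken=()
while IFS= read -r -d '' filename; do
    broken+=("${filename}")
done <<< "${files}"

# Possible workaround: stage the output in a temp file instead of a variable.
tmp="$(mktemp)"
trap 'rm -f "${tmp}"' EXIT
emit > "${tmp}"   # a failing command here stops the script under errexit

declare -a list=()
while IFS= read -r -d '' filename; do
    list+=("${filename}")
done < "${tmp}"

echo "broken=${#broken[@]} list=${#list[@]}"
# prints: broken=0 list=2
```
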
I think there's an alternative to this that lets you keep the original pipe version when `shopt -s lastpipe` is set, but I couldn't get it to work with a little experimentation.
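
For what it's worth, `lastpipe` only takes effect when job control is off, which is the default in a script but not at an interactive prompt, so that may be why the experiment didn't pan out. A minimal sketch, assuming it runs as a script; `emit` is a hypothetical stand-in for `find … -print0`, and `pipefail` is my addition so that a failure on the left side of the pipe fails the pipeline.

```shell
#!/usr/bin/env bash
set -o errexit -o pipefail
shopt -s lastpipe   # last pipeline stage runs in the current shell (job control off)

# emit is a hypothetical stand-in for `find … -print0`.
emit() { printf 'a b\0c\nd\0'; }

declare -a list=()
emit | while IFS= read -r -d '' filename; do
    list+=("${filename}")
done

echo "${#list[@]}"
# prints: 2
```

Note that the loop still processes whatever was emitted before a failure is noticed; the non-zero status only shows up once the pipeline finishes.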

Also be aware that in all of these, standard input inside the loop is redirected. So if you want to prompt a user for input, you need to explicitly read from `/dev/tty`.
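
An alternative to `/dev/tty`, sketched here under assumptions, is to feed the loop on a separate file descriptor so standard input stays free for prompts. The `emit` helper and the `confirm_each` function are hypothetical names; the here-string at the bottom only simulates a user typing answers.

```shell
#!/usr/bin/env bash
# emit is a hypothetical stand-in for `find … -print0`.
emit() { printf 'a b\0c d\0'; }

declare -a answers=()

# confirm_each is a made-up helper: filenames arrive on fd 3, answers on stdin.
confirm_each() {
    local filename reply
    while IFS= read -r -d '' -u 3 filename; do
        # This read still sees the caller's stdin, not the file list.
        IFS= read -r -p "Keep ${filename}? " reply
        answers+=("${filename}=${reply}")
    done 3< <(emit)
}

# Simulated user input; drop the here-string to prompt on a real terminal.
confirm_each <<< $'y\nn'
printf '%s\n' "${answers[@]}"
# prints:
# a b=y
# c d=n
```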

My point with all this isn't that you should use the above example every single time, but that all of the (mis)features of shell compose extremely badly. Even piping to a loop causes weird changes in the environment that you now have to work around with other approaches. I wouldn't be surprised if there's something still terribly broken about that last example.

1 comments

You have really proven your point even more than you meant to. Unfortunately none of these examples are robust.

The "-r" flag allows backslash escaping record terminators. The "find" command doesn't do such escaping itself, so that flag will cause files with backslashes at the end to concatenate themselves with the next file.

Furthermore, if IFS='' is not placed before each instance of read, or set somewhere earlier in the program, then leading and trailing whitespace in each filename will be stripped.

EDIT: I proved your point even more. The "-r" flag does the opposite of what I thought it did, and disables record continuation. So the correct way to use read would be with IFS='' and the -r flag.
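
A small sketch of what each piece does, using one made-up record rather than real `find` output. As far as I can tell, in bash the missing `IFS=` strips leading and trailing whitespace rather than collapsing inner runs, and the missing `-r` eats the backslash.

```shell
#!/usr/bin/env bash
# Made-up record: padded with spaces, with an inner run of spaces and a backslash.
record='  a  b\c  '

# No -r, default IFS: backslash is consumed and the padding is stripped.
read -d '' plain < <(printf '%s\0' "${record}")

# -r only: backslash survives, padding is still stripped.
read -r -d '' raw < <(printf '%s\0' "${record}")

# IFS= and -r: the record comes through byte for byte.
IFS= read -r -d '' exact < <(printf '%s\0' "${record}")

printf '[%s]\n' "${plain}" "${raw}" "${exact}"
# prints:
# [a  bc]
# [a  b\c]
# [  a  b\c  ]
```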

Love it. And I wouldn’t be surprised in the least if even this fell apart in some scenarios too.