by alganet 361 days ago

If the script does a lot of unstreamable replacements, you're right. But there are still ways out of bash.

I prefer no fork, no bashism, reusable functions:

    set -euf

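    # Replace the first occurrence of $2 in $1 with $3; result goes to REPLY.
    # Returns non-zero when $2 does not occur in $1.
    replace () {
        local t=
        REPLY=$1
        # t = everything after the first occurrence of $2
        t="${REPLY#*"$2"}"
        # if nothing was stripped, there was no match
        test "$t" != "$REPLY" || return
        # cut "$2$t" off the end, then append the replacement and the tail
        REPLY="${REPLY%"$2$t"}$3$t"
    }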

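    # Replace every occurrence of $2 in $1 with $3; result goes to REPLY.
    # Caveat: each pass rescans from the start, so this never terminates
    # if the replacement $3 itself contains $2.
    replace_all () {
        REPLY=$1
        shift
        while replace "$REPLY" "$@"; do :; done
    }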

    input="foo bar foo baz"
    replace_all "$input" "foo" "HELLO"
    echo "$REPLY"

Not exactly easy to write, but now that they're functions, it doesn't matter since I can reuse them.

Regarding performance, it is slower than bash, but not significantly.

Times for 1000 calls:

    9.908s -- sed (one fork per replacement)
    0.015s -- bash ${x//str/replace}
    0.088s -- sh replace_all

Times for 50,000 calls:

    0.351s -- bash ${x//str/replace}
    0.631s -- sh replace_all

Also, you can get further performance by inlining replace inside replace_all instead of having one call the other.
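A minimal sketch of what inlining replace inside replace_all could look like. The name `replace_all_inline` and the `rest`/`tail` variables are my own for illustration, not part of the functions above, and I have not benchmarked this variant. It walks the input left to right and builds REPLY piece by piece, so it also does not rescan text it has already replaced.

```shell
set -euf

# Hypothetical inlined variant: $1 = input, $2 = search, $3 = replacement.
# The result is left in REPLY, like the functions above.
replace_all_inline () {
    local rest tail
    rest=$1
    REPLY=
    # An empty search string can never match; hand back the input unchanged.
    if test -z "$2"; then
        REPLY=$1
        return 0
    fi
    while :; do
        # tail = everything after the first occurrence of $2 in rest
        tail="${rest#*"$2"}"
        # nothing stripped means no match is left
        test "$tail" != "$rest" || break
        # append the text before the match, then the replacement
        REPLY="$REPLY${rest%"$2$tail"}$3"
        rest=$tail
    done
    # append whatever follows the last match
    REPLY="$REPLY$rest"
}

replace_all_inline "foo bar foo baz" "foo" "HELLO"
printf '%s\n' "$REPLY"   # prints: HELLO bar HELLO baz
```

Because the loop only ever looks at the unprocessed remainder, a replacement that contains the search string (say, "a" replaced by "aa") terminates here.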

Note that I could have done several replacements inside a single sed pipe, but I decided to measure it the way you suggested, `x=$(echo $x | sed s/str/replace/)`. The same goes for my functions: one invocation per replacement (in fact, they are tuned for that scenario).

sed can absolutely beat the shell in scenarios where you can make one fork do lots and lots of replacements. It depends on the scenario, and how proficient you are in writing sed (which can do branching, keep state, all sorts of things portably).
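To illustrate the "one fork, many replacements" case, here is a small sketch. The input text and the second substitution (bar to WORLD) are made up for the example. A single sed process applies both expressions to every line, so the fork cost is paid once no matter how many lines or substitutions there are.

```shell
# Hypothetical multi-line input; a single sed invocation handles all of it.
input="foo bar foo baz
bar foo"

# Two substitutions, one process.
result=$(printf '%s\n' "$input" | sed -e 's/foo/HELLO/g' -e 's/bar/WORLD/g')

printf '%s\n' "$result"
# prints:
# HELLO WORLD HELLO baz
# WORLD HELLO
```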

From an architectural point of view, it makes sense to have a simpler `sh` and keep a sort of standard library of functions, instead of feature creeping the interpreter with weird arcana. It makes shells like dash easier to maintain, easier to debug, easier to port and safer.