by stewartbutler 1673 days ago
I've always had trouble getting `for` loops to work predictably, so my common loop pattern is this:

    grep -l -r pattern /path/to/files | while read -r x; do echo "$x"; done
or the like.

This uses bash `read` to consume the input one line at a time, and each line can be accessed in the loop with variable `$x`. Pipe friendly and doesn't use a subshell so no unexpected scoping issues. It also doesn't require futzing around with arrays or the like.
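
A minimal sketch of a more defensive form of the same loop. With a single variable name, `read` assigns the whole line to it; `IFS=` and `-r` additionally stop it from trimming whitespace or interpreting backslashes. `list_names` and `bracket_lines` are hypothetical names, with `list_names` standing in for the `grep -l -r` call:

```shell
# Hypothetical stand-in for `grep -l -r pattern /path/to/files`:
# emits one file name per line, including one that contains spaces.
list_names() {
    printf '%s\n' 'plain.txt' 'name with spaces.txt'
}

# Read stdin line by line and print each line wrapped in <>.
# IFS= keeps leading/trailing whitespace, -r keeps backslashes literal,
# and quoting "$x" stops the shell from re-splitting the line.
bracket_lines() {
    while IFS= read -r x; do
        printf '<%s>\n' "$x"
    done
}

list_names | bracket_lines
```

Each input line, spaces included, comes out as a single bracketed item.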

One place I do use bash for loops is when iterating over args, e.g. if you create a bash function:

    function my_func() {
        for arg; do
            echo "$arg"
        done
    }
This'll take a list of arguments and echo each on a separate line. Useful if you need a function that does some operation against a list of files, for example.
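
As a rough illustration of why this form is handy for file lists: `for arg` with no `in` clause iterates over `"$@"`, so arguments containing spaces stay intact as long as they are quoted inside the loop. `print_args` and the file names are hypothetical:

```shell
# Hypothetical helper: print each argument on its own line.
# `for arg` without `in ...` is shorthand for `for arg in "$@"`.
print_args() {
    for arg; do
        printf '%s\n' "$arg"
    done
}

# The first argument contains a space but is still one item.
print_args 'one file.txt' 'two.txt'
```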

Also, bash expansions (https://www.gnu.org/software/bash/manual/html_node/Shell-Par...) can save you a ton of time for various common operations on variables.
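
A few of those expansions, as a quick sketch. The path and the variable names are made up for illustration:

```shell
path='/tmp/archive/report.tar.gz'   # hypothetical example value

echo "${path##*/}"                  # drop longest prefix matching */   (file name)
echo "${path%/*}"                   # drop shortest suffix matching /*  (directory)
echo "${path%%.*}"                  # drop longest suffix matching .*   (no extensions)
echo "${path/report/summary}"       # bash only: replace first match of "report"
echo "${undefined_name:-fallback}"  # use a default when the variable is unset or empty
```

None of these spawn an external process, which is where the time savings over `basename`, `dirname` or `sed` come from.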

4 comments

Once I discovered functions are pipeline-friendly, I started pipelining all the things. Almost any `for arg` can be re-designed as a `while read arg` with a function.

Here's what it can look like. Some functions I wrote to bulk-process git repos. Notice they accept arguments and stdin:

  # ID all git repos anywhere under this directory
  ls_git_projects ~/src/bitbucket |
      # filter ones not updated for pre-set num days    
      take_stale |
      # proc repo applies the given op. (a function in this case) to each repo
      proc_repos git_fetch
Source: https://github.com/adityaathalye/bash-toolkit/blob/master/bu...

The best part is sourcing pipeline-friendly functions into a shell session allows me to mix-and-match them with regular unix tools.

Overall, I believe (and my code will betray it) functional programming style is a pretty fine way to live in shell!

> I've always had trouble getting `for` loops to work predictably, so my common loop pattern is this:

for loops were exactly the pain point that led me to write my own shell over 6 years ago.

I can now iterate through structured data (be it JSON, YAML, CSV, `ps` output, log file entries, or whatever) and each item is pulled intelligently rather than having to consciously consider a tonne of dumb edge cases like "what if my file names have spaces in them".

e.g.

    » open https://api.github.com/repos/lmorg/murex/issues -> foreach issue { out "$issue[number]: $issue[title]" }
    380: Fail if variable is missing
    379: Backslashes and code comments
    378: Improve testing facility documentation
    377: v2.4 release
    361: Deprecate `swivel-table` and `swivel-datatype`
    360: `sort` converts everything to a string
    340: `append` and `prepend` should `ReadArrayWithType`

Github repo: https://github.com/lmorg/murex

Docs on `foreach`: https://murex.rocks/docs/commands/foreach.html

Powershell is also a good option nowadays (although a lot of people on HN seem to dismiss it for various, imo rather superficial, reasons).

  PS> (irm https://api.github.com/repos/lmorg/murex/issues) | % { echo "$($_.number): $($_.title)" }
  380: Fail if variable is missing
  379: Backslashes and code comments
  378: Improve testing facility documentation
  377: v2.4 release
  361: Deprecate `swivel-table` and `swivel-datatype`
  360: `sort` converts everything to a string
  340: `append` and `prepend` should `ReadArrayWithType`
Or just

  PS> (irm https://api.github.com/repos/lmorg/murex/issues) | format-table number, title

  number title
  ------ -----
     380 Fail if variable is missing
     379 Backslashes and code comments
     378 Improve testing facility documentation
     377 v2.4 release
     361 Deprecate `swivel-table` and `swivel-datatype`
     360 `sort` converts everything to a string
     340 `append` and `prepend` should `ReadArrayWithType`
Or even `(irm https://api.github.com/repos/lmorg/murex/issues) | select number, title | out-gridview`, which would open a GUI list (with sorting and filtering), but I think that only works on Windows.
The reason I dismissed Powershell was that it doesn't always play nicely with existing POSIX tools, which is very much not a superficial reason :)

Murex aims to give Powershell-style types while still working seamlessly with existing CLI tools. An attempt at the best of both worlds. But I'll let others be the judge of that.

It's also worth noting that Powershell wasn't available for Linux when I first built murex so it wasn't an option even if I wanted it to be.

> I've always had trouble getting `for` loops to work predictably, so my common loop pattern is this:

The problem with the pipe-while-read pattern is that you can't modify variables in the loop, since it runs in a subshell.

It’s a trade-off. I had to use a piped loop earlier this year to extract Fallout: New Vegas mods and textures, since they all had spaces in their file names. For this it was perfect to pipe a list of the names to a loop, but for 99% of things I just use a for loop.
Yup. Nearly everything has tradeoffs.

BTW, the problem I mentioned earlier can be avoided by using `< <()`:

  $ x=1
  $ seq 5 | while read n; do (( x++ )); done
  $ echo $x
  1
  $ while read n; do (( x++ )); done < <(seq 5)
  $ echo $x
  6
Almost makes me wonder what the benefit of preferring a pipe here is. I guess it's just about not having to specify what part of the pipeline is in the same shell.
It’s funny: I’ve been using Linux for a decade and a half, professionally for about half that time, and yet I still go to Python when arithmetic is involved. I’ve been learning a lot about the shell lately. It’s like I did the bare minimum with bash just to be able to run programs and slightly automate things, and it took this long for it to click with me that it’s a productive programming language in its own right (and probably faster than Python).
I use "python calc" for quick calculations at the command line:

  pc () { python3 -c "print($*)" ; } 

  $ pc 3.5**2.4
  20.219169193375105
Or "awk calc", which seems much faster (and both ^ and ** work for powers):

  calc () { awk "BEGIN{print $*}" ; }

  $ calc 3.5**2.4
  20.2192
With the original Bourne shell, the `expr` command was used for arithmetic.

The POSIX shell allows the `$(( ))` form for native shell arithmetic, but not the `(( ))` alternative found in bash and the Korn shell.
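
A small sketch contrasting the two portable forms, assuming any POSIX `sh`. The variable names are only illustrative:

```shell
count=1
count=$((count + 1))     # POSIX arithmetic expansion, evaluated by the shell itself
legacy=$(expr 2 + 3)     # Bourne-era style: expr runs as a separate program
echo "count=$count legacy=$legacy"
```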

BTW you can make it work in bash by setting `shopt -s lastpipe`. It runs the last part of the pipeline in the main shell, so mutations of variables will persist.
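
Roughly what that looks like in a script. One caveat worth stating as an assumption about the environment: `lastpipe` only takes effect when job control is off, which is the default for scripts; in an interactive shell you would need `set +m` first. The variable names are illustrative:

```shell
#!/usr/bin/env bash
# lastpipe: run the final command of a pipeline in the current shell.
# Only active when job control is off (the default in scripts).
shopt -s lastpipe

total=0
printf '%s\n' 1 2 3 4 5 | while read -r n; do
    (( total += n ))     # modifies the script's own variable, not a subshell copy
done

echo "$total"
```

Without the `shopt` line, `total` would still be 0 after the loop.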

Both OSH and zsh behave that way by default, which was tangential to a point in the latest release notes https://news.ycombinator.com/item?id=29292187

Another trick I've seen in POSIX shell is to wrap everything after the pipe in an explicit subshell that extends to the last place you want to read the variable. Like

    cat foo.txt | ( while read line; do
      f=$line
    done

    echo "we're still in the subshell f=$f"
    )
For me I always used for loops and only recently (after a decade of using Linux daily) have learned about the power of piped-loops. It’s strange to me you are more comfortable with those than for loops, but I think it does make sense as you’re letting a program generate the list to iterate over. A pain point in for loops is getting that right, e.g. there isn’t a good way to iterate over files with spaces in them using a for loop (this is why I learned about piped loops recently.)
> A pain point in for loops is getting that right, e.g. there isn’t a good way to iterate over files with spaces in them using a for loop

If those files came as arguments, you can use a for-loop as long as they're kept in an array:

  for f in "${files[@]}";
That handles even newlines in the filenames, while I'm not sure if you can handle that with a while-read-loop. IFS=$'\0' doesn't seem to cut it.
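
For what it's worth, a while-read loop can cope with newlines in names if the input is NUL-delimited and `read` is told so with `-d ''` (bash-specific). A sketch, assuming `find` supports `-print0`; the temporary directory and file names are made up:

```shell
# Throwaway directory with awkward names (a space and a newline).
demo_dir=$(mktemp -d)
touch "$demo_dir/plain" "$demo_dir/has space" "$demo_dir/has"$'\n'"newline"

found=0
# -d '' makes read stop at NUL bytes instead of newlines;
# IFS= and -r keep whitespace and backslashes untouched.
while IFS= read -r -d '' f; do
    found=$((found + 1))
    printf 'got: %q\n' "$f"
done < <(find "$demo_dir" -type f -print0)

echo "files seen: $found"
rm -rf "$demo_dir"
```

The `< <(...)` form keeps the loop in the current shell, so `found` is still set afterwards.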

for-loops seem preferable for working with filenames. If a command is generating the list, then something like `xargs -0` is preferable.

My problem was that I had a directory with probably 200+ subdirectories, and each one, along with the files and subdirectories below it, had a couple of spaces in its name. I typically use

    for f in `ls`;
for operations like that, but the mod was obviously built on Windows (I run Steam on Ubuntu), and I never interact with Windows, so tbh I had never thought of this problem before.
The GNU way for handling files that have inconvenient characters in their names is:

    find ... -print0 | xargs -0 ...
It makes all the problems go away.
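
A concrete instance of that pattern, as a sketch. The directory, file names and the choice of `basename` as the per-file command are all just for illustration:

```shell
work_dir=$(mktemp -d)
touch "$work_dir/one file.txt" "$work_dir/two.txt"

# -print0 and -0 pass names NUL-terminated, so spaces and newlines in
# names are never treated as separators. -n 1 runs basename once per file.
listing=$(find "$work_dir" -type f -name '*.txt' -print0 | xargs -0 -n 1 basename | sort)

printf '%s\n' "$listing"
rm -rf "$work_dir"
```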
Also you can use readarray to store the found filenames in a bash array (to use with a for loop).
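
Something along these lines, assuming bash 4.4 or newer (which is when `readarray` gained `-d`). The directory and file names are hypothetical:

```shell
scratch_dir=$(mktemp -d)
touch "$scratch_dir/a b" "$scratch_dir/c"

# -d '' splits the input on NUL bytes; -t strips the delimiter from each element.
readarray -d '' -t files < <(find "$scratch_dir" -type f -print0)

# Quoted "${files[@]}" expands to one word per array element.
for f in "${files[@]}"; do
    printf 'file: %s\n' "$f"
done

echo "count: ${#files[@]}"
rm -rf "$scratch_dir"
```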
You can also have your script change bash's internal field separator (`$IFS`).

https://bash.cyberciti.biz/guide/$IFS
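
A sketch of that approach, with the caveat that it only helps with spaces: names that themselves contain newlines would still be split. The list contents are made up:

```shell
# Hypothetical newline-separated list; one name contains a space.
names=$(printf '%s\n' 'first name.txt' 'second.txt')

old_ifs=$IFS
IFS=$'\n'      # split unquoted expansions on newlines only
set -f         # turn off globbing while $names is expanded unquoted

seen=0
for n in $names; do
    printf 'item: %s\n' "$n"
    seen=$((seen + 1))
done

set +f         # restore globbing
IFS=$old_ifs   # restore the previous separator
```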

Something I wish I'd learned 23 years ago instead of 3 years ago :(