| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by joosters 3717 days ago

Once again, you're getting too hung up on my dumb little example, which I spent exactly 0 seconds thinking about. It's the general problem that's interesting (and annoying), the 'command1 | command2' general case.

If you want a difficult example, then take a more real-world example: e.g. the workflow of a 'find [some stuff] |grep [some other stuff]' is one to consider. That's where horrid workarounds like -print0 and -z have to come in, but the simple 'find|grep' works fine up until a file has a space in it.

As I said, there's no simple fix, even for the re-organised form of 'grep [some other stuff] `find [some stuff]` because the shell can't tell the difference between a filename and just a stream of text in the output of one program.

2 comments

kps 3717 days ago

The (ex) AT&T Research command tw(1) has pretty much replaced find(1) for me (particularly with some canned search selectors for particular projects).

link

joosters 3717 days ago

I'm struggling to find any information on this command (it's not an easy name to search for!) Do you have any links you could share please?

link

kps 3717 days ago

Apologies, I forgot to include a link: The toolkit is now at https://github.com/att/ast since AT&T laid off the group a couple years ago.

About half the package consists of evolutions of traditional Unix commands. The parts I use regularly are ksh and tw. tw ('tree walk') is sort of a 'find --exec' replacement with a C-like selector syntax. It's a bit verbose, so for interactive use I generally set up project-specific shell aliases with selection expressions, e.g.

  alias cctw=$'tw -e "select: return (type == REG) && ((name == \'*.c\') || (name == \'*.h\') || (name == \'*.cpp\') || (name == \'*.cc\') || (name == \'*.h\') || (name == \'*.hpp\') || (name == \'*.mm\') || (name == \'*.inc\'));" '

and then use those, e.g.

  cctw egrep -w MyIdentifier

link

joosters 3716 days ago

Thanks, this looks interesting!

link

txutxu 3717 days ago

If I could need to combine a find|grep right now (this is, if the directory recursion and filters of grep by itself, weren't enough, which maybe a corner case too...) I could do it like this:

    while IFS= read -rd '' file; do
      echo "do whatever with: $file"
      grep whatever -- "$file"
    done < <(find ~/whatevers -print0)

It's like natural language if you do it daily.

Will handle not only spaces, but also

new

lines

file

names.

Have a nice day.

link

joosters 3717 days ago

You miss my point. Of course, for every example I give, it's possible to build a workaround to handle the spaces. My point is, it's the very fact that you need a workaround that makes it so irritating.

link

txutxu 3716 days ago

For me, the code I did give, is not a workaround.

It's the canonical way of do it.

Other ways, even if they are "expected to work by inexperienced occasional users"... are simply flawed a first eye view.

A workaround is to ditch shell script, as soon as you face a problem, and blame shell script, and turn to do it in a "more advanced language" that has the same or more caveats. That could be a workaround.

Delimiting file names with null bytes, in case they could be split by any of the $IFS values, is NOT a workaround, is pure logic.

link