by q_revert 5036 days ago
I was expecting to see the first comment in here complaining about his use of 'cat', as in all of the examples the second command could've easily taken a filename argument..

  sort order.*
is surely more elegant than

  cat order.* | sort
which is fair enough, however, as it happens, i generally do end up using 'cat' in the way he's used it.. for such small jobs nobody can be genuinely worried about the overhead, and it comes down to a matter of taste..

personally, i find that using 'cat output | $command' helps to separate out the 'logic' of what i'm doing, if that makes sense..

also, again, purely as a matter of taste

i'd prefer

  egrep 'Hardcover|Kindle'
over

  grep "\(Kindle\|Hardcover\)"
EDIT: (as alexfoo has pointed out, this isn't a proper AND as it worries about the order.. my bad, still useful though :D )

and as a sidenote, something i only found recently, but which is quite useful, a logical AND with egrep looks like

  egrep 'Hardcover.*Kindle'

[ EDIT - Two replies as original post had been edited by the time I posted the first. ]

> and as a sidenote, something i only found recently, but which is quite useful, a logical AND with egrep looks like
>
>   egrep 'Hardcover.*Kindle'

That's not a true logical AND since it won't pick up an entry with the text "Kindle Hardcover". Only entries with the word "Hardcover" eventually followed by "Kindle". To cover both cases you'd need:-

  egrep 'Hardcover.*Kindle|Kindle.*Hardcover'
(Of course, someone will now show how this can be done in even fewer characters).
How about:

    grep Hardcover | grep Kindle
Yup, but I had meant in one command though, i.e.

  sed -n '/Kindle/{/Hardcover/p}'

  awk '/Kindle/ && /Hardcover/'

awk doesn't have to be complicated ;)
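For anyone who wants to see the difference between the order-sensitive pattern and the true AND variants, here is a minimal sketch. The sample titles are invented for illustration, and `grep -E` stands in for `egrep`.

```shell
# Hypothetical sample data; the titles are made up for illustration.
sample='Book A Hardcover Kindle
Book B Kindle Hardcover
Book C Hardcover
Book D Kindle'

# Order-sensitive: only matches "Hardcover" eventually followed by "Kindle".
ordered=$(printf '%s\n' "$sample" | grep -E 'Hardcover.*Kindle')

# Order-insensitive variants suggested in the thread.
either=$(printf '%s\n' "$sample" | grep -E 'Hardcover.*Kindle|Kindle.*Hardcover')
piped=$(printf '%s\n' "$sample" | grep Hardcover | grep Kindle)
awked=$(printf '%s\n' "$sample" | awk '/Kindle/ && /Hardcover/')

printf 'ordered only:\n%s\n\neither order:\n%s\n' "$ordered" "$either"
```

On this input the first pattern should miss "Book B", while the other three agree with each other.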

If you really prefer to see things in order then

    <foo sort | ...
is an alternative to

    cat foo | sort | ...
though I wouldn't particularly recommend it. Instead, the overhead of cat(1) should be avoided and the command written in the normally accepted form of

    sort foo | ...
Providing a filename rather than re-directing stdin allows the program more choice over its method of access.
I find the "cat foo | .." in the beginning and "| cat > bar" at the end form more regular.

While iterating on a command line, it keeps things uniform, rather than switching between "sort foo" and "tai64nlocal < foo" and "ffmpeg -i foo", by which I mean: different programs take their input in different ways. You can normalize by making each take standard input, and feed the chain with a "cat".

I can understand liking the regularity but in production code or web examples it shouldn't be done because of the overhead. However, your example doesn't make sense.

If sort, tai64nlocal, and ffmpeg are all happy to read stdin, so that you can do

    cat foo | sort ...
    cat foo | tai64nlocal ...
    cat foo | ffmpeg ...
then they can all have their stdin redirected instead by the shell:

    <foo sort ...
Similarly with stdout:

    ... | sort | cat >foo
becomes

    ... | sort >foo
In both cases regularity of having the filename at the start and end is preserved.
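A quick way to convince yourself that the redirected forms behave like the cat-wrapped pipeline is sketched below. The scratch directory and the names foo and bar1..bar3 are placeholders, not anything from the article.

```shell
# Scratch directory; foo and bar1..bar3 are placeholder names.
dir=$(mktemp -d)
printf 'pear\napple\nfig\n' >"$dir/foo"

cat "$dir/foo" | sort | cat >"$dir/bar1"   # cat at both ends
<"$dir/foo" sort >"$dir/bar2"              # shell redirection, names still first and last
sort "$dir/foo" >"$dir/bar3"               # filename passed as an argument

# cmp exits 0 only when the files are byte-for-byte identical.
same=no
cmp "$dir/bar1" "$dir/bar2" && cmp "$dir/bar2" "$dir/bar3" && same=yes
result=$(cat "$dir/bar2")
echo "identical output: $same"
rm -r "$dir"
```

The redirection may appear before the command name because the shell processes it wherever it sits in a simple command.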
Minor nitpick from the man page: "egrep is the same as grep -E. fgrep is the same as grep -F. Direct invocation as either egrep or fgrep is deprecated, but is provided to allow historical applications that rely on them to run unmodified."
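In practice that just means spelling the option out. A small sketch with invented input lines:

```shell
# Hypothetical input lines for illustration.
lines='Title 1 Hardcover
Title 2 Paperback
Title 3 Kindle'

# grep -E replaces egrep: extended regex, so | needs no backslash.
ere=$(printf '%s\n' "$lines" | grep -E 'Hardcover|Kindle')

# grep -F replaces fgrep: the pattern is a fixed string, not a regex.
fixed=$(printf '%s\n' "$lines" | grep -F 'Kindle')

printf '%s\n' "$ere"
```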
Indeed, I don't see why people get so upset (or pedantic) about what are, effectively, NOPs in command-lines.

However, things change if you start adding certain options to sort:-

  sort -m order.*
and

  cat order.* | sort -m
are definitely not the same thing (for most input files at least).
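To make the -m difference concrete, here is a sketch with two small pre-sorted files. The names order.a and order.b are made up to match the order.* glob.

```shell
dir=$(mktemp -d)
# Two inputs that are each already sorted, as -m requires.
printf '1\n3\n5\n' >"$dir/order.a"
printf '2\n4\n6\n' >"$dir/order.b"

# sort sees two sorted inputs and interleaves them.
merged=$(sort -m "$dir"/order.*)

# sort sees a single stream (1 3 5 2 4 6) that is not sorted,
# so the precondition for -m no longer holds.
piped=$(cat "$dir"/order.* | sort -m)

printf 'sort -m files:\n%s\n\ncat | sort -m:\n%s\n' "$merged" "$piped"
rm -r "$dir"
```

With GNU sort I'd expect the piped form to come back as the plain concatenation, since there is only one input to "merge", but that is behaviour on input that breaks the -m contract, so I wouldn't rely on it.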
Perhaps because sending GiBs through read(2) and write(2) unnecessarily isn't a NOP?
You can do that sending while you're waiting for the disk to provide said GiBs. I also believe that useless uses of cat are often acceptable for readability (many novices are not familiar with redirecting standard input, particularly not as the first thing on a command line).
Who said the GiBs need to be fetched from disk? They could already be in RAM. Even if not, it's still adding many system calls and context switches when the CPU could be doing other things; the machine isn't running just this one thing.
Who said NOPs are free? NOPs still take at least one clock cycle.
It's not that simple anymore... In modern CPUs, NOPs are discarded by the decoding units so they never occupy the execution units. If decoding BW is not saturated (and most often, it's not), NOPs are indeed "free".
Moreover if you use cat, it can be easily replaced with e.g. zcat to do whatever you want with gzipped files.
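A sketch of that swap, assuming a gzip installation whose zcat reads .gz files given as arguments (on some BSD-derived systems the equivalent is gzcat or gzip -dc). The file names are placeholders.

```shell
dir=$(mktemp -d)
printf 'pear\napple\nfig\n' >"$dir/order.txt"
gzip -c "$dir/order.txt" >"$dir/order.txt.gz"   # keep the original, write a compressed copy

# Same pipeline; only the first command changes.
plain=$(cat "$dir/order.txt" | sort)
zipped=$(zcat "$dir/order.txt.gz" | sort)

[ "$plain" = "$zipped" ] && echo "same result from both pipelines"
rm -r "$dir"
```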