by thaumasiotes 2311 days ago
I mean, 'cat' does something so simple (apply the identity function to the input) that it has no need to be reusable because there's no point using it in the first place. If you have input, processing it with cat just means you wasted your time to produce something you already had.

The point of cat(1), short for concatenate, is to feed a pipeline multiple concatenated files as input, whereas shell stdin redirection only allows you to feed a shell a single file as input.

This is actually highly flexible, since cat(1) recognizes the “-” argument to mean stdin, and so you can `cat a - b` in the middle of a pipeline to “wrap” the output of the previous stage in the contents of files a and b (which could contain e.g. a header and footer to assemble a valid SQL COPY statement from a CSV stream).
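
To make the `cat a - b` idea concrete, here is a minimal sketch. The file names, the `people` table, and the use of `\.` as the end-of-data line (as psql expects for COPY ... FROM STDIN) are assumptions for illustration, not something from the comment above.

```shell
#!/bin/sh
# Hypothetical header/footer files and table name, for illustration only.
tmp=$(mktemp -d)
printf 'COPY people (name, age) FROM STDIN WITH (FORMAT csv);\n' > "$tmp/header.sql"
printf '\\.\n' > "$tmp/footer.sql"   # writes the two characters \. plus a newline

# The printf stands in for an earlier pipeline stage that emits CSV rows.
# "-" tells cat to read stdin at that position, between the two files.
wrapped=$(printf 'alice,30\nbob,25\n' | cat "$tmp/header.sql" - "$tmp/footer.sql")

printf '%s\n' "$wrapped"
rm -rf "$tmp"
```

In a real pipeline the result would presumably be piped on to something like psql rather than captured in a variable.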

But that is a case where you have several filenames and you want to concatenate the files. The work you're using cat to do is to locate and read the files based on the filename. If you already have the data stream(s), cat does nothing for you; you have to choose the order you want to read them in, but that's also true when you invoke cat.

This is the conceptual difference between

    pipeline | cat       # does nothing
and

    pipeline | xargs cat # leverages cat's ability to open files
Opening files isn't really something I think of cat as doing in its capacity as cat. It's something all the command line utilities do equally.
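
A small sketch of that difference, assuming a pipeline whose output is a list of file names. `list_names` and the two temp files are hypothetical stand-ins.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'first\n'  > "$tmp/a.txt"
printf 'second\n' > "$tmp/b.txt"

# Stand-in for a pipeline stage that prints one file name per line.
list_names() { printf '%s\n' "$tmp/a.txt" "$tmp/b.txt"; }

# Plain cat copies stdin through: the result is the names themselves.
as_data=$(list_names | cat)

# xargs turns the names into arguments, so cat opens and concatenates the files.
# (xargs splits on whitespace, so this assumes the names contain none.)
as_files=$(list_names | xargs cat)

printf 'via cat:\n%s\n' "$as_data"
printf 'via xargs cat:\n%s\n' "$as_files"
rm -rf "$tmp"
```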

    pipeline | cat    # does nothing
This is actually re-batching stdin into line-oriented write chunks, IIRC. If you write a program to manually select(2) + read(2) from stdin, then you’ll observe slightly different behaviour between e.g.

    dd if=./file | myprogram
and

    dd if=./file | cat | myprogram
On the former, select(2) will wake your program up with dd(1)’s default obs (output block size) worth of bytes in the stdin kernel buffer; whereas, on the latter, select(2) will wake your program up with one line’s worth of input in the buffer.

Also, if you have multiple data streams, by using e.g. explicit file descriptor redirection in your shell, à la

    (baz | quux) >&4
...then cat(1) won’t even help you there. No tooling from POSIX or GNU really supports consuming those streams, AFAIK.

But it’s pretty simple to instead direct the streams into named FIFOs, and then concatenate those with cat(1).
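
A rough sketch of the FIFO approach, assuming two producers that each write to their own named pipe. The FIFO names and the `printf` producers are hypothetical placeholders for real commands.

```shell
#!/bin/sh
tmp=$(mktemp -d)
mkfifo "$tmp/left" "$tmp/right"

# Hypothetical producers, run in the background. Each blocks when opening
# its FIFO until a reader opens the other end.
printf 'from baz\n'  > "$tmp/left" &
printf 'from quux\n' > "$tmp/right" &

# cat opens the FIFOs in argument order: it drains left to EOF, then right.
joined=$(cat "$tmp/left" "$tmp/right")
wait

printf '%s\n' "$joined"
rm -rf "$tmp"
```

One consequence of this design is that the second producer stays blocked until cat has finished with the first FIFO, so the streams are concatenated in order, not interleaved.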

> Also, if you have multiple data streams, ...then cat(1) won’t even help you there.

I've been thinking about this more from the perspective of reusing code from cat than of using the cat binary in multiple contexts. Looking over the thread, it seems like I'm the odd one out here.

In addition to what the other commenters pointed out about cat being able to concatenate, even using cat as the identity function is useful. Just as the number zero is useful.