If all software is built to protect against all possible future anticipated use cases, your software will take longer to make, perform worse, and be more likely to have bugs.
If all software is built only to solve the problem at hand, it will take less time to develop, be less likely to have bugs, and perform better.
It isn't clear that coding for reuse is going to get you a net win, especially since computing platforms (the actual hardware) are always evolving, so reusing code some years later can be sub-optimal for that reason alone.
There's a middle ground. E.g. the classic Unix 'cat' (ignoring all the command line switches) does something really simple and reusable, so it makes sense to make sure it does the Right Thing in all situations.
I mean, 'cat' does something so simple (apply the identity function to the input) that it has no need to be reusable because there's no point using it in the first place. If you have input, processing it with cat just means you wasted your time to produce something you already had.
The point of cat(1), short for concatenate, is to feed a pipeline multiple concatenated files as input, whereas shell stdin redirection only allows you to feed a shell a single file as input.
This is actually highly flexible, since cat(1) recognizes the “-” argument to mean stdin, so you can `cat a - b` in the middle of a pipeline to “wrap” the output of the previous stage in the contents of files a and b (which could contain, e.g., a header and footer to assemble a valid SQL COPY statement from a CSV stream).
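A minimal sketch of that wrapping trick, in case it helps. The file names, the table, and the CSV rows are all made up for illustration, and the header/footer follow what I understand psql expects for COPY ... FROM STDIN, so treat it as an assumption rather than a tested import script:

```shell
# Hypothetical header/footer files and data, for illustration only.
tmp=$(mktemp -d)
printf 'COPY items (id, name) FROM STDIN WITH (FORMAT csv);\n' > "$tmp/header"
printf '\\.\n' > "$tmp/footer"   # writes the two characters \. as the end-of-data marker

# The CSV stream arrives on stdin; "-" marks where it lands between the two files.
wrapped=$(printf '1,apple\n2,pear\n' | cat "$tmp/header" - "$tmp/footer")
printf '%s\n' "$wrapped"

rm -rf "$tmp"
```

The printf stage stands in for whatever earlier pipeline stage produces the CSV.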
But that is a case where you have several filenames and you want to concatenate the files. The work you're using cat to do is to locate and read the files based on the filename. If you already have the data stream(s), cat does nothing for you; you have to choose the order you want to read them in, but that's also true when you invoke cat.
This is the conceptual difference between
pipeline | cat # does nothing
and
pipeline | xargs cat # leverages cat's ability to open files
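To make the difference concrete, here is a small sketch. The function list_names is a hypothetical stand-in for "pipeline", and the two files are invented for the example:

```shell
tmp=$(mktemp -d)
printf 'alpha\n' > "$tmp/a.txt"
printf 'beta\n' > "$tmp/b.txt"

# Stand-in for "pipeline": it emits file names, one per line.
list_names() { printf '%s\n' "$tmp/a.txt" "$tmp/b.txt"; }

passthrough=$(list_names | cat)     # the names come out unchanged
contents=$(list_names | xargs cat)  # the names are opened and their contents emitted

printf '%s\n' "$contents"
rm -rf "$tmp"
```

In the first case cat only copies the names through; in the second, xargs turns them into arguments, so cat opens and concatenates the files.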
Opening files isn't really something I think of cat as doing in its capacity as cat. It's something all the command line utilities do equally.
This is actually re-batching stdin into line-oriented write chunks, IIRC. If you write a program to manually select(2) + fread(3) from stdin, then you’ll observe slightly different behaviour between e.g.
dd if=./file | myprogram
and
dd if=./file | cat | myprogram
On the former, select(2) will wake your program up with dd(1)’s default obs (output block size) worth of bytes in the stdin kernel buffer; whereas, on the latter, select(2) will wake your program up with one line’s worth of input in the buffer.
Also, if you have multiple data streams, by using e.g. explicit file descriptor redirection in your shell, à la
(baz | quux) >&4
...then cat(1) won’t even help you there. No tooling from POSIX or GNU really supports consuming those streams, AFAIK.
But it’s pretty simple to instead target the streams into explicit fifo files, and then concatenate those with cat(1).
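Roughly like this, as a sketch: the fifo names are arbitrary, and the two printf commands stand in for whatever producers you actually have:

```shell
tmp=$(mktemp -d)
mkfifo "$tmp/first" "$tmp/second"

# Opening a fifo for writing blocks until a reader opens it,
# so the producers run in the background.
printf 'from first\n' > "$tmp/first" &
printf 'from second\n' > "$tmp/second" &

# cat drains the fifos in argument order, like any other files.
joined=$(cat "$tmp/first" "$tmp/second")
wait

printf '%s\n' "$joined"
rm -rf "$tmp"
```

The order is still fixed by the argument list: the second producer stays blocked until cat has finished with the first fifo.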
In addition to what the other commenters pointed out about cat being able to concatenate, even using cat as the identity function is useful. Just as the number zero is useful.
For sure, if you can apply a small amount of effort for a high probability of easy re-usability, do it. But if you start going off into weird abstract design land to solve a problem you don't have yet, while it might be fun, probably you should stop. At least if it is a real production thing you are working on.
I guess it depends a bit on the shape of your abstract design land. Sometimes it can give you hints about what your API should look like, or what's missing.
Remember when the CIA contracted with Netezza to improve their Predator drone targeting, and Netezza then went and reverse-engineered some software from their former business partner IISi and shipped that?
IISi’s lawyers claimed on September 7, 2010 that “Netezza secretly reverse engineered IISi’s Geospatial product by, inter alia, modifying the internal installation programs of the product and using dummy programs to access its binary code [ … ] to create what Netezza’s own personnel referred to internally as a “hack” version of Geospatial that would run, albeit very imperfectly, on Netezza’s new TwinFin machine [ … ] Netezza then delivered this “hack” version of Geospatial to a U.S. Government customer (the Central Intelligence Agency) [ … ] According to Netezza’s records, the CIA accepted this “hack” of Geospatial on October 23, 2009, and put it into operation at that time.”
Reality is always more absurd, government agencies remain inept and corrupt even when shrouded in secrecy to cover up their missteps, and by the way, Kubernetes now flies on the F-16.
I think one of the big problems in software development is that nobody measures the half-life of our assumptions, that is, the amount of time it takes for half of the original assumptions to no longer hold.
In my limited experience, the half-life of assumptions in software could easily be as low as around one year, meaning that in 5 years only 1/32 of the original architecture would still make sense if we do not evolve it.
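The arithmetic behind that 1/32, as a quick sketch: with a half-life of h years, the fraction still holding after t years is (1/2)^(t/h). The one-year half-life is only the rough estimate above, not a measured value:

```shell
half_life=1   # years; rough estimate, not a measurement
horizon=5     # years without evolving the architecture

# Fraction of assumptions still valid: (1/2)^(horizon / half_life)
remaining=$(awk -v h="$half_life" -v t="$horizon" 'BEGIN { printf "1/%d", 2 ^ (t / h) }')
printf 'after %s years: %s of the original assumptions still hold\n' "$horizon" "$remaining"
```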