by asicsp 1836 days ago
>In most Unix utilities, “long lines are silently truncated”. This is not acceptable in a GNU utility.

Would be interested to know examples for this.

4 comments

Old versions of sort: http://man.cat-v.org/unix_7th/1/sort

Unix utilities in the old days weren't all that great. One example I've posted about here before is how mv would refuse to move files across filesystem boundaries (because it's not a 'move', it's a 'copy and delete', so you had to use cp and rm instead).

Thanks.

>BUGS Very long lines are silently truncated.

And I didn't know that about `mv`; good to know.

It is good practice to use cp + (verify) + rm instead of mv, especially when the data is important.

Or `rsync --remove-source-files`.

mv doesn’t do that anymore. Is there any reason to use cp and rm instead of mv today?

Paranoia... you could do cp, then compare the md5sum/sha1sum of src and dst, then rm.
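
The cp + verify + rm sequence suggested above could look roughly like the following. This is only a sketch: `verified_move` and `sha256_of` are hypothetical helper names, SHA-256 stands in for the md5/sha1sum mentioned above, and reading the destination back right after the copy may be served from the page cache rather than from the disk itself.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_move(src, dst):
    """Copy src to dst, compare checksums, and only then delete src."""
    copied = shutil.copy2(src, dst)  # returns the path actually written
    if sha256_of(src) != sha256_of(copied):
        os.remove(copied)  # drop the bad copy, keep the source
        raise OSError(f"checksum mismatch copying {src} to {copied}")
    os.remove(src)
    return copied
```

The source is removed only after the comparison succeeds, so a failed copy leaves the original untouched.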
Just a few weeks ago, I fixed a problem in an internal tool (that still also runs on an ancient Solaris 10 box) by switching from `awk` to `gawk`, since the former would briefly whine on stderr and quit whenever a line in its input contained too many fields. A recent change had managed to exceed that undocumented limit.

In this case there's at least no silent breakage involved, but the badly written shell script that called it did not bother to check for that condition, and a fair number of heads were scratched for a while as a consequence.
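
For illustration, the missing check could be as small as the sketch below, written in Python rather than shell. `run_checked` is a hypothetical wrapper, not part of the tool described above; it assumes a non-zero exit status is the signal that the child gave up.

```python
import subprocess


def run_checked(cmd, input_text):
    """Run cmd with input_text on stdin and fail loudly on a non-zero exit."""
    result = subprocess.run(cmd, input=input_text, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface whatever the child wrote to stderr instead of dropping it.
        raise RuntimeError(
            f"{cmd[0]} exited with status {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    return result.stdout


# Example call (assumes an awk binary is on PATH):
# field_counts = run_checked(["awk", "{ print NF }"], "a b c\n")
```

With this in place, an awk that quits on too many fields raises an exception at the call site instead of handing truncated output to the next stage.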

Yep. When we did Solaris installs, GNU coreutils and friends were high on the list. Incidentally, nice to see sunfreeware.com is still going!

Some current vendor versions of awk have astonishingly short line length limitations. See e.g. (especially the latter):

https://github.com/samtools/samtools/pull/1165
https://github.com/samtools/htscodecs/pull/22

which work around limitations in the default system awk on Solaris/OpenIndiana/whatever the remnants of SunOS are called these days…

The whole paragraph:

> Avoid arbitrary limits on the length or number of any data structure, including file names, lines, files, and symbols, by allocating all data structures dynamically. In most Unix utilities, “long lines are silently truncated”. This is not acceptable in a GNU utility.

... goes against MISRA C, which is certainly preferable in the domain I work in, embedded systems, because dynamic allocations all over the place are a recipe for CVEs.
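
The two policies being contrasted can be sketched as follows. This is an illustration in Python for brevity, not C, and not code from any real utility; `read_line_fixed` and `read_line_dynamic` are hypothetical names, and the `limit` and `chunk` values are arbitrary.

```python
import io


def read_line_fixed(stream, limit):
    """Fixed-capacity buffer: bytes beyond `limit` are dropped silently."""
    buf = bytearray()
    while True:
        byte = stream.read(1)
        if byte in (b"", b"\n"):
            return bytes(buf)
        if len(buf) < limit:
            buf += byte
        # Otherwise the byte is discarded and no diagnostic is printed.


def read_line_dynamic(stream, chunk=8):
    """Growing buffer: keep appending chunks until the newline is seen."""
    buf = bytearray()
    while True:
        piece = stream.readline(chunk)  # at most `chunk` bytes per call
        buf += piece
        if not piece or piece.endswith(b"\n"):
            return bytes(buf.rstrip(b"\n"))


data = b"a" * 40 + b"\n"
print(len(read_line_fixed(io.BytesIO(data), 16)))   # 16: the tail is lost
print(len(read_line_dynamic(io.BytesIO(data))))     # 40: the whole line
```

The fixed version has a bounded memory footprint, which is the property MISRA-style rules care about; the dynamic version never loses data but its memory use depends on the input.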

GNU is about making software for the end user; that's the opposite of what MISRA is about.

https://www.cvedetails.com/vulnerability-list/vendor_id-72/G...

A significant number of these CVEs are related to dynamic memory allocation (double-free, use-after-free).

Probably not all of them are the result of that piece of advice, and some of those memory allocations were probably necessary, but since this class of errors is common in C/C++, I believe it is really not a good idea to encourage people to point the gun right at their feet.

On a side note, please explain to me how this is end-user oriented in a system where the convention is that a program ends silently when everything went smoothly:

> In error checks that detect “impossible” conditions, just abort. There is usually no point in printing any message [...] Explain the problem with comments in the source.

If everything went smoothly, the program likely had some useful output (e.g. grep, awk, sed). If it failed, then I'd just run `coredumpctl gdb`? (And ... abort isn't silent? Here's what I get if something aborts here: https://imgur.com/a/69eF73w)