by koala_man 3448 days ago
This is undefined behavior: while `head -1` will only output a single line, it may read more.

It happens to work on GNU head when stdin is a seekable file, because GNU head specifically rewinds the stream before exiting:

    $ (strace  -e read,write,lseek head -1 > /dev/null; cat -) < file
    ...
    read(0, "hello\nworld\n", 8192)         = 12
    lseek(0, -6, SEEK_CUR)                  = 6    # <-- here
    write(1, "hello\n", 6)                  = 6
    +++ exited with 0 +++
If not for that explicit `lseek`, `head -1` would have skipped the entire 8k buffer.

As far as I know, this is exclusive to GNU head. Neither Busybox nor OSX head will do this, so both throw away an entire buffer instead of just the first line. You can try it out:

    (busybox head -1 > /dev/null; cat -) < file
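
If neither strace nor busybox is at hand, counting the bytes left on stdin after head exits is another way to see which behavior you have. This is only a rough sketch: the scratch file and the `left` variable are my own names, and the result depends on your head implementation. On the 12-byte sample below, 6 would mean head gave the rest of the buffer back, and 0 would mean it swallowed everything.

```shell
# Scratch file with the same contents as the strace example above.
tmp=$(mktemp)
printf 'hello\nworld\n' > "$tmp"      # 12 bytes; the first line is 6 of them

# head and wc share one stdin. wc counts whatever head did not consume
# (or handed back via lseek). tr strips the padding some wc versions add.
left=$( (head -n 1 > /dev/null; wc -c) < "$tmp" | tr -d ' ' )

echo "bytes left after head: $left"
rm -f "$tmp"
```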


Interesting. Is this true of `tail -n +2` as well? (On mobile, can't test at the moment).
Tail employs a large read buffer as well, but it does not matter because you wouldn't use it in the same manner.

`tail` is the right tool for the job here. But if you wish to stick with your idiom, `read` will reliably consume a single line of input, regardless of how it is implemented:

    (read -r; cat) < file
If anyone is wondering why `read` can reliably consume a single line while `head` can't: `read` reads its input one byte at a time.

This is just as inefficient as it sounds, but it doesn't matter much in practice since you rarely read a lot with it.
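
For anyone who wants to try it on a pipe, where no rewinding is possible, here is a minimal sketch. `skip_first_line` is a name I made up for illustration. It also names a throwaway variable, because a bare `read -r` relies on bash's default `REPLY`, and POSIX `read` expects a variable name.

```shell
# skip_first_line: hypothetical helper, not from the comment above.
# Consumes exactly one line from stdin with read, then hands the rest to cat.
skip_first_line() {
    # IFS= and -r keep the line from being split or backslash-processed;
    # the variable is only a throwaway.
    IFS= read -r _first_line
    cat
}

printf 'hello\nworld\n' | skip_first_line    # prints: world
```

Because `read` stops right at the newline, `cat` sees everything after it even though stdin is not seekable.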

That always reads to eof, so it can't be used in the same way.
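
Assuming "that" refers to `tail -n +2`, a quick sketch of what reading to EOF means for the subshell idiom. The scratch file and variable names are mine.

```shell
# Scratch file with two lines.
tmp=$(mktemp)
printf 'hello\nworld\n' > "$tmp"

# On its own, tail does the whole job: it prints from line 2 onward.
first=$(tail -n +2 < "$tmp")

# But to do that it reads until EOF, so a cat sharing the same stdin
# afterwards finds nothing left to print.
leftover=$( (tail -n +2 > /dev/null; cat) < "$tmp" )

printf 'tail printed: %s\n' "$first"
printf 'left for cat: [%s]\n' "$leftover"
rm -f "$tmp"
```

So `tail` replaces the whole `(...; cat)` construct rather than slotting into it.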