by masklinn 3299 days ago
Of course you could just pipe from /dev/zero which will easily do 10GB/s on every machine.

Just tried it,

    cat /dev/zero|pv >/dev/null 
gives me roughly 2/3 the speed of yes (5 GB/s vs 7 GB/s). Plus, yes gives you an arbitrary string instead of only zeroes.
Tobik already wrote it, but here's today's Useless Use of Cat Award: http://porkmail.org/era/unix/award.html

Well, sometimes using cat is faster, as I discovered to my surprise recently:

https://news.ycombinator.com/item?id=14414610

But in that case one could probably say that GNU awk is just not very good at input handling (as mawk doesn't appear to benefit from an extra "cat").

That's because you're using cat and a pipe. Try this instead:

  pv > /dev/null < /dev/zero
For anyone who is like me and finds it uncomfortable that things are now out of order, note that you can still put the input redirection in front:

  </dev/zero pv >/dev/null
Or even this:

  pv </dev/zero >/dev/null

which is a common way of doing it (for any command with any inputs and outputs, not just the above ones), i.e.:

  command < input_source > output_dest

All three pv invocations (the one here and the two above) work. Why they are equivalent becomes clearer once you know that the redirections are set up by the shell, using kernel facilities, before the command (such as pv) starts running; the command never sees the redirection syntax. In all three cases it simply starts with its stdin and stdout already connected to the respective source and destination. That is precisely why a command works the same whether it reads from the keyboard, a file, or a pipe, and whether it writes to the screen, a file, or a pipe.
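To make that concrete, here is a small sketch. count_args is a made-up helper (not a standard tool) that reports how many arguments it received and how many bytes arrived on its stdin. If the explanation above holds, the position of the redirection should make no difference, and the redirection should never show up as an argument:

```shell
# count_args is a hypothetical helper for illustration only:
# it prints its argument count and the number of bytes read from stdin.
count_args() {
    bytes=$(wc -c | tr -d ' ')
    echo "args=$# bytes=$bytes"
}

tmp=$(mktemp)
printf 'abc' > "$tmp"

r1=$(count_args < "$tmp")          # redirection after the command
r2=$(< "$tmp" count_args)          # redirection before the command
r3=$(printf 'abc' | count_args)    # a pipe instead of a file

rm -f "$tmp"

echo "$r1"
echo "$r2"
echo "$r3"
```

All three lines should read the same: zero arguments, three bytes. The helper has no idea whether its stdin was a file or a pipe, or where on the command line the redirection was written.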

I believe /dev/zero writes data one byte at a time; that's likely the reason why.

[edit] That's actually inaccurate (and badly expressed), see comments below.

/dev/zero doesn't "write" anything in the sense that yes writes, since it's a character device and not a program. The Linux kernel's implementation of /dev/zero does not write one byte at a time.
You're right, of course. Actually, I believe the kernel will simply provide as many bytes as the read() requested, so the speed should mostly depend on how you access /dev/zero. I.e., the user above was using cat, and I think with dd and a proper block size it'd be much faster.
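A rough sketch of the block size point, assuming a dd that accepts plain byte counts for bs (the sizes here are arbitrary picks, not tuned values). Both commands move the same 1 MiB out of /dev/zero, but the first asks for it in 2048 small reads and the second in a single large one:

```shell
# Same total (1048576 bytes), very different number of read()/write() calls.
# dd's statistics go to stderr, which is discarded here.
small=$(dd if=/dev/zero bs=512 count=2048 2>/dev/null | wc -c | tr -d ' ')
large=$(dd if=/dev/zero bs=1048576 count=1 2>/dev/null | wc -c | tr -d ' ')

echo "bs=512 x 2048:   $small bytes"
echo "bs=1048576 x 1:  $large bytes"
```

This only shows that the byte counts match; it does not measure anything. To compare throughput, drop the 2>/dev/null, raise the counts, and write to /dev/null with of=/dev/null, since dd reports its own transfer statistics on stderr when it finishes.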
I was under the impression that cat automatically used a sane size for reading. Now that I think of it, I cannot think of a source, other than to point out my own anecdotal experience.

When I was writing Raspbian images to SD cards to use on a Raspberry Pi, cat and dd finished within a few seconds of each other on an operation longer than a minute. Since then I have been using cat where I could, but I didn't think to write down the numbers.

For something somewhat related, see parts of this thread:

https://news.ycombinator.com/item?id=14414610

Note that cat + GNU awk was faster than just GNU awk, but mawk was faster still (reading a not entirely small file).

And in a similar vein to the gp comparing GNU and OpenBSD, note that OpenBSD cat is a little more convoluted than the simplest possible implementation (at least to my eyes):

https://github.com/openbsd/src/blob/master/bin/cat/cat.c

https://github.com/coreutils/coreutils/blob/master/src/cat.c

(That is, GNU "cat" and OpenBSD "cat" are less different than GNU "yes" and OpenBSD "yes").

yes gives you repeated data though, not just zeros, so it seems more useful here. As others have pointed out, it also uses fewer syscalls than reading from /dev/zero.
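For anyone who hasn't used the string argument, a minimal sketch (the text 'hello world' is just a placeholder). yes repeats whatever it is given, one copy per line, until the reader goes away:

```shell
# yes repeats its argument (default "y") forever, one copy per line;
# head closes the pipe after three lines, which ends yes.
lines=$(yes 'hello world' | head -n 3)
echo "$lines"

count=$(printf '%s\n' "$lines" | wc -l | tr -d ' ')
first=$(printf '%s\n' "$lines" | head -n 1)
```

For the throughput test discussed in this thread you would pipe it into pv instead of head, e.g. yes 'hello world' | pv >/dev/null, assuming pv is installed.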