by caro11ne
1529 days ago
I find this scary:

$ export LC_ALL=C
$ $tm xargs -0P1 grep $lb t < ../f |sort |md5sum
7.05user 9.88system 0:24.53elapsed 69%CPU (0avgtext+0avgdata 2344maxresident)k
0inputs+0outputs (0major+4960minor)pagefaults 0swaps
8ef2c658a70bb38438e59421231246b9 -
$ $tm xargs -0P16 grep $lb t < ../f |sort |md5sum
10.16user 36.62system 0:18.30elapsed 255%CPU (0avgtext+0avgdata 2332maxresident)k
0inputs+0outputs (0major+4980minor)pagefaults 0swaps
c8ebf840e54ec8b5a49e159eda09e63f -
$ $tm parallel -X0P16 grep $lb t < ../f |sort |md5sum
16.97user 33.94system 0:16.36elapsed 311%CPU (0avgtext+0avgdata 51624maxresident)k
0inputs+2069296outputs (0major+169409minor)pagefaults 0swaps
8ef2c658a70bb38438e59421231246b9 -

It greps for lines containing t, sorts the lines and computes a hash.

Note how "xargs -P16 grep" gives the wrong answer. The output from parallel matches exactly the lines from "xargs -P1". With "-k" the lines are even in the same order (sorting removed):

$ $tm xargs -0P1 grep $lb t < ../f |md5sum
7.03user 9.30system 0:16.32elapsed 100%CPU (0avgtext+0avgdata 2332maxresident)k
0inputs+0outputs (0major+5023minor)pagefaults 0swaps
d89b45188602c9bb08026dc2892cfa75 -
$ $tm parallel -kX0P16 grep $lb t < ../f |md5sum
18.21user 36.03system 0:10.26elapsed 528%CPU (0avgtext+0avgdata 65396maxresident)k
0inputs+2069344outputs (0major+154929minor)pagefaults 0swaps
d89b45188602c9bb08026dc2892cfa75 -

I have not analyzed the output, but I think the error is caused by the issue described here: https://mywiki.wooledge.org/BashPitfalls#Non-atomic_writes_w...

How anyone would ever use "xargs -P16 grep" is beyond me. I honestly do not care how fast I can get an answer if I cannot trust that the answer is correct.

I can see someone claimed they could build a safe parallel grep, but they do not seem to have done so: https://news.ycombinator.com/item?id=30890780#30913304 It would have been interesting to see.
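
For what it's worth, here is a minimal sketch of one way to keep parallel xargs output intact, assuming the corruption really does come from several greps writing to one shared pipe. Each invocation appends to its own file, and the files are concatenated only after everything has finished. The input files, the `work` directory and the `OUT` variable are made up for the demo; this is not the approach from the linked comment, just an illustration, and it does not preserve input order the way "parallel -k" does.

```shell
#!/bin/sh
# Hypothetical demo: per-process output files instead of one shared pipe.
export LC_ALL=C
work=$(mktemp -d)
mkdir "$work/in" "$work/out"

# Made-up input: 20 small files, each with two lines containing "t".
i=1
while [ "$i" -le 20 ]; do
    printf 'alpha %s\ntest line %s\nbeta\n' "$i" "$i" > "$work/in/file$i"
    i=$((i + 1))
done
find "$work/in" -type f -print0 > "$work/list"

# Serial reference run, as in the "xargs -0P1" case above.
serial=$(xargs -0 -P1 grep -H t < "$work/list" | sort | md5sum)

# Parallel run: every sh invocation appends to a file named after its own
# PID ($$), so no two concurrently running greps share an output stream.
OUT="$work/out" xargs -0 -n5 -P4 sh -c 'grep -H t "$@" >> "$OUT/$$"' sh < "$work/list"

# Combine the per-process files only after all jobs are done.
parallel=$(cat "$work/out"/* | sort | md5sum)
lines=$(cat "$work/out"/* | wc -l)

echo "serial:   $serial"
echo "parallel: $parallel"
echo "lines:    $lines"
rm -rf "$work"
```

If the two hashes agree, the parallel run produced the same set of lines as the serial one. The cost is extra disk I/O for the temporary files, which is roughly the trade-off visible in the "outputs" column of the parallel timings above.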
[1] https://unix.stackexchange.com/questions/449224/how-can-i-ge...