by Aissen 1032 days ago
GNU parallel is great for the kind of tasks highlighted in the post. Note that, being written in Perl, it's slower than its simpler C counterpart, moreutils parallel, and that in many use cases xargs --max-procs=$(nproc) can replace it.
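For anyone who has not used that form, here is a minimal sketch of the `xargs --max-procs=$(nproc)` pattern, assuming GNU xargs and coreutils `nproc`. The scratch directory, the `out` variable, and the `job N done` payload are made up for illustration; a real job would replace the `echo`.

```shell
#!/bin/sh
# Sketch: run one job per input item, with at most one job per CPU core.
out=$(mktemp -d)   # hypothetical scratch directory for the results
export out         # exported so the sh started by xargs can see it

seq 1 8 |
  xargs -n 1 --max-procs="$(nproc)" \
    sh -c 'echo "job $1 done" > "$out/$1"' d0   # d0 fills $0, the item lands in $1

ls "$out" | wc -l  # counts the result files, one per job
```

Each job writes to its own file, so the parallel writers never interleave output.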

`xargs` has you covered in more cases than most realize.
This really is true, and you may be understating it with "most". Here are a couple of examples:

    mkdir /tmp/g
    seq 1 10 | tr \\n \\0 |
      xargs -0n2 -P4 bash -c 't=$EPOCHREALTIME; sleep $((RANDOM%5)); echo "$@" >/tmp/g/$t' d0
    cat /tmp/g/*
Another one is:

    xargs -P "$(nproc)" --process-slot-var=s sh -c 'grep X "$@" >>/tmp/g.$s' d0
    cat /tmp/g.*
You can also cobble together that second style with a custom config setup in which a command is given $s and responds with some host names, so that there might be an `ssh` in front of the `grep`, for example. That `d0` argument (a placeholder for $0) is a bit janky, and there can be shell quoting issues, of course, but then again you may not have hostile filenames or the like. Remote loadavg adaptation might be nice, but maybe you control all the remotes. Similarly, in the first example I could not get back-to-back executions of the EPOCHREALTIME assignment closer than 250 microseconds apart, so a filename collision basically will not happen, even though it probably could in theory.
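To show why the janky `d0` is there: with `sh -c 'script' args...`, the first argument after the script becomes $0 and is not part of "$@". A small sketch (the variable names `with` and `without` are just for illustration):

```shell
#!/bin/sh
# With the d0 placeholder, d0 becomes $0 and every item reaches "$@".
with=$(printf 'a\nb\nc\n' | xargs sh -c 'echo "$@"' d0)

# Without it, the first item is swallowed as $0 and dropped from "$@".
without=$(printf 'a\nb\nc\n' | xargs sh -c 'echo "$@"')

echo "with d0:    $with"     # a b c
echo "without d0: $without"  # b c
```

Any throwaway word works in place of `d0`; it only shows up in error messages as the script name.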
I also recommend checking out `xe`: https://github.com/leahneukirchen/xe

It’s like xargs with sane defaults and a couple tricks of its own.