| Just today, I used xargs instead of spending a lot of time building a batching script in Python.
I wanted to launch a bunch of processes in a queue but run only 10 of them in parallel at any time. Here is a skeleton of what I came up with:

find "$(pwd)" -mindepth 1 -maxdepth 1 -type d -name ".zfs" -prune -o -type d -print0 | xargs -0 -P 2 -I {} echo {}

where:
  $(pwd) is the starting point for the directory listing
  -mindepth 1 makes sure the starting directory itself is not listed
  -maxdepth 1 makes sure the listing does not recurse into subdirectories
  -type d restricts matches to directories
  -name ".zfs" -prune makes find skip .zfs (snapshot directories)
  -print0 separates results with a NUL character instead of a newline; plain -print prints one result per line
  xargs -0 reads that NUL-separated input, so spaces or newlines in directory names do not split an entry
  -P 2 runs up to two processes in parallel (use -P 10 for ten at a time)
  -I {} replaces {} in the command that follows with each item read from stdin
echo {} then runs as echo dir1, then echo dir2, and so on.

That's just an example to show that we can do a lot with standard Unix tools before bringing in external sophistication for data-related tasks. |
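
To see the same pattern with a real per-directory step in place of echo, here is a minimal sketch. It assumes a throwaway directory tree created with mktemp, and the directory names (dir1, "dir 2") and the printf "job" are made up for illustration; swap in whatever command you actually need to run.

```shell
#!/bin/sh
# Hypothetical demo tree; these directory names are made up for illustration.
demo_root=$(mktemp -d)
mkdir "$demo_root/dir1" "$demo_root/dir 2" "$demo_root/.zfs"

# Same find expression as the skeleton, but with up to 10 parallel jobs.
# Each path is handed to the inline script as "$1" instead of being pasted
# into the script text. The printf is a placeholder for the real work.
result=$(
  find "$demo_root" -mindepth 1 -maxdepth 1 \
       -type d -name ".zfs" -prune -o -type d -print0 |
    xargs -0 -P 10 -I {} sh -c 'printf "processed %s\n" "$(basename "$1")"' _ {} |
    LC_ALL=C sort   # parallel jobs finish in any order, so sort for stable output
)

printf '%s\n' "$result"
rm -rf "$demo_root"
```

This prints one line for dir1 and one for "dir 2" (the space survives intact), and nothing for .zfs. Passing {} as a separate argument to sh -c, rather than embedding it inside the quoted script, keeps odd characters in a directory name from being interpreted as shell code.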