by ketanmaheshwari 1032 days ago
GNU Parallel has been one of my go-to tools for getting more done on the terminal. Generating test data, transferring data from one node to another with rsync, running many-task, embarrassingly parallel jobs on HPC systems, and running pipelines with simple data dependencies over hundreds of files are some of the places where I use it.

Many thanks to Ole Tange for developing this wonderful tool and for helping users on Stack Overflow sites to this day.

Shameless plug: I am developing a tutorial on GNU Parallel to be presented at the eScience conference in Cyprus this year: https://www.escience-conference.org/2023/tutorials/gnu_paral...

1 comments

I'm surprised the CPU would in any way be the bottleneck for transferring data. Is it really faster to parallelize that?
It's more that GNU Parallel has host groups in a config, so you can send the files for a job to the host where it's going to execute and bring the results back. Essentially, it can turn a local xargs-style job into any kind of remote task execution, including handling local files that need to be on the remote host.
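
For anyone curious what that looks like in practice, here is a rough sketch rather than a recommended setup. The host names, the `nodes.slf` file name, the `data/*.log` inputs, and the `RUN_REMOTE` guard are all made up for illustration. It uses a plain `--sshloginfile` list instead of host groups, and it assumes passwordless ssh to each host.

```shell
#!/bin/sh
# Hypothetical host list, one sshlogin per line; the optional "N/" prefix
# tells GNU Parallel how many jobs to run on that host at once.
cat > nodes.slf <<'EOF'
4/user@node1.example.org
8/user@node2.example.org
EOF

# --transfer copies each input file to the remote host before the job runs,
# --return fetches the named result file afterwards,
# --cleanup removes the transferred and returned files from the remote host.
# The guard keeps this sketch from contacting the placeholder hosts by accident:
# set RUN_REMOTE=1 only after editing nodes.slf to list real machines.
if command -v parallel >/dev/null 2>&1 && [ -n "${RUN_REMOTE:-}" ]; then
  parallel --sshloginfile nodes.slf \
           --transfer --return {}.count --cleanup \
           'wc -l {} > {}.count' ::: data/*.log
fi
```

Swapping `wc -l` for the real per-file command is the only change needed to go from a local xargs-style loop to the remote version; the transfer and return steps stay the same.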