GNU Parallel: A Design For Life
ML: Ole, GNU Parallel is a specialist tool for running multiple jobs at the command line at the same time. Why did you develop it, and what are your uses for it?
OT: I often get in the situation that I need to run a script on each line of a bunch of lines, so back in 2001 I made a wrapper script for make -j to run command lines in parallel. This was the first basic version of parallel that later became GNU Parallel. The full history of GNU Parallel is at http://www.gnu.org/s/parallel/history.html
Today I use GNU Parallel even for tasks that do not really need to be run in parallel, simply because it makes replacing arguments on the command line so easy. One example is emptying all tables in a database:
sql -n mysql:/// 'show tables' | parallel sql mysql:/// DELETE FROM {};
To me it has become a bit of a sport to see if the tasks I do can be done more efficiently using GNU Parallel. Once you have gotten used to it, a lot of one-off scripts can be written on a single line using GNU Parallel, and they are often even easier to read. As an example, if you wanted to convert all *.mp3 to *.ogg, running one process per CPU core on the local computer and on server2, you could simply do:
parallel --trc {.}.ogg -j+0 -S server2,: \
  'mpg321 -w - {} | oggenc -q0 - -o {.}.ogg' ::: *.mp3
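The command above relies on two replacement strings: {} stands for the input argument, and {.} for the same argument with its extension removed. As a rough illustration only, the plain shell sketch below mimics that substitution serially for simple file names that have an extension. The helper names ogg_name and preview are hypothetical and are not part of GNU Parallel, and the sketch runs nothing in parallel; it only shows which output name each input would be paired with.

```shell
#!/bin/sh
# Hypothetical helper (not part of GNU Parallel): approximates the {.}
# replacement string for simple file names by stripping the last
# extension, then appends .ogg.
ogg_name() {
    printf '%s.ogg\n' "${1%.*}"
}

# Hypothetical helper: prints the input/output pairing for each argument,
# one at a time. With GNU Parallel installed, adding --dry-run to the
# real command prints the generated commands without running them.
preview() {
    for f in "$@"; do
        printf '%s -> %s\n' "$f" "$(ogg_name "$f")"
    done
}

preview a.mp3 "b c.mp3"
# prints:
# a.mp3 -> a.ogg
# b c.mp3 -> b c.ogg
```

Quoting "$f" keeps file names with spaces intact. GNU Parallel takes care of that quoting itself, which is part of why the one-line version stays readable.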
I encourage my users to share their smartest command lines on the email list parallel@gnu.org, so new uses can be found.
ML: Like many people, I'm sure, I often forget I'm using GNU Parallel. Has this ubiquity hurt your development efforts at all?
OT: A good tool is a tool that does not get in your way, but tries to support your work by providing reasonable defaults while remaining configurable for your own needs. GNU Parallel strives to accomplish this. This often also means that you do not really think about GNU Parallel, as the tool is simply a step towards accomplishing your task.
The role that GNU Parallel plays will never be more than a supporting role, and thus the best GNU Parallel can hope for is to be an integral part of every UNIX user's toolbox. So I would love to see people mentioning GNU Parallel when someone uses xargs or while-read loops for tasks that would be better done with GNU Parallel.