HPC fundamentals
Quick on the Uptake
When I was preparing for my qualification exams in graduate school, I talked with fellow grad students about their experiences. I remember one fellow student said something like, "When in doubt, focus on the fundamentals"; that is, go back to first principles to solve problems. I have always remembered his comment, and I try to apply it when I can.
For HPC, one of the fundamentals is being able to run a command across multiple nodes in a cluster. A parallel shell is a simple but powerful tool that allows you to do just that on designated (or all) nodes, so you do not have to log in to each node and run the same command over and over again. This single tool has an infinite number of ways to be useful, but I like to use it when performing administrative tasks, such as:
- quickly discovering the status of nodes in a cluster,
- checking the versions of particular software packages on each node,
- checking the OS version on all nodes,
- checking the kernel version on all nodes,
- searching the system logs on each node (if you do not store them centrally),
- examining the CPU usage on each node,
- examining local I/O (if the nodes do any local I/O),
- checking whether any nodes are swapping,
- spot-monitoring the compute nodes, and
- debugging.
This list is just the short version; the complete list is extensive. Anything you want to do on a single node can be done on a large number of nodes using a parallel shell tool. However, for those who might be asking whether they can use parallel shells on their 50,000-node clusters, the answer is that you can, but the time skew in the results will be large enough that the results might not be useful (which is a completely different subject). Parallel shells are more practical when used on a smaller number of nodes, on specific nodes (e.g., those associated with a specific job in a resource manager), or for gathering information that varies somewhat slowly. That said, some techniques will allow you to run parallel commands on a large number of nodes.
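To make the idea concrete, here is a minimal sketch of what a parallel shell does at its core: it starts the same command on every node at once and labels each line of output with the node it came from. The function name run_on_nodes and the REMOTE_SHELL variable are hypothetical and exist only for illustration; the sketch assumes passwordless SSH to every node and has none of the fanout limits or timeouts that a real tool provides.

```shell
#!/bin/sh
# Hypothetical helper, not part of any tool: run one command on several
# nodes at the same time and prefix each output line with the node name.
# REMOTE_SHELL is an illustrative variable; it defaults to ssh.
run_on_nodes() {
    cmd="$1"
    shift
    for node in "$@"; do
        # Each remote command runs in the background so that no node
        # has to wait for another; stdin is closed off for safety.
        ${REMOTE_SHELL:-ssh} "$node" "$cmd" </dev/null 2>&1 | sed "s/^/$node: /" &
    done
    # Block until every background job has finished.
    wait
}
```

A call such as run_on_nodes "uname -r" 192.168.1.250 192.168.1.251 (the second address is made up) would query both nodes. Because the jobs run concurrently, the order of the output lines is not guaranteed.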
Among the parallel shells available, many are written in Python, which has become a very popular DevOps tool. Some of the tools are perhaps not as appropriate or useful for HPC but may be good for other tasks. The shell I typically use, and that I have found a large number of other people using, is pdsh.
Introduction to pdsh
The pdsh tool [1] is arguably one of the most popular parallel shells. It allows you to run commands on multiple nodes using only SSH [2], so the data transmission is encrypted. (It is a good practice to encrypt all data, whether it is "on the wire" or "at rest," or within the cluster or from outside the cluster.) Only the client nodes need to have ssh installed, which is pretty typical for HPC systems. However, you need the ability to SSH to any node without a password (i.e., passwordless SSH). Using ssh inside the cluster should alleviate your fears about not using passwords.
Build and Install
Building and installing pdsh is really simple if you have built code with GNU Autoconf before:
./configure --with-ssh --without-rsh
make
make install
These three lines put the binaries into /usr/local/, which is fine for testing purposes. For production work, I would put them in /opt or the like; just be sure the directory is in your path. Also, to make life easier, I put the directory on a filesystem that is shared with the compute nodes, which allows pdsh to run regardless of what system I am using.
You might notice that I used the --without-rsh option in the configure command. By default, pdsh uses rsh, which is not secure and should never be used. Notice that the available rcmd modules line at the bottom of Listing 1 (rcmd is the remote command used by pdsh) states that only ssh and exec are available. If rsh were not excluded, it would be listed here, too, and it would be the default; however, it is highly recommended that you not build pdsh with rsh because it is such a security hole.
Listing 1
pdsh Options
$ pdsh -v
pdsh: invalid option -- 'v'
Usage: pdsh [-options] command ...
-S                return largest of remote command return values
-h                output usage menu and quit
-V                output version information and quit
-q                list the option settings and quit
-b                disable ^C status feature (batch mode)
-d                enable extra debug information from ^C status
-l user           execute remote commands as user
-t seconds        set connect timeout (default is 10 sec)
-u seconds        set command timeout (no default)
-f n              use fanout of n nodes
-w host,host,...  set target node list on command line
-x host,host,...  set node exclusion list on command line
-R name           set rcmd module to name
-M name,...       select one or more misc modules to initialize first
-N                disable hostname: labels on output lines
-L                list info on all loaded modules and exit
available rcmd modules: ssh,exec (default: ssh)
If you happened to build pdsh with rsh and do not or cannot rebuild it, you can override rsh and make ssh the default by adding the following line to your .bashrc file:
export PDSH_RCMD_TYPE=ssh
Be sure to source your .bashrc file (e.g., source .bashrc) to set the environment variable. You can also log out and log back in.
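If you would rather script that change than edit the file by hand, the following sketch appends the export line only when it is not already present, so running it twice does no harm. The function name ensure_pdsh_ssh_default is hypothetical, and the sketch assumes a Bourne-style startup file, with ~/.bashrc as the default target.

```shell
#!/bin/sh
# Hypothetical helper: make sure a startup file sets PDSH_RCMD_TYPE=ssh.
# The file defaults to ~/.bashrc; pass another path as the first argument.
ensure_pdsh_ssh_default() {
    rcfile="${1:-$HOME/.bashrc}"
    line='export PDSH_RCMD_TYPE=ssh'
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$rcfile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rcfile"
    fi
}
```

After calling ensure_pdsh_ssh_default, the variable still has to be loaded into the current session by sourcing the file or logging in again.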
If for some reason you see the following when you try running pdsh, then you have built it with rsh:
$ pdsh -w 192.168.1.250 ls -s
pdsh@home4: 192.168.1.250: rcmd: socket: Permission denied
You can either rebuild pdsh without rsh or use the environment variable in your .bashrc file (or both).
First Commands
A quick test ensures that pdsh is working correctly. This simple test gets the kernel version of a remote node, which is specified by its IP address:
$ pdsh -w 192.168.1.250 uname -r
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
The -w option specifies the target node(s); in this case, the IP address of the remote node (192.168.1.250) is listed. The command to be run is uname -r. Finally, notice that the pdsh output starts with the node name (192.168.1.250), followed by the output of the command or, if something goes wrong, an error message.
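Because -w takes a comma-separated list of hosts (see Listing 1), a target list kept in a plain text file can be turned into that form with a short helper. This is only a sketch under assumptions: the function name hosts_to_list and the file nodes.txt are hypothetical, and the file is assumed to hold one hostname or IP address per line, with blank lines and lines starting with # ignored.

```shell
#!/bin/sh
# Hypothetical helper: turn a file with one node per line into the
# comma-separated form that the -w option accepts.
hosts_to_list() {
    # Drop blank lines and comment lines, then join the rest with commas.
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | paste -sd, -
}
```

With such a file in place, the kernel check above could be run on every listed node with pdsh -w "$(hosts_to_list nodes.txt)" uname -r.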
On the off chance you need to mix rcmd modules in a single command, you can specify which module to use on the pdsh command line. For example, the command below uses ssh:
$ pdsh -w ssh:laytonjb@192.168.1.250 uname -r
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
You just put the specific rcmd module, followed by a colon, before the node name (in this case, ssh).