« Previous 1 2 3 Next »
HPC fundamentals
Quick on the Uptake
Specifying Hosts
A very common use of pdsh
is to use the WCOLL
environment variable that points to the file with a list of hosts. If you do not specify a hostlist on the command line, pdsh
will use the hostlist indicated by WCOLL
. As an example, I created a pdsh
subdirectory with a file named hosts
that lists the default hosts:
$ mkdir PDSH $ cd PDSH $ vi hosts $ more hosts 192.168.1.4 192.168.1.250
For this specific case, only two nodes are listed: 192.168.1.4
and 192.168.1.250
. The first node is the cluster head node, and the second is the first test compute node. The hosts in the file can be specified in a comma-separated list, if desired. Just be sure not to put a blank line at the end of the file because pdsh
will think it is a host and try to connect to it. You can put the WCOLL
environment variable in the .bashrc
file:
export WCOLL=/home/laytonjb/PDSH/hosts
As before, you can source .bashrc
, or you can log out and log back in to activate it after you create it.
Ideally, your /home
directory is shared with all nodes in the cluster, which means you only have to edit .bashrc
once; otherwise, you have to log in to every node and update the local .bashrc
. (What a pain!)
Specifying a list of nodes can be done in several ways, which I will not list here. The pdsh
site [1] discusses virtually all of them, some of which are pretty handy. The simplest way to specify the nodes is on the command line with the -w
option, separated by commas:
$ pdsh -w 192.168.1.4,192.168.1.250 uname -r 192.168.1.4: 2.6.32-696.30.1.el6.x86_64 192.168.1.250: 2.6.32-431.11.2.el6.x86_64
Or, you can use a range of hosts on the command line:
$ pdsh -w host[1-11] uname -r $ pdsh -w host[1-4,8-11] uname -r
In the first case, pdsh
expands the host range to host1
, host2
, etc., through host11
. In the second case, the hostlist expands to host1
, host2
, host3
, host4
, host8
, etc., through host11
. The pdsh
website has more information on hostlist expressions [3]. Being able to specify a range of hosts is very convenient if you have a large number of nodes.
Even if you use a range of hosts, specifying the exact host targets can result in a long command. Another option is to have pdsh
read the hosts from a file other than the WCOLL
environment variable:
$ pdsh -w ^/tmp/hosts uptime 192.168.1.4: 15:51:39 up 8:35, 12 users, load average: 0.64, 0.38, 0.20 192.168.1.250: 15:47:53 up 2 min, 0 users, load average: 0.10, 0.10, 0.04 $ more /tmp/hosts 192.168.1.4 192.168.1.250
This command tells pdsh
to take the hostnames from the file listed after -w ^
(no space between the ^
and the file name), which is /tmp/hosts
here.
Using files that contain lists of nodes is a great way to examine subsets of nodes, especially for larger systems. For example, you could create a list of hosts in one file for the first rack, a second list in a different file for the second rack, and so on. A command could then be run that uses the specific file that lists the hosts in a specific rack.
The subsets do not have to be divided on a rack basis. The lists can be based on node type or cluster function, such as putting all I/O nodes or all login nodes into a single file; you are not limited as to what you can do with these files. This method allows you to use pdsh
with very large systems by breaking it into subclusters that are more manageable; then, you can run the same command on each of the subclusters.
The following example shows how to use several host files at once:
$ more /tmp/hosts 192.168.1.4 $ more /tmp/hosts2 192.168.1.250 $ pdsh -w ^/tmp/hosts,^/tmp/hosts2 uname -r 192.168.1.4: 2.6.32-696.30.1.el6.x86_64 192.168.1.250: 2.6.32-431.11.2.el6.x86_64
Notice that both nodes respond to the command. Hosts can also be excluded from a list:
$ pdsh -w -192.168.1.250 uname -r 192.168.1.4: 2.6.32-696.30.1.el6.x86_64
The hyphen in front of 192.168.1.250
means it is excluded from the command, so the output is only from 192.168.1.4
. Nodes listed in a file can also be excluded from the pdsh
command:
$ pdsh -w -^/tmp/hosts2 uname -r 192.168.1.4: 2.6.32-696.30.1.el6.x86_64
By using the hyphen in front of the file name, the nodes listed in the file are excluded. In this case /tmp/hosts2
, which contains 192.168.1.250
, is not included in the output.
Alternatively, you can use the -x
option combined with a hostname or a list of hostnames to be excluded:
$ pdsh -x 192.168.1.4 uname -r 192.168.1.250: 2.6.32-431.11.2.el6.x86_64 $ pdsh -x ^/tmp/hosts uname -r 192.168.1.250: 2.6.32-431.11.2.el6.x86_64 $ more /tmp/hosts 192.168.1.4
The first command excludes 192.168.1.4
from the default list of hosts. The second command excludes the host in /tmp/hosts
from the output.
The combinations of these options are very powerful, allowing you to set up files with hosts for specific subsets of nodes, such as racks, or to create node lists on the basis of node function, such as login or storage. If nodes are down, you can exclude them, if you like. The point is, you should think about common node combinations and include them in specific files. Simple bash scripts could also allow you to specify combinations of node files.
More Useful Options
The previous examples are fairly simple. To better understand the breadth of options to be used with pdsh
, I will show you some different commands. Listing 2 illustrates how to run a complex pdsh
command. Before getting into the details, notice that each output line from pdsh
lists the node followed by the response from the command.
Listing 2
Complex pdsh Command 1
$ pdsh `cat /proc/cpuinfo | grep bogomips` 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.250: bogomips : 5624.23 192.168.1.250: bogomips : 5624.23 192.168.1.250: bogomips : 5624.23 192.168.1.250: bogomips : 5624.23
Notice that the entire command is in backquotes, meaning the entire command is run on each node. This includes the first part, cat /proc/cpuinfo
, the output of which is piped to the second part of the command, grep bogomips
. Using the backquotes allows you to run complex commands on the target nodes.
For the particular command in Listing 2, the value of bogomips
differs for each node because the nodes are different: The first node has eight cores (four cores and four Hyper-Threading cores), whereas the second node has four cores. Consequently, the bogomips
value should be different for the two nodes.
Listing 3 is a variation of the command in Listing 2. Notice that the entire command is not contained in backquotes; rather, only the first part of the command is contained in single quotes. When you run this pdsh
command, the part in the quotes is run first on all targeted nodes. After the command returns, the output is piped through the second part of the command, grep bogomips
, which is executed on the node where pdsh
was run. The point of these command variations is to show that you need to be careful how you construct a command so you understand the output.
Listing 3
Complex pdsh Command 2
$ pdsh 'cat /proc/cpuinfo' | grep bogomips 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.4: bogomips : 6998.13 192.168.1.250: bogomips : 5624.23 192.168.1.250: bogomips : 5624.23 192.168.1.250: bogomips : 5624.23 192.168.1.250: bogomips : 5624.23
A very important item to note is that pdsh
does not guarantee that the output is returned in any certain order. If 20 nodes are targeted in the list, the output from pdsh
will not necessarily start with node 1 and increase incrementally to node 20. Listing 4 is an example of a vmstat
command run on two nodes. The command should run twice on each node in one-second intervals.
Listing 4
Commingled Output
$ pdsh vmstat 1 2 192.168.1.4: procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu----- 192.168.1.4: r b swpd free buff cache si so bi bo in cs us sy id wa st 192.168.1.4: 1 0 0 30198704 286340 751652 0 0 2 3 48 66 1 0 98 0 0 192.168.1.250: procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu----- 192.168.1.250: r b swpd free buff cache si so bi bo in cs us sy id wa st 192.168.1.250: 0 0 0 7248836 25632 79268 0 0 14 2 22 21 0 0 99 0 0 192.168.1.4: 1 0 0 30198100 286340 751668 0 0 0 0 412 735 1 0 99 0 0
At first glance, it looks like the first lines of output are from the first node, but then the output from the second node creeps in. A command with multiple output lines cannot guarantee the order of the output. If you really have to run a command across the target nodes with multiple lines of output, the only real choice is to put all of the output into a file and edit it to rearrange the lines so they are in the correct order.
A word of caution, though: If the command produces multiple output lines, for example three lines, it is possible the output lines from a single node will arrive out of order; for example line 3 could arrive before line 2. Ideally having a tag on each line of output would allow it to be reassembled much more easily.
Another technique for using pdsh
is to run scripts on each node. For example, in previous articles on processor and memory metrics [4] and on process, network, and disk metrics [5], scripts were designed to create the metrics. With some simple modifications, these scripts can return a single line of output. Putting the scripts in a central location for each node allows pdsh
to run the script on the target nodes.
Modules
Earlier I mentioned how pdsh
uses rcmd
modules by default to access nodes. The developers have extended this to create modules for various specific situations. The pdsh
modules page [6] lists other modules that can be built as part of pdsh
, which currently includes:
rcmd/rsh
rcmd/ssh
rcmd/mrsh
(usesmunge
[7] authentication)rcmd/xcpu
misc/genders
(node selection usinglibgenders
)misc/nodeupdown
(usesnodeupdown
library)misc/machines
(provides an option for a flat-file list of hosts)- Slurm (list of targets built from
SLURM_JOBID
or-j jobid
) misc/dshgroup
(list of targets built fromdsh
-style "group" files)netgroup
(list of targets to be built from netgroups)
These modules allow pdsh
to do specific things. For example, the Slurm module allows you to run the command only on nodes specified by currently running Slurm jobs. When pdsh
is run with the Slurm module, it will read the list of nodes from the SLURM_JOBID
environment variable. You can also run pdsh
with the -j jobid
option, and it will get the list of hosts from the job ID specified.
« Previous 1 2 3 Next »
Buy this article as PDF
(incl. VAT)