Lead Image © Vasyl Nesterov, 123RF.com

Command-line tools for the HPC administrator

Line Items

Article from ADMIN 42/2017
Several sophisticated command-line tools can help you manage and troubleshoot HPC (or other) systems.

The HPC world has some amazing "big" tools that help administrators monitor their systems and keep them running, such as the Ganglia and Nagios cluster monitoring systems. Although they are extremely useful, sometimes it is the small tools that can help debug a user problem or find system issues. Here are a few favorites.

ldd

The introduction of shared objects [1], or "dynamic libraries," has allowed for smaller binaries, less "skew" across binaries, and reduced memory usage, among other things. Users, myself included, tend to forget that the size of a compiled binary reflects only the binary itself, not the shared objects it loads at run time.

For example, the following simple Hello World program, test1, is compiled with the PGI compilers (16.10):

PROGRAM HELLOWORLD
write(*,*) "hello world"
END

Running the ldd command against the compiled program produces the output in Listing 1. If you look at the binary, which is very small, you might think it is the complete story, but after looking at the list of libraries linked to it, you can begin to appreciate what compilers and linkers do for users today.

Listing 1: Show Linked Libraries (ldd)

$ pgf90 test1.f90 -o test1
$ ldd test1
    linux-vdso.so.1 =>  (0x00007fff11dc8000)
    libpgf90rtl.so => /opt/pgi/linux86-64/16.10/lib/libpgf90rtl.so (0x00007f5bc6516000)
    libpgf90.so => /opt/pgi/linux86-64/16.10/lib/libpgf90.so (0x00007f5bc5f5f000)
    libpgf90_rpm1.so => /opt/pgi/linux86-64/16.10/lib/libpgf90_rpm1.so (0x00007f5bc5d5d000)
    libpgf902.so => /opt/pgi/linux86-64/16.10/lib/libpgf902.so
...
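
Because ldd prints one library per line, its output is easy to post-process. The following is a minimal sketch and not part of the original article: the function names ldd_paths and ldd_missing are hypothetical helpers, and the parsing assumes the usual glibc ldd layout of "name => path (address)", with "name => not found" for a library the loader cannot resolve.

```shell
# Hypothetical helpers (not from the article): both read ldd output on stdin.

# Print the resolved path of every library that ldd mapped to a file.
# Entries without a real path (e.g., linux-vdso) are skipped.
ldd_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# Print the name of every library that ldd reported as "not found".
ldd_missing() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}
```

For example, `ldd ./test1 | ldd_missing` should print nothing when every library resolves. Assuming GNU du is available, `ldd ./test1 | ldd_paths | xargs du -Lch` gives a rough total for the shared objects that sit behind the small binary.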