Managing cluster software packages
Package Deal
Setting up and configuring an HPC cluster is not as difficult as it used to be; modern provisioning tools allow almost anyone to get a cluster working in short order. An issue worth considering, however, is how easily you can change things once the cluster is working. For example, if you get a cluster set up and then a user comes to you and says, "I need package XYZ built with library EFG version 1.23," do you re-provision things to meet your user's needs, or is there an easy way to add and subtract software from a running cluster that is minimally intrusive?
The short answer to the latter question is "yes." Before I describe how you can organize a cluster to be more malleable, some mention of provisioning packages will be helpful. Three basic methods are offered by various toolsets:
- Image Based – A node disk image is propagated out to nodes on boot. Different "rolls" (images) can be constructed for different packages. An example is Rocks Clusters [1].
- NFS Root – Each node boots and mounts its root filesystem over NFS, except for things that change on each node (e.g., /etc, /var). This system can be run disk-less or disk-full. An example is oneSIS [2].
- RAM Disk – A RAM disk is created on each node that holds a running system image. The RAM disk system can be created in hybrid mode, wherein some files are available via NFS, and it can run disk-less or disk-full. An example is Warewulf [3]. (A good description of Warewulf can be found in the HPC Admin series on Warewulf [4].)
Regardless of the provisioning system, the goal is to make changes without having to reboot nodes. Not all changes can be made without rebooting nodes (i.e., changing the underlying provisioning); however, many application packages can be added or removed without too much trouble if some simple steps are taken.
Dump It into /opt
On almost all HPC clusters, users have a globally shared /home, and a globally shared /opt path is possible as well. NFS is used on small to medium-sized clusters to share these directories. On larger clusters, some type of parallel filesystem might be needed. In either case, a mechanism always exists to share files across the cluster.
The simplest method is to install packages in /opt. This approach has the advantage of "install once, available everywhere," although you might have to address some issues with logfiles; however, in general, this method will work with most software applications.
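As a minimal sketch of how /opt might be shared, assume the head node (called head here) exports it over NFS to a private cluster subnet; the hostname, subnet, and mount options are illustrative assumptions, not part of the article:
# On the head node, /etc/exports: export /opt to the cluster network
/opt    10.1.0.0/24(ro,async)

# On each compute node, /etc/fstab: mount the shared directory
head:/opt    /opt    nfs    ro,hard    0 0
Mounting read-only on the nodes is one reasonable choice, because all installs then happen only on the head node.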
The main issue administrators must deal with is dynamic library linking. Because packages are not installed in the standard /usr/lib path and you don't want to copy package entries into /etc/ld.so.conf.d/ on the nodes, you need a way to manage the location of the libraries. Of course, full static linking is one possibility, and using the LD_LIBRARY_PATH environment variable is another, but both of these solutions place extra requirements on users, and ultimately it comes back to the sys admin to support any problems. The preferred method is to install packages that "just work."
The solution is very simple. First, create /opt/etc/ld.so.conf.d/ and have all the packages place their library paths in conf files there, just as they would in /etc/ld.so.conf.d/. Next, you need to make a small addition to /etc/ld.so.conf on all nodes (i.e., it needs to be part of the node provisioning step so it is there after the node boots). The additional line is:
include /opt/etc/ld.so.conf.d/*.conf
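For reference, the resulting /etc/ld.so.conf on a node would then look something like the following sketch; the stock include line varies by distribution:
# /etc/ld.so.conf on each compute node
include /etc/ld.so.conf.d/*.conf
include /opt/etc/ld.so.conf.d/*.conf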
This new line tells ldconfig to search /opt/etc/ld.so.conf.d/ for additional library paths. If a package is added or removed, all that needs to happen is a global ldconfig on all the nodes to update the library paths. This step is easily accomplished with a tool like pdsh. Thus, installing a package globally on the cluster is as simple as installing it in /opt, making an entry in /opt/etc/ld.so.conf.d/, and running a global ldconfig.
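To illustrate the whole sequence, a hypothetical FFTW install might look like the following; the version, install prefix, and node names are assumptions used only for the example:
# Build and install into the shared /opt tree (run on the head node)
./configure --prefix=/opt/fftw/3.3.2/gnu4
make && make install

# Register the new library path for the dynamic linker
echo "/opt/fftw/3.3.2/gnu4/lib" > /opt/etc/ld.so.conf.d/fftw-3.3.2.conf

# Rebuild the linker cache on all nodes
pdsh -w n[0-31] ldconfig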
If, for example, you had the current version of Open MPI installed and a user wanted to try the PETSc libraries with a new version, you could easily install and build everything in /opt and have the user running new code without rebooting nodes or instructing them on the nuances of LD_LIBRARY_PATH. Now that you have a way to add and subtract packages easily from your cluster, you need to tell users how to use them.
Global Environment Modules
In a previous article, I described the Environment Modules package [5]. (I have recently noted that other Admin HPC authors have covered this topic as well [6].)
The use of Environment Modules [7] provides easy management of various versions and packages in a dynamic HPC environment. One of the issues, however, is how to keep your Modules environment when you use other nodes. If you use ssh to log in to nodes, then you have an easy way to keep (or not keep) your module environment.
With some configuration, the SSH protocol allows passing of environment variables. Additionally, Modules stores the currently loaded modules in an environment variable called LOADEDMODULES. For example, if I load two modules (fftw and mpich2) and then look at my environment, I will find:
LOADEDMODULES=fftw/3.3.2/gnu4:mpich2/1.4.1p1/gnu4
At this point, all I need to do is include this variable with all cluster SSH sessions, and then I can reload the Modules environment. To pass an environment variable via ssh, both the /etc/ssh/ssh_config and /etc/ssh/sshd_config files need to be changed. To begin, the client-side /etc/ssh/ssh_config file needs to have the following line added to it:
SendEnv LOADEDMODULES NOMODULES
(I will explain NOMODULES later.) Keep in mind that you can use the Host option in the ssh_config file to restrict the hosts that receive this variable. Similarly, the server-side /etc/ssh/sshd_config file on the nodes needs the following line added:
AcceptEnv LOADEDMODULES NOMODULES
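For example, a client-side entry that restricts the variables to cluster nodes, plus the step of restarting sshd on the nodes, might look like this sketch; the n* host pattern, node list, and service command are assumptions about the local setup:
# /etc/ssh/ssh_config on the login node: send the variables only to cluster nodes
Host n*
    SendEnv LOADEDMODULES NOMODULES

# After editing /etc/ssh/sshd_config on the nodes, restart sshd
# (older init systems use "service sshd restart" instead)
pdsh -w n[0-31] systemctl restart sshd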
Once the sshd service is restarted, future SSH sessions will transmit the two variables to remote SSH logins. Before the remote login can use modules, they must be loaded. This step can be done by adding a small piece of code to the user's .bashrc script, as shown in Listing 1.
Listing 1
Loading Modules
if [ -z "$NOMODULES" ] ; then
    # Reload each module listed in the colon-separated LOADEDMODULES variable
    LOADED=`echo -n $LOADEDMODULES | sed 's/:/ /g'`
    for I in $LOADED
    do
        if [ "$I" != "" ] ; then
            module load $I
        fi
    done
else
    export LOADEDMODULES=""
fi
As can be seen from this code, if NOMODULES is set, nothing is done, and no modules are loaded. If it is not set, each module listed in LOADEDMODULES is loaded. Note that this setup assumes the module package and module files are available to the node. Consider the example in Listing 2, in which two modules are loaded (fftw and mpich2) before logging in to another node (n0 in this case). On the first login, the modules are loaded on the remote node. On the second login, with NOMODULES set, no modules are available:
Listing 2
Setting NOMODULES
$ module list
Currently Loaded Modulefiles:
  1) fftw/3.3.2/gnu4     2) mpich2/1.4.1p1/gnu4
$ ssh n0
$ module list
Currently Loaded Modulefiles:
  1) fftw/3.3.2/gnu4     2) mpich2/1.4.1p1/gnu4
$ exit
$ export NOMODULES=1
$ ssh n0
$ module list
No Modulefiles Currently Loaded.
As was noted, an important assumption is the availability of module files to all the nodes. By placing the module files in the NFS-shared /opt, all the nodes can find the module files in one place, and they can be added or removed without changing the running image on the node.
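As an example of what such a shared module file might contain, here is a minimal sketch for the hypothetical fftw install used above; the /opt/modulefiles location is an assumption and would need to appear in MODULEPATH on the nodes. Note that no LD_LIBRARY_PATH entry is needed, because the library path is already handled by ldconfig:
#%Module1.0
## Hypothetical module file: /opt/modulefiles/fftw/3.3.2/gnu4
proc ModulesHelp { } {
    puts stderr "FFTW 3.3.2 (GNU 4.x build) installed under the shared /opt"
}
module-whatis "FFTW 3.3.2 built with GNU compilers, shared via /opt"
prepend-path PATH     /opt/fftw/3.3.2/gnu4/bin
prepend-path MANPATH  /opt/fftw/3.3.2/gnu4/share/man
setenv       FFTW_DIR /opt/fftw/3.3.2/gnu4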
Toward Cluster RPMs
The final ingredient in this recipe is to encapsulate both of these ideas in package RPMs; that is, an RPM that installs a package in /opt, makes the entry in /opt/etc/ld.so.conf.d/, and installs a module file. That way, except for a global ldconfig, the entire package could be installed across the cluster in one step. If pdsh (or similar) were required as part of the RPM installation process, the global ldconfig could even be done by the RPM (just as a local ldconfig is done by almost all RPMs).
Of course, building good RPMs takes some time, but once you have the basic "skeleton," it is not that difficult to adapt it to the configure/make/install steps of various packages. Once you have good cluster RPMs for your applications, installation and removal are simple, convenient, and cluster-wide.
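A minimal, hypothetical spec file skeleton along these lines might look like the following; the names, versions, and pdsh node list are assumptions, and the %prep/%build/%install sections are omitted for brevity:
Name:      fftw-cluster
Version:   3.3.2
Release:   1
Summary:   FFTW built for cluster-wide use under the shared /opt
License:   GPLv2+

%description
FFTW installed in the shared /opt tree, together with its linker path entry
and module file.

%files
/opt/fftw/3.3.2/gnu4
/opt/etc/ld.so.conf.d/fftw-3.3.2.conf
/opt/modulefiles/fftw/3.3.2/gnu4

%post
# Rebuild the linker cache cluster-wide if pdsh is available, otherwise locally
pdsh -w n[0-31] ldconfig || ldconfig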
Infos
[1] Rocks Clusters: http://www.rocksclusters.org/wordpress
[2] oneSIS: http://onesis.org/
[3] Warewulf: http://warewulf.lbl.gov/trac/
[4] Warewulf Cluster Manager series: http://www.admin-magazine.com/HPC/Articles/Warewulf-Cluster-Manager-Master-and-Compute-Nodes
[5] Managing the build environment with Environment Modules: http://www.admin-magazine.com/HPC/Articles/Managing-the-Build-Environment-with-Environment-Modules
[6] Lmod alternative environment modules: http://www.admin-magazine.com/HPC/Articles/Lmod-Alternative-Environment-Modules
[7] Environment Modules: http://modules.sourceforge.net/