![© victoroancea, 123RF.com](/var/ezflow_site/storage/images/archive/2013/14/managing-cluster-software-packages/po-23822-123rf-victoroancea_123rf-pakete_resized.png/95945-1-eng-US/PO-23822-123RF-victoroancea_123RF-Pakete_resized.png_medium.png)
© victoroancea, 123RF.com
Managing cluster software packages
Package Deal
Setting up and configuring an HPC cluster is not as difficult as it used to be; several good provisioning tools allow almost anyone to get a cluster working in short order. An issue worth considering, however, is how easily you can change things once the cluster is working. For example, if you get a cluster set up and then a user comes to you and says, "I need package XYZ built with library EFG version 1.23," do you re-provision the cluster to meet your user's needs, or is there an easy, minimally intrusive way to add and remove software on a running cluster?
The short answer is "yes": software can be added to and removed from a running cluster with little disruption. Before I describe how you can organize a cluster to be more malleable, a brief look at provisioning tools will be helpful. The various toolsets offer three basic methods:
- Image Based – A node disk image is propagated out to nodes on boot. Different "rolls" (images) can be constructed for different packages. An example is Rocks Clusters [1].
- NFS Root – Each node boots and installs everything as NFS root except for things that change for each node (e.g., `/etc`, `/var`). This system can be run disk-less or disk-full. An example is oneSIS [2].
- RAM Disk – A RAM disk is created on each node that holds a running system image. The RAM disk system can be created in hybrid mode, wherein some files are available via NFS, and it can run disk-less or disk-full. An example is Warewulf [3]. (A good description of Warewulf can be found in the HPC Admin series on Warewulf [4].)
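To make the NFS-root split concrete, here is a minimal Python sketch of how a tool might decide which paths come from the shared NFS root and which stay private to each node. The directory list is taken from the `/etc` and `/var` example above; the names `PER_NODE_DIRS` and `is_node_local` are hypothetical and do not correspond to the API of oneSIS or any other provisioning tool.

```python
from pathlib import PurePosixPath

# Directories that differ per node (assumption: taken from the /etc, /var
# example in the text; a real deployment would likely list more).
PER_NODE_DIRS = (PurePosixPath("/etc"), PurePosixPath("/var"))


def is_node_local(path: str) -> bool:
    """Return True if path lies in a per-node directory, False if it would
    be served from the shared NFS root."""
    p = PurePosixPath(path)
    # Match the directory itself or anything beneath it, but not
    # look-alike names such as /etcetera.
    return any(p == d or d in p.parents for d in PER_NODE_DIRS)


if __name__ == "__main__":
    for candidate in ("/etc/hostname", "/var/log/messages", "/usr/bin/gcc"):
        where = "per-node" if is_node_local(candidate) else "shared NFS root"
        print(f"{candidate}: {where}")
```

Comparing path components rather than string prefixes keeps a directory such as `/etcetera` from being mistaken for part of `/etc`.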
Regardless of the provisioning system, the goal is to make changes without having to reboot nodes. Not all changes can be made without rebooting nodes (e.g., changing the underlying
...