New Release of Lmod Environment Modules System
One of the key tools for any cluster is environment modules, which let you define your user environment and the set of tools you need or want to build your application. Modules also pair well with a resource manager (job scheduler), because a job can re-create the same environment for running the application that you used to build it.
One environment module tool, Lmod, is under constant development and has some unique features. I've written about Lmod before, but version 6.0 was recently announced with some new tools that make it worth reviewing.
Fundamentals of Environment Modules
Programmers use a number of compilers, libraries, MPI libraries/tools, and other tools to write applications. For example, someone might code in Fortran with OpenACC, targeting GPUs, whereas another person might use PETSc to solve their problem. Tools that allow users and developers to specify the set of tools they want or need are key to operating an effective HPC system. “Effective” can mean better performance (choosing the tools that allow your code to run as fast as possible), more flexibility (user choice of tools that match their specific case), or ease of configuration of the environment for specific tools.
For example, assume you have three versions of the GNU compiler – 4.8, 4.9, and 5.1 – and the latest Intel and PGI compilers, along with the latest MPICH (3.1.4) and OpenMPI (1.8.5). Altogether you have 10 possible combinations (five compilers, two MPI libraries). At this point you have two choices: You can reduce the number of combinations (e.g., get rid of some of the compilers and perhaps one of the MPI libraries), or you can use environment modules so users can choose the combination of compiler and MPI library they want or need.
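To make the arithmetic concrete, here is a minimal Python sketch that enumerates the compiler/MPI pairs from the example above. The module-style names (such as intel/latest and pgi/latest) are placeholders I made up for illustration, not names from a real installation.

```python
from itertools import product

# Placeholder names for the five compilers and two MPI libraries in the example.
compilers = ["gnu/4.8", "gnu/4.9", "gnu/5.1", "intel/latest", "pgi/latest"]
mpi_libraries = ["mpich/3.1.4", "openmpi/1.8.5"]

# Every compiler can be paired with every MPI library.
combinations = list(product(compilers, mpi_libraries))

for compiler, mpi in combinations:
    print(f"{compiler} + {mpi}")

print(len(combinations))  # 5 compilers x 2 MPI libraries = 10
```

Adding one more compiler or MPI library makes the list grow multiplicatively, which is why handing users a way to pick a combination scales better than maintaining every pairing by hand.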
As a user, I might build a parallel application using GNU 4.8 and MPICH; however, the GNU 5.1 compilers have some unique features, so I might want to try building the same application with them. Environment modules allow users to select the tools used for production, while also allowing them to use a different tool set for development.
The secret to environment modules is manipulating the environment variables. Users can manipulate environment variables such as $PATH, $LD_LIBRARY_PATH, and $MANPATH and make changes to these variables according to the tool combinations desired. Changing the tool set changes these environment variables accordingly. It's fairly simple conceptually, but it's not always easy in practice.
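The idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how Lmod or any other module tool is implemented; the install prefix /opt/gcc/5.1 and the helper names are hypothetical. "Loading" a tool prepends its directories to the path-like variables, and "unloading" removes them again.

```python
SEP = ":"  # separator used in path-like variables on Linux

# Hypothetical install prefix and the directories a GNU 5.1 "module" would add.
GCC_PREFIX = "/opt/gcc/5.1"
GCC_CHANGES = {
    "PATH": GCC_PREFIX + "/bin",
    "LD_LIBRARY_PATH": GCC_PREFIX + "/lib64",
    "MANPATH": GCC_PREFIX + "/share/man",
}


def load_tool(env, changes):
    """Return a copy of env with each directory prepended to its variable."""
    new_env = dict(env)
    for name, directory in changes.items():
        parts = [p for p in new_env.get(name, "").split(SEP) if p and p != directory]
        new_env[name] = SEP.join([directory] + parts)
    return new_env


def unload_tool(env, changes):
    """Return a copy of env with each directory removed from its variable."""
    new_env = dict(env)
    for name, directory in changes.items():
        parts = [p for p in new_env.get(name, "").split(SEP) if p and p != directory]
        if parts:
            new_env[name] = SEP.join(parts)
        else:
            new_env.pop(name, None)  # drop variables that end up empty
    return new_env


env = {"PATH": "/usr/bin:/bin"}
loaded = load_tool(env, GCC_CHANGES)
print(loaded["PATH"])
restored = unload_tool(loaded, GCC_CHANGES)
print(restored == env)
```

The part that is "not always easy in practice" is everything this sketch leaves out: tools that set many variables, tools that depend on one another, and keeping the changes reversible when users load and unload in arbitrary order.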
Lmod
Lmod is an environment module tool that provides simple commands for manipulating your tool selection. For example, you can list loaded and available modules, load and unload modules, purge all modules, swap modules, query modules, ask for help on modules, show modules, and perform many other tasks related to modules. Other options aren't used as frequently but are there if you need them.
One of the coolest features of Lmod is its ability to handle a module hierarchy, so that Lmod displays only the modules that depend on the modules you have already loaded, preventing you from loading incompatible modules. This feature can help reduce unusual errors with mismatched modules that are sometimes very difficult to diagnose. I'll explain more about module hierarchy in a later section, because it is a very important feature in Lmod.
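As a rough mental model of that behavior, consider the following Python sketch. It is a toy, not Lmod's implementation: the HIERARCHY table, the "Core" label, and the function names are assumptions made for illustration. Compilers are always visible, and the MPI libraries built for a given compiler become visible only after that compiler is loaded.

```python
# Toy hierarchy: each key is a loaded module (or "Core" for the always-visible
# level); each value lists the modules that become visible because of it.
HIERARCHY = {
    "Core": ["gnu/4.8", "gnu/5.1"],
    "gnu/4.8": ["mpich/3.1.4", "openmpi/1.8.5"],
    "gnu/5.1": ["mpich/3.1.4"],  # assume OpenMPI was not built with GNU 5.1
}


def available(loaded):
    """Return the modules visible given the list of already loaded modules."""
    visible = list(HIERARCHY["Core"])
    for module in loaded:
        visible.extend(HIERARCHY.get(module, []))
    return visible


def load_module(loaded, module):
    """Return a new loaded list, refusing modules that are not visible."""
    if module not in available(loaded):
        raise ValueError(f"{module} is not available with {loaded or 'nothing'} loaded")
    return loaded + [module]


loaded = load_module([], "gnu/5.1")
print(available(loaded))
loaded = load_module(loaded, "mpich/3.1.4")
print(loaded)
```

In this toy, asking for openmpi/1.8.5 after loading gnu/5.1 raises an error instead of silently mixing a library with a compiler it was not built for, which is the class of mismatch the hierarchy is meant to prevent.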
One of the first widely used environment module tools is Environment Modules TCL/C, so-called because the code is written primarily in C and the modules in TCL. Lmod retains the ability to read and use modules written in TCL, but it adds the ability to read and use modules written in Lua, a popular language in its own right and a very embeddable language for applications.
Module files written in either TCL or Lua tell Lmod how to change environment variables for a particular tool. You place these files in a directory hierarchy and add a couple of commands to each module file so that Lmod knows the tool dependencies.
Next, I discuss Lmod Hierarchical Modules and explain how to organize module files and how to limit the visibility of dependent module files. I'll use an example from my own cluster to help illustrate this process. Additionally, I've tried to add comments about Lmod best practices, some of which I’ve gathered from email discussions with Dr. McLay on the Lmod-users mailing list and others from Lmod documentation and presentations. I hope these help with your Lmod deployment.