Environment Modules Using Lmod
The indispensable Lmod high-performance computing tool allows users to control their build and execution environment.
One of the key tools for any cluster is environment modules, which allow you to define your user environment and the set of tools you want to use to build and execute your application. They provide a simple way to control dynamically the huge number of combinations that result from various versions of tools and libraries.
The original environment modules implementation, written in Tcl (tool command language) and C, has been around since the early 1990s. High-performance computing (HPC) sites have been using it to allow users to specify the combination of tools and libraries they want to use.
One implementation of environment modules, Lmod, is under constant development and has some unique features that are useful not only on clusters but also on your desktop if you write code and want to use a variety of tools and libraries. I use Lmod on my desktop and laptop to try new compilers or new versions of compilers, as well as new versions of libraries.
Fundamentals of Environment Modules
Programmers use a number of compilers; MPI, compute, and other libraries; and various tools to write applications. For example, someone might write Fortran with OpenACC to target GPUs using the PGI compilers, along with Open MPI, whereas another person might use the GNU 8.1 compilers with MPICH. One user might use the Portable, Extensible Toolkit for Scientific Computation (PETSc) to solve their problem, and another user might use OpenBLAS.
Environment module tools allow users and developers to specify the exact set of tools they want or need and are key to operating an effective HPC system. “Effective” can mean better performance (tools that allow your code to run as fast as possible), more flexibility (tools that match a specific case), or ease of environment configuration.
To explain a little deeper, assume you have five versions of the GNU compiler (4.8, 5.4, 6.2, 7.3, and 8.1), the latest Intel compiler, the last three community versions of the PGI compilers, three versions of MPICH (3.2.1, 3.1.4, and 3.1), and three versions of Open MPI (2.1, 3.0, and 3.1). Furthermore, assume your users need two versions of PETSc and two versions of OpenBLAS. Altogether you have 216 possible combinations (nine compilers, six MPI libraries, two PETSc versions, and two OpenBLAS versions). This is a very large number of combinations, and forcing users to configure their environment, primarily paths, to use the combination they want or need is bordering on HPC torture. At this point, you have two choices: You can reduce the number of combinations (e.g., get rid of some of the compilers and MPI libraries), or you can use environment modules so users can choose the combination of compiler, MPI library, and computational libraries they want or need.
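The count of 216 can be checked with a few lines of shell arithmetic. This is only a sketch of the calculation described above; the variable names are made up for illustration.

```shell
# Number of versions of each tool, taken from the paragraph above
gnu=5; intel=1; pgi=3        # compilers
mpich=3; openmpi=3           # MPI libraries
petsc=2; openblas=2          # computational libraries

compilers=$((gnu + intel + pgi))
mpi_libs=$((mpich + openmpi))

# Every compiler can be paired with every MPI library and library version
total=$((compilers * mpi_libs * petsc * openblas))
echo "$compilers compilers x $mpi_libs MPI x $petsc PETSc x $openblas OpenBLAS = $total"
```

Adding a single compiler version to this setup adds 24 more combinations, which is why configuring the environment by hand does not scale.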
As a user, I might build a parallel application using GNU 4.8, the default compiler for CentOS 7.3, and MPICH; however, the GNU 6.2 compilers have some unique features, so I might want to try building the same application with them. Environment modules allow me to select the tools used for production and allow me to use a different tool set for development.
The secret to environment modules is manipulating the environment variables (e.g., $PATH, $LD_LIBRARY_PATH, and $MANPATH) and making changes to these variables according to the tool combinations desired. Changing the tool set changes these environment variables accordingly. It is fairly simple conceptually and just takes a bit of coordination.
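To make the idea concrete, the following sketch mimics what loading and unloading a module does to a colon-separated variable such as $PATH. The helper functions prepend_dir and remove_dir and the /opt/gcc-8.1.0/bin directory are hypothetical; this is not how Lmod is implemented, only an illustration of the concept.

```shell
# prepend_dir VALUE DIR: print VALUE with DIR placed in front ("load")
prepend_dir() {
    if [ -z "$1" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$2:$1"
    fi
}

# remove_dir VALUE DIR: print VALUE with every entry equal to DIR removed ("unload")
remove_dir() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -v -x -F -- "$2" | paste -s -d ':' -
}

# Work on a copy so the real PATH of this shell is never touched
demo_path="/usr/bin:/bin"
loaded=$(prepend_dir "$demo_path" "/opt/gcc-8.1.0/bin")
unloaded=$(remove_dir "$loaded" "/opt/gcc-8.1.0/bin")

echo "after load:   $loaded"
echo "after unload: $unloaded"
```

After the "unload" step, the variable is back to its starting value, which is the behavior you expect from a module tool.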
You can use either the “classic” Environment Modules Tcl/C or Lmod; they both provide the same basic functionality. For the purposes of this article, I will be using Lmod, because it has some features I tend to use fairly often.
Lmod
Lmod is an environment module tool that provides simple commands for manipulating your environment for tool and library selections. A set of “modules,” which are really just text files, are written to modify $PATH, $LD_LIBRARY_PATH, and $MANPATH and possibly create other needed environment variables for the specific tool or library. A great feature of Lmod is that you can write these modules to define dependencies, as well, so that you do not use conflicting tools. By “loading” or “unloading” these modules, you can change your environment to use what you need. If you “purge” all of your modules, they are all unloaded and your $PATH, $LD_LIBRARY_PATH, and $MANPATH are returned to the values they had when you logged in to the system.
Lmod provides a complete set of tools for using and manipulating these module files. For example, you can:
- list available modules (only the modules compatible with the currently loaded modules appear) – module avail
- load and unload modules – module load [x] and module unload [x]
- purge all modules – module purge
- swap modules – module swap
- list loaded modules – module list
- query modules – module spider, module keyword, and module whatis [x]
- ask for help on modules – module help [x]
- show modules – module show [x]
You can also perform many other related tasks with other commands that are not listed here (and not commonly used).
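A short session that exercises the commands in the list might look like the following sketch. The module names (gcc/8.1 and gcc/7.1) come from the listings later in this article and will differ on your system; the guard around the commands and the lmod_demo_status variable are additions for illustration, so the script does nothing harmful where Lmod is not set up.

```shell
# Run the commands only if the module command is defined in this shell
if command -v module >/dev/null 2>&1; then
    module avail                   # modules visible right now
    module load gcc/8.1            # load a compiler
    module list                    # show what is loaded
    module swap gcc/8.1 gcc/7.1    # replace one module with another
    module whatis gcc/7.1          # one-line description from the module file
    module help gcc/7.1            # help text from the module file
    module show gcc/7.1            # what the module file would change
    module spider                  # all modules, visible or not
    module purge                   # unload everything
    lmod_demo_status="ran"
else
    echo "module command not found; Lmod is not initialized in this shell"
    lmod_demo_status="skipped"
fi
```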
One of the coolest Lmod features is its ability to handle a module hierarchy, so it will only display or list available modules that are dependent on the currently loaded modules, preventing you from loading incompatible modules. This feature can help reduce unusual errors with mismatched modules that are sometimes very difficult to diagnose. Because the module hierarchy is a very important and useful Lmod feature, I will explain it in a later section.
The first widely used environment module tool was Environment Modules Tcl/C, so called because the code is written primarily in C, with the module files in Tcl. Lmod is a different implementation of environment modules and uses module files written in Lua, a popular language in its own right and an embeddable language for applications. Lmod also retains the ability to read and use modules written in Tcl.
Module files are easy to create with the use of just a few functions for manipulating $PATH, $LD_LIBRARY_PATH, and $MANPATH values for the targeted tool, application, or library. Other environment variables also can be created easily in the module file as they are needed. The module file contains the dependencies, so Lmod can track that information to screen out incompatible modules from being listed or loaded. It is also highly recommended that you include help information in the module along with lots of documentation, such as what version of the tool the module file covers and when the module file was last modified (so you can check that the date on the file matches the modified date recorded inside it).
These module files are placed in a directory hierarchy (tree) in a specific location in the filesystem. Ideally, this filesystem is shared with the other nodes in the cluster; otherwise, you have to copy the module files in the specific hierarchy to all compute nodes and maintain them – including making sure they are all identical.
In the next section, I explain Lmod hierarchical modules, focusing on how to organize module files and how to limit the visibility of dependent module files. I will use an example from my own laptop to help illustrate this process. Additionally, I have tried to add comments about Lmod best practices, some of which I have gathered from email discussions with Robert McLay on the Lmod users mailing list and from the Lmod documentation and presentations of others.
Lmod Hierarchical Modules
One of the key capabilities of Lmod is module hierarchy. With this capability, Lmod lets you see and load only the modules that are compatible with the modules currently loaded, which helps prevent the problems that result from loading conflicting modules. It can also help you understand what combinations of modules are available, because admins might not build every possible combination. However, if you want to see all possible modules, the module spider command lists them.
Figure 1 illustrates the module hierarchy of the module files on my laptop. Anything marked (f) is a file. Everything else is a directory. At the top of the diagram, /usr/local/modulefiles is the directory where all module files are stored. This is the default for Lmod, which is fine for single systems. For clusters, the directory /usr/local would need to be NFS-exported to all of the compute nodes, or you could install Lmod to a different NFS-exported directory.
Below the root directory are three main subdirectories: Core, compiler, and mpi. These directories indicate the dependencies of the various modules. For example, everything in the compiler directory depends on a specific compiler (e.g., GCC 8.1). Everything in the mpi directory is dependent on a specific MPI and compiler combination. Nothing in the Core directory depends on anything except the operating system.
By default, Lmod reads module files in /usr/local/modulefiles/Core as the first level of available modules. It is a best practice to put any module files in this directory that do not depend on either a compiler or an MPI library, which means you also put the compiler module files in the Core directory.
The gcc subdirectory under Core is where all of the module files for the GNU family of compilers are stored. A best practice from the developer of Lmod, Robert McLay at the Texas Advanced Computing Center (TACC) at the University of Texas at Austin, is to make all subdirectories beneath Core, compiler, and mpi lowercase. In McLay’s own words, “Lmod is designed to be easy to use interactively and be easy to type. So I like lower case names where ever possible.” He continues: “I know of some sites that try very hard to match the case of the software: Open MPI, PETSc, etc. All I can say is that I’m glad I don’t have to work on those systems.”
The gcc subdirectory has a module file named 8.1, and you will have a module file corresponding to every GCC compiler version that you want to expose to users. (You could hide some versions from users by not having module files for them.) For example, if you have versions 5.1, 6.2, 7.1, and 8.1 of the GCC compilers, then you should have four module files in /usr/local/modulefiles/Core/gcc corresponding to these versions. In the case of GCC version 8.1, as shown in Figure 1, the module file 8.1 (f) is actually 8.1.lua, which contains the details for version 8.1 of the GNU compilers. The extension .lua, although not shown in Figure 1, indicates that the module file is written in Lua; however, it could be written in Tcl and called 8.1.tcl.
Notice that for a different set of compilers (e.g., those from PGI), you would create a directory named pgi under /usr/local/modulefiles/Core and then place the module files corresponding to the specific PGI compiler versions in this subdirectory.
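If you want to experiment with this layout, the following sketch creates an empty copy of the directory tree in a temporary directory. The file and directory names are inferred from the text and listings in this article, so treat the exact layout as an assumption rather than a copy of Figure 1; the module files are empty placeholders.

```shell
# Build the skeleton under a scratch directory instead of /usr/local
root=$(mktemp -d)/modulefiles

mkdir -p "$root/Core/gcc" "$root/Core/pgi"
mkdir -p "$root/compiler/gcc/8.1/mpich" "$root/compiler/gcc/8.1/openmpi"
mkdir -p "$root/mpi/gcc/8.1/mpich/3.2"

# Core: compiler module files, one per version
for v in 6.2 7.1 8.1; do : > "$root/Core/gcc/$v.lua"; done
for v in 16.10 18.4; do : > "$root/Core/pgi/$v.lua"; done

# compiler: MPI libraries built with GCC 8.1
: > "$root/compiler/gcc/8.1/mpich/3.2.lua"
: > "$root/compiler/gcc/8.1/openmpi/3.1.lua"

# Show the resulting module files relative to the root
(cd "$root" && find . -name '*.lua' | sort)
```

The mpi/gcc/8.1/mpich/3.2 directory is left empty here; it is where module files for tools built with both GCC 8.1 and MPICH 3.2 would go.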
All Lmod commands start with module followed by a subcommand and options. For example, you can find what modules are available with the module avail command (Listing 1).
Listing 1: module avail Command
$ module avail

--------------------- /usr/local/modulefiles/Core ----------------------
   gcc/6.2    gcc/7.1    gcc/8.1 (D)    pgi/16.10    pgi/18.4 (D)

---------------- /usr/local/lmod/lmod/modulefiles/Core -----------------
   lmod/6.5    settarg/6.5

  Where:
   D:  Default Module

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".

Notice that compiler modules are “available” to be loaded because they are “visible.” If you load the gcc/8.1 module and use module list,

$ module load gcc/8.1
$ module list

Currently Loaded Modules:
  1) gcc/8.1

you will see that the GCC 8.1 compiler is loaded.
The compiler module file modifies the environment variables to point to the appropriate compiler. It also tells Lmod where to find the module files for the MPI libraries that have been built with the loaded compiler. Therefore, only the MPI tools that depend on the loaded compiler are available to the user.
If you now type module avail, you get the response shown in Listing 2. Notice the two subdirectories under /usr/local/modulefiles/compiler, one for each compiler “family.” Under the GCC compiler family is another subdirectory for each version of the GCC compiler that has modules. In this case, it is version 8.1 only.
Listing 2: Viewing Loaded MPICH Module
$ module avail

--------------- /usr/local/modulefiles/compiler/gcc/8.1 ----------------
   mpich/3.2    openmpi/3.1

--------------------- /usr/local/modulefiles/Core ----------------------
   gcc/6.2    gcc/7.1    gcc/8.1 (L,D)    pgi/16.10    pgi/18.4 (D)

---------------- /usr/local/lmod/lmod/modulefiles/Core -----------------
   lmod/6.5    settarg/6.5

  Where:
   L:  Module is loaded
   D:  Default Module

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".

Under that subdirectory lie all applications dependent on the GCC 8.1 compiler. For example, Figure 1 shows two MPI library families, mpich and openmpi. Under these directories are the module files corresponding to the specific MPI library version. These modules are denoted with an (f) next to their name.
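The effect of the hierarchy on what module avail shows can be imitated with a few lines of shell. This sketch builds a tiny tree in a temporary directory and lists the name/version pairs found in a given search path; the list_visible function is hypothetical and only approximates how Lmod scans the directories in $MODULEPATH.

```shell
# Tiny module tree with empty placeholder module files
root=$(mktemp -d)
mkdir -p "$root/Core/gcc" "$root/compiler/gcc/8.1/mpich" "$root/compiler/gcc/8.1/openmpi"
: > "$root/Core/gcc/8.1.lua"
: > "$root/compiler/gcc/8.1/mpich/3.2.lua"
: > "$root/compiler/gcc/8.1/openmpi/3.1.lua"

# list_visible SEARCHPATH: print name/version for each .lua file that sits
# exactly two levels below a directory in the colon-separated search path
list_visible() {
    printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r dir; do
        [ -d "$dir" ] || continue
        (cd "$dir" && find . -mindepth 2 -maxdepth 2 -name '*.lua' |
            sed -e 's|^\./||' -e 's|\.lua$||')
    done | sort
}

# Before loading a compiler, only Core is searched
before=$(list_visible "$root/Core")

# After loading gcc/8.1, its compiler directory is searched as well
after=$(list_visible "$root/compiler/gcc/8.1:$root/Core")

echo "before: $before"
echo "after:"
echo "$after"
```

The MPI modules appear only once the compiler directory is part of the search path, which is the screening behavior described above.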
Try loading an MPI module to see how Lmod “screens out” modules that are not compatible with the loaded modules. The command module avail lists the modules that are available; however, Lmod is smart enough to show only the modules that are available on the basis of currently loaded modules. For example, you can load the mpich/3.2 module and then use module list to show the currently loaded modules.
Loading the mpich/3.2 module should modify the $PATH, $LD_LIBRARY_PATH, and $MANPATH environment variables, as well as add some environment variables specific to MPICH, which you can check by looking at the paths to the mpicc and mpif77 scripts (Listing 3). Notice that mpicc and mpif77 point to the correct scripts (you can tell by the path).
Listing 3: module list Command
$ module load mpich/3.2
$ module list

Currently Loaded Modules:
  1) gcc/8.1   2) mpich/3.2

$ which mpicc
~/bin/gcc-8.1-mpich-3.2.1/bin/mpicc
$ which mpif77
~/bin/gcc-8.1-mpich-3.2.1/bin/mpif77

An important key to making everything work correctly is in the module files. I will take a deeper look at these module files for a better understanding.
Under the Module File Hood
Everything works just great with Lmod so far. Modules can be loaded, unloaded, deleted, purged, and so on. However, Lmod executes whatever commands you put in a module file, so a poorly written module file can cause problems. To understand what is happening with module files, the GCC 8.1 compiler module (it, too, is written in Lua) is shown in Listing 4.
Listing 4: GCC 8.1 Compiler Module
-- -*- lua -*-
------------------------------------------------------------------------
-- GCC 8.1 compilers - gcc, g++, and gfortran. (Version 8.1)
------------------------------------------------------------------------

help(
[[
This module loads the gcc-8.1.0 compilers (8.1.0). The
following additional environment variables are defined:

CC   (path to gcc compiler wrapper      )
CXX  (path to g++ compiler wrapper      )
F77  (path to gfortran compiler wrapper )
F90  (path to gfortran compiler wrapper )
FC   (path to gfortran compiler wrapper )

See the man pages for gcc, g++, gfortran (f77, f90). For
more detailed information on available compiler options and
command-line syntax.
]])

-- Local variables
local version = "8.1"
local base = "/home/laytonjb/bin/gcc-8.1.0/"

-- Whatis description
whatis("Description: GCC 8.1.0 compilers")
whatis("URL: www.gnu.org")

-- Take care of $PATH, $LD_LIBRARY_PATH, $MANPATH
prepend_path("PATH", pathJoin(base,"bin"))
prepend_path("PATH", pathJoin(base,"sbin"))
prepend_path("PATH", pathJoin(base,"include"))
prepend_path("LD_LIBRARY_PATH", pathJoin(base,"lib"))
prepend_path("LD_LIBRARY_PATH", pathJoin(base,"lib64"))
prepend_path("MANPATH", pathJoin(base,"share/man"))

-- Environment Variables
pushenv("CC", pathJoin(base,"bin","gcc"))
pushenv("CXX", pathJoin(base,"bin","g++"))
pushenv("F77", pathJoin(base,"bin","gfortran"))
pushenv("FORT", pathJoin(base,"bin","gfortran"))
pushenv("cc", pathJoin(base,"bin","gcc"))
pushenv("cxx", pathJoin(base,"bin","g++"))
pushenv("f77", pathJoin(base,"bin","gfortran"))
pushenv("fort", pathJoin(base,"bin","gfortran"))
pushenv("FC", pathJoin(base,"bin","gfortran"))
pushenv("fc", pathJoin(base,"bin","gfortran"))

-- Setup Modulepath for packages built by this compiler
local mroot = os.getenv("MODULEPATH_ROOT")
local mdir = pathJoin(mroot,"compiler/gcc", version)
prepend_path("MODULEPATH", mdir)

-- Set family for this module
family("compiler")

The GNU 8.1 compilers were installed in my home directory on my laptop, so I am not too worried about where exactly the compilers are installed. However, for clusters, I would install them on an NFS-shared filesystem, such as /usr/local/ or /opt/. Installing them individually on each node is ripe for problems.
The module file can be broken down into several sections. The first part of the file is the help function, whose text is displayed when you ask for module help. After the local variables and the whatis descriptions, the next section modifies the major environment variables $PATH, $LD_LIBRARY_PATH, and $MANPATH. Notice that the function prepend_path is used to put the compiler “first” in these environment variables.
The third major section of the module file is where the specific environment variables for the compiler are defined. For this module, the variables are pretty straightforward: CC, cc, f77, F77, and so on. These variables are specific to the compiler and are defined with the pushenv function, which pushes the variables into the environment. It also uses the pathJoin function, which helps create the correct paths for these variables.
The last section is key to Lmod and involves two environment variables: $MODULEPATH_ROOT, which is read, and $MODULEPATH, which is modified. The line
local mdir = pathJoin(mroot,"compiler/gcc", version)
creates a local variable named mdir, which joins the mroot variable ($MODULEPATH_ROOT), compiler/gcc, and the compiler version (8.1). The next line, prepend_path("MODULEPATH", mdir), adds this directory to $MODULEPATH, which tells Lmod that subsequent module avail commands should also look in the compiler/gcc/8.1 subdirectory under the main module directory, corresponding to the compilers just loaded (gcc/8.1). As the writer of the modules, you control where the module files that depend on the compilers are located. This step is the key to module hierarchy. You can control what modules are subsequently available by manipulating the mdir variable. This Lmod attribute gives you great flexibility.
The very last line in the module file, the statement family("compiler"), is optional but simplifies everything for users (i.e., it is a best practice). The function family tells Lmod to which family the module belongs. A user can only have one module per family loaded at a time. In this case, the family is compiler, so only one compiler module can be loaded at a time. (You would hope all other compiler modules also use this family statement.) Adding this line helps users avoid self-inflicted problems, so I highly recommend using it.
If the GCC 8.1 compiler is loaded, then the diagram of the module layout should look something like Figure 2. The green labels indicate the compiler module that is loaded. The red labels indicate the path to the modules that depend on it (the MPI modules). Note that the MPI modules are under the compiler directory because they depend on the compiler module that is loaded.
In the previous section, I loaded the mpich/3.2 module associated with the GCC 8.1 compiler. Listing 5 shows the mpich/3.2 module file for the MPICH library built with the GCC 8.1 compiler.
Listing 5: MPICH 3.2 Module File
-- -*- lua -*-
------------------------------------------------------------------------
-- mpich-3.2 (3.2.1) support. Built with gcc-8.1 (8.1.0)
------------------------------------------------------------------------

help(
[[
This module loads the mpich-3.2 MPI library built with gcc-8.1.
compilers (8.1.0). It updates the PATH, LD_LIBRARY_PATH,
and MANPATH environment variables to access the tools for
building MPI applications using MPICH, libraries, and
available man pages, respectively.

This was built using the GCC compilers, version 8.1.0.

The following additional environment variables are also defined:

MPICC   (path to mpicc compiler wrapper   )
MPICXX  (path to mpicxx compiler wrapper  )
MPIF77  (path to mpif77 compiler wrapper  )
MPIF90  (path to mpif90 compiler wrapper  )
MPIFORT (path to mpifort compiler wrapper )

See the man pages for mpicc, mpicxx, mpif77, and mpif90. For
more detailed information on available compiler options and
command-line syntax. Also see the man pages for mpirun or
mpiexec on executing MPI applications.
]])

-- Local variables
local version = "3.2"
local base = "/home/laytonjb/bin/gcc-8.1-mpich-3.2.1"

-- Whatis description
whatis("Description: MPICH-3.2 with GNU 8.1 compilers")
whatis("URL: www.mpich.org")

-- Take care of $PATH, $LD_LIBRARY_PATH, $MANPATH
prepend_path("PATH", pathJoin(base,"bin"))
prepend_path("PATH", pathJoin(base,"include"))
prepend_path("LD_LIBRARY_PATH", pathJoin(base,"lib"))
prepend_path("MANPATH", pathJoin(base,"share/man"))

-- Environment Variables
pushenv("MPICC", pathJoin(base,"bin","mpicc"))
pushenv("MPICXX", pathJoin(base,"bin","mpic++"))
pushenv("MPIF90", pathJoin(base,"bin","mpif90"))
pushenv("MPIF77", pathJoin(base,"bin","mpif77"))
pushenv("MPIFORT", pathJoin(base,"bin","mpifort"))
pushenv("mpicc", pathJoin(base,"bin","mpicc"))
pushenv("mpicxx", pathJoin(base,"bin","mpic++"))
pushenv("mpif90", pathJoin(base,"bin","mpif90"))
pushenv("mpif77", pathJoin(base,"bin","mpif77"))
pushenv("mpifort", pathJoin(base,"bin","mpifort"))

-- Setup Modulepath for packages built by this compiler/mpi
local mroot = os.getenv("MODULEPATH_ROOT")
local mdir = pathJoin(mroot,"mpi/gcc", "8.1","mpich","3.2")
prepend_path("MODULEPATH", mdir)

-- Set family for this module (mpi)
family("mpi")

If you compare this module file to the compiler module file, you will see many similarities. The classic environment variables, $PATH, $LD_LIBRARY_PATH, and $MANPATH, are modified and certain environment variables are defined. Because you want the MPI tools that are associated with the module to be “first” in $PATH, the Lmod module command prepend_path is used again.
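Why prepending matters can be shown with a small experiment: when two directories in a search path both contain mpicc, the shell uses the one that comes first. The sketch below creates two placeholder mpicc files in a temporary directory and resolves the command against two different search paths. The directory names and the resolve function are made up for illustration, and the lookup runs in a subshell so your real $PATH is untouched.

```shell
# Two directories that each provide a (placeholder) mpicc executable
work=$(mktemp -d)
mkdir -p "$work/system/bin" "$work/gcc-8.1-mpich-3.2.1/bin"
for d in "$work/system/bin" "$work/gcc-8.1-mpich-3.2.1/bin"; do
    : > "$d/mpicc"
    chmod +x "$d/mpicc"
done

# resolve SEARCHPATH: print which mpicc the shell would pick for that path
resolve() {
    ( PATH="$1"; command -v mpicc )
}

# Without the module: only the system directory is searched
before=$(resolve "$work/system/bin")

# With the module: its bin directory is prepended, so it wins
after=$(resolve "$work/gcc-8.1-mpich-3.2.1/bin:$work/system/bin")

echo "before: $before"
echo "after:  $after"
```

Had the module appended its directory instead, the system copy would still be found first and the module would appear to have no effect.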
Toward the end of the file, examine the code for Modulepath. The local variable mdir points to the “new” module subdirectory, which is mpi/gcc/8.1/mpich/3.2. (Technically, the full path is /usr/local/modulefiles/mpi/gcc/8.1/mpich/3.2 because $MODULEPATH_ROOT is /usr/local/modulefiles.) In this subdirectory, you should place all modules that point to tools that have been built with both the gcc/8.1 compilers and the mpich/3.2 tools. Examples of module files that depend on both a compiler and an MPI tool are applications or libraries such as PETSc. Although not shown here, it is not too difficult to extend this approach to module files that depend on both a compiler and an MPI library.
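The way $MODULEPATH grows as the two modules are loaded can be traced with plain string operations. This sketch uses a stand-in variable (demo_modulepath) rather than the real $MODULEPATH and leaves out the Lmod-internal directory that appears in the listings, so it is a simplification of what Lmod actually maintains.

```shell
# The article gives /usr/local/modulefiles as the value of $MODULEPATH_ROOT
demo_root="/usr/local/modulefiles"

# Starting point: only the Core directory is searched
demo_modulepath="$demo_root/Core"

# Loading gcc/8.1 prepends the compiler-dependent directory (Listing 4)
demo_modulepath="$demo_root/compiler/gcc/8.1:$demo_modulepath"

# Loading mpich/3.2 prepends the compiler- and MPI-dependent directory (Listing 5)
demo_modulepath="$demo_root/mpi/gcc/8.1/mpich/3.2:$demo_modulepath"

# One search directory per line, highest priority first
printf '%s\n' "$demo_modulepath" | tr ':' '\n'
```

Each load adds one more level of the hierarchy to the front of the search path, which is why a PETSc module placed under mpi/gcc/8.1/mpich/3.2 becomes visible only after both the compiler and the MPI library are loaded.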
Also notice that the mpich/3.2 module uses the family() function so that the user cannot load a second MPI module. You could even have a family() function for libraries such as PETSc.
Module Usage for the Admin
In an article from a couple of years ago, I presented a way to gather logs about Tcl/C environment module usage. It was a bit of a kludge, but it did allow me to gather data. With Lmod, this ability was brought to the forefront.
Tracking module usage is conceptually fairly easy, but a number of steps are involved. Having this information can be amazingly important, because it allows you to track which tools are used the most. (I associate one tool with one module.) If you have various versions of a specific tool, it allows you to track the usage of each, so you can either deprecate an older version or justify keeping it around and maintaining it. You can also see which modules are used as a function of time, which helps you understand when people run their jobs and what modules they use.
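Once module loads are recorded somewhere, summarizing them is straightforward. The sketch below assumes a hypothetical log with one "user module timestamp" record per line; this format and the sample records are invented for illustration and are not the format that Lmod's tracking tools produce.

```shell
# Write a few made-up records to a scratch log file
log=$(mktemp)
cat > "$log" <<'EOF'
alice gcc/8.1 2018-06-01T09:00
alice mpich/3.2 2018-06-01T09:00
bob gcc/8.1 2018-06-01T10:30
bob openmpi/3.1 2018-06-01T10:30
carol gcc/7.1 2018-06-02T14:15
EOF

# Count loads per module, most used first, ties broken by module name
counts=$(awk '{ n[$2]++ } END { for (m in n) print n[m], m }' "$log" |
    sort -k1,1nr -k2,2)

echo "$counts"
```

A report like this, collected over months, is the kind of evidence you can use to decide whether an old compiler or MPI version can be retired.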
Summary
Although I have written about Lmod before, I continue to come back to it because it is so useful. It greatly helps users sort out their environment so they do not accidentally load conflicting libraries and tools. The first time you have to debug a user’s code when they have mixed MPI implementations, you will be thankful for Lmod.
Environment modules in general, and Lmod specifically, allow you to keep multiple versions of the same package on a system to service applications that have been built with older versions of a compiler, MPI, or library, or even old libraries that are needed. I even saw a somewhat recent posting to the Open MPI mailing list asking about LAM-MPI, even though it basically has been dead for a decade. You would be surprised how long applications stick around and bring their dependencies with them.
Because Lmod can read Tcl module files in addition to Lua (the preferred language), you can move easily from Tcl/C Environment Modules to Lmod. As you can see from the Lua module file examples here, the syntax is very clean and simple, making them very easy to read.
Finally, Lmod is developing tools that allow you to collect module usage and put it into a database that you can mine – which is very cool stuff, indeed.
The Author
Jeff Layton has been in the HPC business for almost 25 years (starting when he was 4 years old). He can be found lounging around at a nearby Frys enjoying the coffee and waiting for sales.