We review how to accomplish simple, obvious tasks that are usually not addressed in HPC, starting with how to add users.

KISS – Adding a New User

KISS. Occam’s razor. Betty Crocker’s Cookbook. A little black dress. A handshake. Four gigabytes of RAM. These examples show the power of basics and simplicity. The idea of simplicity popped into my brain a couple of weeks ago when I was talking with several people about how manuals don’t include instructions for simple, obvious tasks, either because they assume you can figure them out on your own or because they treat the tasks as common knowledge. The problem with this thinking is that it assumes the world is homogeneous, and it's not.

I then started thinking about these ideas in relation to high-performance computing (HPC). People have different levels of experience and knowledge about HPC and Linux, so the world of HPC is definitely not homogeneous. One day I asked the question, “How do you add a user to HPC systems?” The answers I got back were enlightening.

Some people, primarily admins or users, simply execute the useradd command and are done, often with variations and some nuance. On the other hand, some responses were along the lines of, “I’m not sure because I’ve never done that.” To me, the overall responses meant the people who answered (including myself) had a wide range of experience.

I immediately had the idea of applying basics and simplicity to adding users to the system. At first glance it seems straightforward, but HPC systems might involve a few more steps. To begin, I want you to assume you have a new user named Sam to add to an HPC system, and I’ll walk you through the steps of adding him to the system.

Talk to the New User

Before creating a new user sam on the system, setting a password, and wishing him good luck, shouldn’t you talk to Sam and find out a bit about his experience with HPC and whether he has any requests? A simple first question is whether he has a preferred username. I have a login name that I’ve been using for years simply out of habit, but I’m flexible if it doesn’t match the policy of the HPC system administrator. Perhaps Sam has a preferred username, too. Don’t assume; ask.

To make the onboarding experience easy, however, you should ask some other questions. One that pops into my head is whether they have a shell preference. The answer to this question can reveal the level of the user’s experience. If they can answer quickly (e.g., they like Zsh), you know they have a pretty good amount of experience with Linux and possibly HPC. However, if Sam looks at you funny and isn’t quite sure what “shell” means, you can set up their account with the Bash shell and tell them not to worry about what a shell is.

You don’t want to turn your interaction into an inquisition, but some other questions come to mind. I tend to stop after the question, “Do you need to be a part of any groups?” Again, the response can give you some information to help them. If Sam says he doesn’t know, you have a better understanding of his level of experience, and you can ask his manager or perhaps a colleague whether Sam should belong to any groups. If Sam answers “no,” you have an indication that he might have a good level of experience, but it’s perhaps not complete.

These three simple questions give you the information you need to create an account for him and some indication of his level of experience. Don’t rely on these three answers alone to judge his experience, though; use them as an indicator of what information you might be able to skip and how much help he might need.
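
The two concrete answers, username and shell, can be sanity-checked before the account is created. What follows is a minimal sketch, not a site policy: the function names are hypothetical, and the login-name pattern is a conservative assumption that your own naming policy might tighten or relax.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and defaults are illustrative only.

# Succeed if the name fits a conservative login-name pattern
# (an assumption; your site's policy may differ).
valid_username() {
    local re='^[a-z_][a-z0-9_-]{0,31}$'
    [[ "$1" =~ $re ]]
}

# Succeed if the shell appears on a line of its own in a shells file
# (defaults to /etc/shells).
shell_is_listed() {
    local shell="$1" file="${2:-/etc/shells}"
    grep -qxF -- "$shell" "$file"
}

# Succeed if the login name already exists in the passwd database.
username_taken() {
    getent passwd "$1" > /dev/null
}

# Example (run by hand):
#   valid_username sam && ! username_taken sam && shell_is_listed /bin/zsh
```

If Sam asks for a shell that is not listed, that is a cue to install it (or to have a conversation) before the account is created rather than after.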

Create the User

At this point, I like to create the new account as a starting point. This is the time you will use the answers to your previous questions to get the account started. The useradd syntax is straightforward, but I like to specify several options and not rely on defaults, because they change over time. To add a new user named sam, I would use the command:

$ sudo useradd -m -d /home/sam -s /bin/bash sam

The -m option creates the user's home directory if it does not already exist, and the -d option specifies its path; in this case, I specify /home/sam, which matches the usual default (/home/<username>) but makes the choice explicit. The -s option allows you to specify a shell for the user; in this case, Bash.

The useradd command has other options to specify the user id (uid) and the group id (gid); to add the user to additional groups; or to add a comment for the user, an expiration date for the account, and more. I find that these options are not needed when a new user is first added because you can change them at any time. The command, as it stands, assumes no additional groups need to be assigned at this time.
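
After running useradd, it can be worth confirming that the home directory and shell were recorded as intended. The helper below is a small sketch with a hypothetical name; it only reads passwd-format lines and changes nothing on the system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read passwd-format lines on stdin and print the
# home directory and login shell recorded for the given user.
passwd_home_shell() {
    awk -F: -v user="$1" '$1 == user { print $6, $7 }'
}

# Example (run by hand after useradd):
#   getent passwd sam | passwd_home_shell sam
#   ls -ld /home/sam     # the directory should exist and be owned by sam
```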

Set a Temporary Password

Note that I did not specify the user's password when the user was created. I like to keep the step that sets their initial password separate because I want to force the user to change it when they log in for the first time:

$ sudo passwd sam
$ sudo passwd -e sam

The first command prompts you for the temporary password. The -e option in the second command marks the password as expired, so the user needs to change it when they log in. You can also require that passwords meet specific rules (e.g., a minimum length or a level of complexity), so you should have a policy in place before you start adding new users.
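
If you would rather not invent temporary passwords by hand, a random one can be generated. This is only a sketch under stated assumptions: the function name is hypothetical, the default length of 16 is arbitrary, and whether the result satisfies your password policy is something you need to check yourself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a random alphanumeric string of the requested
# length (default 16; intended for lengths well under 100 characters).
temp_password() {
    local len="${1:-16}"
    head -c 256 /dev/urandom | base64 | tr -dc 'A-Za-z0-9' | cut -c "1-${len}"
}

# Example (run by hand, as root):
#   pw="$(temp_password 16)"
#   echo "sam:${pw}" | chpasswd   # set the temporary password
#   passwd -e sam                 # force a change at first login
#   chage -l sam                  # review the account's password aging
```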

Add the User to Groups

If sam needs to be added to other groups, I use the usermod command. I tend to do this after the basic account is created. If I want to add sam to a group named horovod, the command would be:

$ sudo usermod -a -G horovod sam

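If sam needs to be in more than one group, add a comma after horovod and append the other group names as a comma-delimited list with no spaces. Keep the -a option: Without it, -G replaces the user's existing supplementary groups instead of appending to them.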

Note that the usermod command is very powerful for user management because you can use it to change many aspects of the user's account (e.g., the shell, the expiration date, groups, user id, and group id).
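
When several groups are involved, building the comma-delimited list and checking the result afterward can be scripted. The sketch below uses hypothetical helper names, and the groups mpi and cuda in the example are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join group names with commas, the form -G expects.
join_groups() {
    local IFS=,
    echo "$*"
}

# Hypothetical helper: succeed if the first argument appears among the
# remaining arguments (e.g., the words printed by `id -nG sam`).
in_group_list() {
    local wanted="$1" g
    shift
    for g in "$@"; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# Example (run by hand; mpi and cuda are made-up group names):
#   sudo usermod -a -G "$(join_groups horovod mpi cuda)" sam
#   in_group_list horovod $(id -nG sam) && echo "sam is in horovod"
```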

Quotas for /home

When the account for sam was created, it was assigned the /home/sam home directory. However, if you are administering a cluster with multiple users, you should probably be using storage quotas so that no one user can hog the space. In the early days of Linux clusters, I saw a few hundred gigabytes of storage added to a system (that was quite a bit of storage then). Within minutes it was over 80% used. Ever since then, I insist on assigning quotas for multiuser systems. A good place to start is with an article on the DigitalOcean site. (DigitalOcean has some of the best manuals and how-tos I’ve ever seen. They are really well written.)

Assume you have quotas configured and enabled for /home for users and groups, even if it is an external storage solution, which might have its own quota mechanism. You can use the edquota command to edit a user’s quota or the setquota command to set the user’s quota.
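
As an illustration of setquota, the sketch below prints the command for a user rather than running it, so you can review it first. The helper names and the 100GiB/110GiB limits in the example are assumptions for illustration only; setquota takes block limits in 1KiB blocks by default, and a limit of 0 means no limit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert GiB to the 1 KiB blocks that setquota
# uses for block limits by default.
gib_to_kib_blocks() {
    echo $(( $1 * 1024 * 1024 ))
}

# Hypothetical helper: print (not run) a setquota command with soft and
# hard block limits given in GiB and no inode limits (0 means unlimited).
quota_command() {
    local user="$1" soft_gib="$2" hard_gib="$3" fs="${4:-/home}"
    echo "setquota -u ${user} $(gib_to_kib_blocks "$soft_gib")" \
         "$(gib_to_kib_blocks "$hard_gib") 0 0 ${fs}"
}

# Example (run by hand; the limits are illustrative):
#   quota_command sam 100 110     # review the printed command
#   sudo quota -u sam             # after applying it, check the result
```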

Grant Access to Certain Storage

HPC systems almost always have a “fast” storage layer that is mounted on all the compute nodes. Typically, this is used as scratch space for running your code and is focused on performance; therefore, it usually is neither backed up nor subject to quotas. If it crashes and you have data on the scratch storage, you can assume it is gone. The lack of quotas also means that someone could easily use all of the storage very quickly.

How the scratch storage is configured is determined by cluster policies. A common pattern is for each user to be given a directory on the filesystem where they can create subdirectories and data files and generally use it for storage for their work. Sometimes, groups or teams will share a directory on the scratch space, but this is not as common as each user having their own directory on the scratch space.

When a user is added to the cluster, you will need to create a directory on the scratch space for them. Don’t forget to chown the directory to the user so they can read and write to it. Be sure you tell the new user about this storage and its risks. In particular, you should remind users that scratch space is not infinite, so they need to police themselves or the scratch space will fill up.
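
Creating the per-user scratch directory might look like the following sketch. The function name is hypothetical, /scratch is an assumed mount point, and mode 700 (owner-only access) is one possible policy choice rather than a requirement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a per-user directory under a scratch root,
# hand it to the user, and restrict access to the owner.
# Changing ownership to another user requires root.
create_scratch_dir() {
    local base="$1" user="$2"
    local dir="${base}/${user}"
    mkdir -p "$dir" &&
    chown "$user" "$dir" &&
    chmod 700 "$dir" &&
    echo "$dir"
}

# Example (run by hand, as root; /scratch is an assumed mount point):
#   create_scratch_dir /scratch sam
```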

Although administering scratch storage has nothing to do with adding new users to the system, one thing you can do every so often is find the top N users of the space in terms of capacity and send them an email that politely asks them to check their storage usage. If this technique isn’t effective, then perhaps once a week you can publish a list of these accounts to everyone using the system, along with the message that the storage is neither backed up nor unlimited. Sometimes this reminder will encourage these users to move, compress, or erase data that has been there a while. If abuse of the storage space gets too bad, you can also create a list of the top 10 oldest files and publish that to all users (nothing like peer pressure).
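
Finding the top N consumers can be as simple as sorting the output of du. The sketch below assumes one directory per user under an assumed /scratch mount point; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "size<TAB>path" lines (as printed by du -sk)
# on stdin and print the N largest entries, biggest first (default 10).
top_consumers() {
    sort -rn | head -n "${1:-10}"
}

# Example (run by hand; /scratch is an assumed mount point):
#   sudo du -sk /scratch/* | top_consumers 5
```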

Although I don’t recommend it, some systems set a time limit on data in scratch storage. If a file is older than that limit, a “sweeper” cron job moves it to slower storage and changes its owner to root (chown) so that no one except root can access the file. If a user really has to have that data, they will have to ask for it, and you, the admin, can take the opportunity to have a discussion about what “limited time” and “limited space” mean.
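
If you do want to see what such a sweeper would catch, a gentler first step is simply to list the candidates. The sketch below only reports files and moves nothing; the function name and the 30-day limit in the example are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list regular files under a directory whose contents
# were last modified more than the given number of days ago.
stale_files() {
    local base="$1" days="$2"
    find "$base" -type f -mtime +"$days" -print
}

# Example (run by hand; /scratch and 30 days are illustrative):
#   sudo bash -c "$(declare -f stale_files); stale_files /scratch 30"
```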

Communicate Details About the Account

The final step seems obvious, but it is the most important: Give details to the new user about their account. I like to send a list of things they need with links to more information, such as:

  • the account name;
  • the initial password, along with instructions on changing it on first login, with a link to any password policies the system might have and the command to change their password;
  • their current shell and the command to change it;
  • a list of groups to which the user has been granted access;
  • the home directory location;
  • any quota on their home directory and how much space is available, along with a link to the system policy on /home space and how to request more; and
  • the location of their high-speed scratch space and general policies on using it, with a link to the policy.

Finally, I would add some information on how to get help and any general policies that apply to the system.
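
To keep these messages consistent from one new user to the next, the list can be turned into a small template. This is a sketch only: the function name, the wording, and the placeholder for the temporary password are all hypothetical, and the policy links are left for you to fill in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a welcome message for a new account.
# Arguments: login name, shell, comma-separated groups, scratch directory.
welcome_message() {
    local user="$1" shell="$2" group_list="$3" scratch="$4"
    cat <<EOF
Account name:   ${user}
Password:       <temporary password>; you must change it at first login
                (to change it later, run: passwd)
Login shell:    ${shell} (to change it, run: chsh -s /path/to/shell)
Groups:         ${group_list}
Home directory: /home/${user} (see the /home policy for quota details)
Scratch space:  ${scratch} (not backed up; see the scratch policy)
Help:           contact the system administrators
EOF
}

# Example (run by hand):
#   welcome_message sam /bin/bash horovod /scratch/sam
```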

Summary

Revisiting, from time to time, the seemingly simple tasks that are fundamental to the operation of HPC systems is a good idea. I took the theme of simplicity and basics and applied it to adding a user to an HPC system. I hope this review helps.