Resource Management with Slurm

Slurm Job Scheduling System

Configuring Slurm

Slurm is very flexible, and you can configure it for almost any scenario. The first configuration file is slurm.conf (Listing 1).

Listing 1

slurm.conf

#
# Example slurm.conf file. Please run configurator.html
# (in doc/html) to build a configuration file customized
# for your environment.
#
#
# slurm.conf file generated by configurator.html.
#
# See the slurm.conf man page for more information.
#
ClusterName=compute-cluster
ControlMachine=slurm-ctrl
#
SlurmUser=slurm
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
StateSaveLocation=/var/spool/slurm/ctld
SlurmdSpoolDir=/var/spool/slurm/d
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
ProctrackType=proctrack/cgroup
PluginDir=/usr/lib/slurm
ReturnToService=1
TaskPlugin=task/cgroup
# TIMERS
SlurmctldTimeout=300
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=30
Waittime=0
#
# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory,CR_CORE_DEFAULT_DIST_BLOCK,CR_ONE_TASK_PER_CORE
FastSchedule=1
#
# LOGGING
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurmd.log
JobCompType=jobcomp/none
#
# ACCOUNTING
JobAcctGatherType=jobacct_gather/cgroup
AccountingStorageTRES=gres/gpu
DebugFlags=CPU_Bind,gres
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=localhost
AccountingStoragePass=/var/run/munge/munge.socket.2
AccountingStorageUser=slurm
#
# COMPUTE NODES (PARTITIONS)
GresTypes=gpu
DefMemPerNode=64000
NodeName=linux1 Gres=gpu:8 CPUs=80 Sockets=2 CoresPerSocket=20 ThreadsPerCore=2 RealMemory=515896 State=UNKNOWN
PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP

This file offers a large number of configuration options, and the man pages explain them in detail, so the following is just a quick review.

  • Notice the use of ports 6817 and 6818, the default ports on which slurmctld and slurmd listen.
  • SchedulerType=sched/backfill tells Slurm to use the backfill scheduler.
  • In several places, GPUs are considered in the configuration:
AccountingStorageTRES=gres/gpu
GresTypes=gpu
NodeName=linux1 Gres=gpu:8 ...

The term gres, capitalized or not, stands for "generic resource." Slurm allows you to define and schedule resources beyond the defaults of run time, number of CPUs, and so on; a generic resource can be a GPU, local disk space, or almost anything else you can dream up.
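
To see which generic resources each node actually advertises, you can ask sinfo for a node-oriented report with a custom output format (a quick sketch; %N prints the node name and %G its configured gres):

$ sinfo -N -o "%N %G"

For the node defined in Listing 1, you would expect a line such as linux1 gpu:8.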

Two very important lines in the configuration file define the node names with their configuration and a partition for the compute nodes. For this configuration file, these lines are,

NodeName=slurm-node-0[0-1] Gres=gpu:2 CPUs=10 Sockets=1 CoresPerSocket=10 ThreadsPerCore=1 RealMemory=30000 State=UNKNOWN
PartitionName=compute Nodes=ALL Default=YES MaxTime=48:00:00 DefaultTime=04:00:00 MaxNodes=2 State=UP DefMemPerCPU=3000

Notice that you can use an abbreviation for a range of nodes (slurm-node-0[0-1]). The NodeName line tells Slurm how many generic resources each node contains (in this case, two GPUs); it also specifies the number of CPUs, sockets, cores per socket, threads per core, and the amount of memory available (e.g., 30,000MB, or 30GB, here).
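
Once the slurmd daemons are running, you can check that Slurm sees this hardware by querying a node directly (a sketch, assuming the node name slurm-node-00 from the lines above):

$ scontrol show node slurm-node-00

The output includes fields such as CPUTot, RealMemory, and Gres, which should match the values configured here.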

Because slurm.conf selects the cgroup-based plugins (ProctrackType=proctrack/cgroup and TaskPlugin=task/cgroup), Slurm also reads a cgroup.conf file:

CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup"
ConstrainCores=yes
ConstrainDevices=yes
ConstrainRAMSpace=yes
#TaskAffinity=yes

This file tells Slurm to use cgroups to constrain each job to its allocated cores (ConstrainCores), devices (ConstrainDevices), and memory (ConstrainRAMSpace), which lets you, the Slurm admin, keep jobs from using more cores or memory than they requested.
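
A quick way to see the core constraint in action is to run a job that reports its own CPU affinity (a sketch; Cpus_allowed_list in /proc/self/status is standard on Linux):

$ srun -n1 --cpus-per-task=2 grep Cpus_allowed_list /proc/self/status

With ConstrainCores=yes, the job should report only the two cores it was allocated rather than every core in the node.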

In the file gres.conf, you can configure the generic resources, which in this case are GPUs:

Name=gpu File=/dev/nvidia0 CPUs=0-4
Name=gpu File=/dev/nvidia1 CPUs=5-9

The first line says that the first GPU is associated with cores 0-4 (the first five cores, or half the cores in the node). The second line defines the second GPU for cores 5-9, or the second half of the cores. When submitting a job to Slurm that uses these resources, you can specify them with a simple option, for example,

$ srun --gres=gpu:1

which submits the job requesting a single GPU.
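
The same request works in a batch script. The following sketch is only an illustration (the job name, core, memory, and time values are made up), but the #SBATCH directives map directly onto the partition, gres, and memory settings defined earlier:

#!/bin/bash
# Request one GPU from the compute partition defined above; the core,
# memory (in MB), and wall-clock values are chosen for illustration.
#SBATCH --job-name=gpu-test
#SBATCH --partition=compute
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=5
#SBATCH --mem=15000
#SBATCH --time=01:00:00

# Report which GPU the job was actually given.
nvidia-smi

Submit the script with sbatch and monitor it with squeue.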

Common Slurm Commands

Slurm comes with a range of commands for administering, using, and monitoring a Slurm configuration. A number of tutorials detail their use, but for completeness, I will look at a few of the most common commands.

sinfo

The all-purpose command sinfo lets users discover how Slurm is configured. Listing 2 shows the availability, time limit, node counts (allocated/idle/other/total), and node list for the p100 partition.

Listing 2

sinfo

$ sinfo -s
PARTITION  AVAIL  TIMELIMIT  NODES(A/I/O/T)  NODELIST
p100       up     infinite   4/9/3/16        node[212-213,215-218,220-229]
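
For more detail than the summary view, sinfo can also report node by node (a sketch using standard options: -N for a node-oriented listing, -l for the long format, and -p to restrict output to one partition):

$ sinfo -N -l -p p100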
