Resource Management with Slurm
Slurm Job Scheduling System
squeue
To print a list of jobs in the job queue or for a particular user, use squeue. For example,
$ squeue -u akitzmiller
lists the jobs for a particular user.
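As a rough sketch of how this output might be used in a script, the following counts a user's jobs by state. The helper name summarize_states is hypothetical and not part of Slurm; the squeue options used (-h to suppress the header, -u to select a user, and -o "%T" to print only the job state) are standard, but the call is guarded so the script does nothing on a machine without Slurm.

```shell
#!/usr/bin/env bash
# Sketch: count a user's jobs by state.
# summarize_states is a hypothetical helper name, not a Slurm command.
summarize_states() {
    # Reads one job state per line (e.g., RUNNING, PENDING) on stdin
    # and prints "STATE COUNT" pairs sorted by state name.
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Only query the scheduler if squeue is actually installed.
if command -v squeue >/dev/null 2>&1; then
    # -h: no header, -u: filter by user, -o "%T": print only the job state.
    squeue -h -u "${USER:-$(id -un)}" -o "%T" | summarize_states
fi
```

Keeping the counting logic in its own function means it can be tried on sample text without a running cluster.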
sacct
The sacct command displays the accounting data for all jobs and job steps in the Slurm job accounting log or Slurm database, and you can run the command against a specific job number:
$ sacct -j 999999
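The accounting output can also be narrowed to selected fields and filtered. The sketch below, which assumes job accounting is configured on the cluster, lists the steps of a job that did not finish in the COMPLETED state. The helper name not_completed is hypothetical; the sacct options (-n for no header, -P for pipe-delimited output, --format to select fields) are standard, and the job number is the same placeholder used above.

```shell
#!/usr/bin/env bash
# Sketch: show job steps whose state is anything other than COMPLETED.
# not_completed is a hypothetical helper name, not a Slurm command.
not_completed() {
    # Expects pipe-delimited lines of the form JobID|State|ExitCode on stdin
    # and prints those whose State field is not exactly COMPLETED.
    awk -F'|' '$2 != "COMPLETED" { print $1, $2, $3 }'
}

# Only query the accounting data if sacct is actually installed.
if command -v sacct >/dev/null 2>&1; then
    # -n: no header, -P: pipe-delimited fields, --format: columns to print.
    sacct -j 999999 -n -P --format=JobID,State,ExitCode | not_completed
fi
```

Note that steps that are still running or pending are also reported, because the filter only excludes the COMPLETED state.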
Summary
A resource manager is one of the most critical pieces of software in HPC. It allows systems and their resources to be shared efficiently. Slurm in particular is remarkably flexible, allowing the creation of multiple queues according to resource types or generic resources (GPUs, in this article's example). Slurm also has built-in job accounting.
The Slurm resource manager is one of the most common job schedulers in use today for very good reasons, some of which I covered here. Prepare to be "Slurmed."