Resource Management with Slurm

Slurm Job Scheduling System

squeue

To list the jobs in the queue, either all of them or only those that belong to a particular user, use squeue. For example,

$ squeue -u akitzmiller

lists only the jobs that belong to the user akitzmiller.
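On a busy cluster, a per-state summary is often easier to read than the raw listing. The following is a minimal sketch, not part of the original example: the helper name count_by_state is hypothetical, and it assumes squeue is on your path. It relies on the standard squeue options -h (no header), -u (user), and -o "%T" (print the full job state name).

```shell
# Hypothetical helper: count jobs per state.
# Reads one state name per line on stdin and prints "STATE COUNT" lines.
count_by_state() {
    sort | uniq -c | awk '{ printf "%s %d\n", $2, $1 }'
}

# Only query the scheduler when squeue is actually installed.
if command -v squeue >/dev/null 2>&1; then
    # -h drops the header, -u filters by user, -o "%T" prints the state name
    squeue -h -u "${USER:-$(id -un)}" -o "%T" | count_by_state
fi
```

Because the helper only reads standard input, you can also feed it saved squeue output to check how it behaves before pointing it at a live queue.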

sacct

The sacct command displays accounting data for all jobs and job steps recorded in the Slurm job accounting log or Slurm database. To restrict the output to a specific job, pass its job ID with the -j option:

$ sacct -j 999999
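If you only care about the steps that went wrong, you can ask sacct for machine-readable output and filter it. The sketch below is an illustration under a few assumptions: the helper name not_completed is hypothetical, the job ID 999999 is the placeholder from the example above, and job accounting is enabled on your cluster. It uses the sacct options -n (no header), -P (pipe-delimited fields), and --format to select the JobID, State, and ExitCode fields.

```shell
# Hypothetical helper: show job steps whose state is not COMPLETED.
# Expects pipe-delimited lines of the form JobID|State|ExitCode on stdin.
not_completed() {
    awk -F'|' '$2 != "COMPLETED" { printf "%s %s %s\n", $1, $2, $3 }'
}

# Only query the accounting data when sacct is actually installed.
if command -v sacct >/dev/null 2>&1; then
    # -n drops the header, -P emits pipe-delimited fields without a trailing bar
    sacct -j 999999 -n -P --format=JobID,State,ExitCode | not_completed
fi
```

Note that the filter also reports steps that are still running or pending, because it prints every step in any state other than COMPLETED.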

Summary

A resource manager is one of the most critical pieces of software in HPC because it allows systems and their resources to be shared efficiently. Slurm is remarkably flexible, allowing you to create multiple queues according to resource types or generic resources (e.g., GPUs in this article). Slurm also has built-in job accounting.

The Slurm resource manager is one of the most common job schedulers in use today for very good reasons, some of which I covered here. Prepare to be "Slurmed."
