Slurm Job Scheduling System
One way to share HPC systems among several users is to use a software tool called a resource manager. Slurm, probably the most common job scheduler in use today, is open source, scalable, and easy to install and customize.
An Introduction to SymPy
The Python SymPy library for symbolic mathematics allows you to create complex mathematical expressions.
Shared Storage with NFS and SSHFS
HPC systems require shared filesystems to function effectively. Two good choices for both small and large systems are NFS and SSHFS.
Environment Modules Using Lmod
Lmod, an indispensable high-performance computing tool, lets users control their build and execution environments.
pdsh Parallel Shell
The pdsh parallel shell tool lets you run a command across multiple nodes in a cluster.
Building Containers with HPC Container Maker
Building HPC applications for production systems is never easy, especially when containers are involved, but with Python and HPC Container Maker, you can describe the container you want quickly and easily without having to worry about the details.
pyamgx – Accelerated Python Library
Sometimes your Python programs need a little more speed. The pyamgx library can help you speed up your Python code.
Autonomous File Recovery
Let users recover a deleted file without admin intervention by aliasing the rm command to mv or by writing your own script that moves the data to another location.
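As a rough illustration of the "write your own script" option, the sketch below moves a file into a holding directory instead of deleting it. The function name soft_delete and the trash_dir parameter are hypothetical, chosen only for this example; the article itself may take a different approach.

```python
import shutil
import time
from pathlib import Path


def soft_delete(path, trash_dir):
    """Move `path` into `trash_dir` instead of deleting it.

    Hypothetical helper for illustration; returns the new location.
    """
    src = Path(path)
    trash = Path(trash_dir)
    # Create the holding directory on first use.
    trash.mkdir(parents=True, exist_ok=True)
    # A nanosecond timestamp prefix makes name collisions unlikely when
    # the same filename is "deleted" more than once.
    dest = trash / f"{time.time_ns()}-{src.name}"
    shutil.move(str(src), str(dest))
    return dest
```

A user could then recover the file by moving it back from the holding directory, with no administrator involved. A real deployment would also need a policy for purging old entries.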
Linux I/O Schedulers
The Linux kernel has several I/O schedulers that can greatly influence performance. We take a quick look at I/O scheduler concepts and the options that exist within Linux.
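For a quick look at which schedulers a system offers, Linux exposes them per block device in /sys/block/<device>/queue/scheduler, with the active one shown in square brackets. The sketch below parses that line; the function names and the device name passed to read_scheduler are assumptions made for this example.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Split a sysfs scheduler line into (active, available).

    The kernel marks the active scheduler with square brackets,
    for example "mq-deadline kyber [bfq] none".
    """
    active = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_scheduler(device):
    """Read the scheduler line for a block device such as "sda" (assumed name)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())
```

Calling parse_scheduler_line("mq-deadline kyber [bfq] none") returns "bfq" as the active scheduler, along with the list of all four names.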
What to Do with System Data: Think Like a Vegan
What do you do with all of the HPC data you harvested as a lumberjack? You think like a vegan.