TUIs, a Smoke-Jumping Admin’s Best Friend

Sysadmins are like smokejumpers who parachute into fires and fight them until they are out, or at least under control. When you jump into the fire, you have only the tools you brought with you.

Slurm Job Scheduling System

One way to share HPC systems among several users is to use a software tool called a resource manager. Slurm, probably the most common job scheduler in use today, is open source, scalable, and easy to install and customize.

An Introduction to SymPy

The Python SymPy library for symbolic mathematics lets you create and manipulate complex mathematical expressions.
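As a small illustration of what symbolic expressions look like in practice, the sketch below builds an expression, differentiates it, integrates a second one, and solves a quadratic. It assumes SymPy is installed; the specific expressions are chosen only for illustration and are not taken from the article.

```python
import sympy as sp

# Declare a symbol; SymPy treats it as an unknown, not a number.
x = sp.symbols("x")

# Build an expression and differentiate it symbolically.
expr = sp.sin(x) * sp.exp(x)
derivative = sp.diff(expr, x)

# Integrate a simple polynomial term (no constant of integration is added).
integral = sp.integrate(2 * x, x)

# Solve x**2 - 4 = 0 for x.
roots = sp.solve(x**2 - 4, x)

print(derivative)
print(integral)
print(roots)
```

The results stay symbolic until you substitute numbers, for example with `expr.subs(x, 0)`.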

Shared Storage with NFS and SSHFS

HPC systems require shared filesystems to function effectively. Two good choices for both small and large systems are NFS and SSHFS.

Environment Modules Using Lmod

Lmod, an indispensable tool in high-performance computing, lets users control their build and execution environments.

pdsh Parallel Shell

The pdsh parallel shell tool lets you run a command across multiple nodes in a cluster.

Building Containers with HPC Container Maker

Building HPC applications for production systems is never easy, especially when containers are involved, but with Python and HPC Container Maker, you can describe the container you want quickly and easily without having to worry about the details.

pyamgx – Accelerated Python Library

Sometimes your Python programs need a little more speed. The pyamgx library, a Python interface to the NVIDIA AMGX library of GPU-accelerated solvers, can help.

Autonomous File Recovery

Let users recover a deleted file without admin intervention by aliasing the rm command to mv or by writing your own script that moves the data to another location.
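The "move instead of delete" idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the script from the article: the function name `soft_delete` and the choice of trash directory are hypothetical, and a production version would also need a cleanup policy and quota handling.

```python
import shutil
import time
from pathlib import Path


def soft_delete(target, trash_dir):
    """Move target into trash_dir instead of deleting it; return the new path.

    Both the function name and the trash layout are hypothetical choices
    made for this sketch.
    """
    src = Path(target)
    trash = Path(trash_dir)
    trash.mkdir(parents=True, exist_ok=True)

    # Prefix with a nanosecond timestamp so repeated deletions of files
    # with the same name do not overwrite each other.
    dest = trash / f"{time.time_ns()}-{src.name}"
    shutil.move(str(src), str(dest))
    return dest
```

A user recovers a file by moving it back out of the trash directory, for example with `shutil.move(dest, original_path)` or a plain `mv`.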

Linux I/O Schedulers

The Linux kernel has several I/O schedulers that can greatly influence performance. We take a quick look at I/O scheduler concepts and the options that exist within Linux.
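To see which schedulers a given kernel offers, you can read the `queue/scheduler` file for a block device under sysfs; the kernel lists the available schedulers and marks the active one with square brackets. The sketch below parses that format. The function names are hypothetical, and the device name passed to `read_schedulers` (such as `sda`) is an assumption that depends on your system.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Split a sysfs scheduler line into (active, available).

    The kernel marks the active scheduler with brackets,
    e.g. "noop deadline [cfq]".
    """
    active = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_schedulers(device, sysfs_root="/sys/block"):
    """Read and parse the scheduler file for a block device (Linux only)."""
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())
```

On a Linux host, calling `read_schedulers("sda")` returns the active scheduler and the list of alternatives for that device, assuming a device with that name exists.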