Warewulf 4 – Time and Resource Management
Warewulf installed with a compute node is not yet a real HPC cluster; you still need to ensure precise timekeeping and add a resource manager.
Analyzing Logs
Log analysis can be used to great effect in HPC systems. We present an overview of the current log analysis technologies.
Log Management
One of the more mundane, perhaps even boring, but necessary administration tasks is checking system logs – the source of knowledge about what is happening in the cluster.
Parallel I/O Chases Amdahl Away
Scalability abhors serial computation, but parallel I/O can defeat those limitations.
Sharing a Linux Terminal Over the Web
The ability to share a terminal over the web could multiply the effectiveness of admins and users. The tty-share tool might be the answer.
Sharing Linux Terminals
Sometimes sharing a screen between two users is enormously helpful. We look at two terminal sharing tools: screen and tmux.
Performance Health Check
Many HPC systems check the state of a node before running an application, but few check that the node's performance is acceptable before running the job.