Rocky Reaches for a Role in HPC
When Red Hat announced that it would stop supporting CentOS as a free Linux distro based on Red Hat Enterprise Linux (RHEL), the whole Linux IT world had to scramble, but nowhere was that crisis more acute than within the HPC community. CentOS, which offered RHEL-level stability, testing, and performance without the expensive licensing fees, was a cornerstone of the world’s HPC infrastructure.
One reason for the importance of CentOS to the HPC market was cost. The largest HPC systems on the TOP500 list are created with huge budgets where the operating system is less of a factor in the overall cost, but on all the other HPC systems, running in universities, research institutions, and corporate server rooms around the world, economy really does matter. Although Red Hat would love to sell you a paid license for every system in your HPC cluster, the cost is often prohibitive, and, in many cases, unnecessary. CentOS, which was 100 percent compatible with RHEL, could run on the same systems without the licensing cost. Some HPC systems ran all CentOS; others put RHEL on the head nodes and CentOS on the compute nodes.
Either way, the sudden loss of CentOS was a big deal for the HPC community. CentOS gave institutions the flexibility to balance software licensing cost versus hardware cost, and it was clear that something new would have to emerge to fill the niche. According to many HPC users and developers, that something is Rocky Linux. Although other RHEL clones have appeared since Red Hat retired CentOS, Rocky appears to be the one that is most invested in filling the vacancy in the HPC space.
Gregory Kurtzer, who founded Rocky Linux, was one of the founders of CentOS and has long-standing ties with the CentOS community. Kurtzer also has close ties with HPC and is credited with founding several high-profile HPC initiatives, such as Warewulf and Apptainer (formerly Singularity). In parallel with launching Rocky Linux, Kurtzer started a company called CIQ to support Rocky development, consolidate ongoing work for the various HPC projects he is associated with, and drive development of a new generation of tools for compute-intensive workloads. The Rocky Linux project was announced within hours of Red Hat’s news that CentOS was moving upstream (see the box entitled “Where Did CentOS Go?”).
Where Did CentOS Go?
Did Red Hat make CentOS disappear? It depends on how you look at it. The important thing is that CentOS Linux will no longer be based on the final RHEL source code. Red Hat launched a new project called CentOS Stream that is located upstream from RHEL in the development process and thus will not have the tuning, testing, and validation that comes with RHEL.
What Is Rocky?
Rocky Linux bills itself as “bug-for-bug compatible with Red Hat Enterprise Linux” (Figure 1). The Rocky system is compiled from the same sources as RHEL. If you’re an HPC developer or user, and you’re wondering what makes Rocky Linux different from RHEL, the answer is in the energy, the approach, and the connections with the HPC community, in addition to a governance structure that eliminates the possibility of a sudden end like the fate that befell CentOS. According to Kurtzer, Rocky had more than 10,000 members in its Slack space within six weeks of the launch, and the community eventually grew too large to use Slack effectively. The Rocky Linux community now includes many thousands of users through its online forum and Mattermost site.
Rocky’s ties with the HPC community begin with Kurtzer and lead to other Rocky developers working to integrate Rocky Linux with the HPC environment. The Rocky team is currently working closely with the OpenHPC project. (OpenHPC is an effort to consolidate the most important open source HPC components into a single working environment.) And the OpenHPC project is equally interested in building roads back to Rocky. OpenHPC has announced that the testing and installation recipes for CentOS will transition to Rocky Linux instead. According to the OpenHPC project, “With the announcement that CentOS8 is being discontinued at the end of 2021, the OpenHPC project is migrating example installation recipes and associated testing to use Rocky 8.”
OpenHPC is not directly affiliated with any Linux brand, but it does base its development and testing on a few reference distros. The RHEL clones are all compatible, so theoretically, any one of them could run OpenHPC, but the use of Rocky Linux for testing and installation recipes underscores its emergence as the RHEL clone of choice for HPC settings.
The Big Picture
CIQ is currently developing and promoting some tools to expand and extend Rocky within the HPC space. Much of this work centers around open source utilities that Kurtzer developed independently and has now brought into focus through CIQ. These tools are available to everyone, not just CIQ customers, but they are at the center of CIQ’s support offering, and Kurtzer’s presence as a guiding force for Rocky Linux brings additional synergy.
Warewulf and Apptainer/Singularity have both been around for many years. Warewulf is a cluster-management and provisioning tool that has been used in HPC clusters for more than 20 years for large-scale stateless operating system management. The goal of Apptainer is to bring the benefits of container technology to the HPC space. One of those benefits is portability. By encapsulating the software stack with the application, Apptainer offers a chance for flexibility across a complex hardware environment without compromising performance. Another benefit of the container architecture is a security model better suited to clustering and supply-chain security. CIQ envisions combining Rocky Linux, Warewulf, and Apptainer with OpenHPC to form an integrated HPC stack.
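As a minimal sketch of how Apptainer encapsulates a software stack with an application, a definition file bootstraps from a base image, installs dependencies, and declares an entry point. The Rocky base image and the Python payload below are illustrative assumptions, not part of CIQ’s actual stack:

```
# app.def — a hypothetical Apptainer definition file
Bootstrap: docker
From: rockylinux:8

%post
    # Install the application's dependencies inside the container image
    dnf -y install python3

%runscript
    # Entry point executed by "apptainer run"
    exec python3 "$@"
```

Building the image (`apptainer build app.sif app.def`) produces a single SIF file that can be copied to any compute node and launched with `apptainer run app.sif`, which is what makes the software stack portable across a heterogeneous hardware environment.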
CIQ’s ambitious Fuzzball project, on the other hand, charts a whole new parallel direction for HPC and could one day serve as a building block for the next-generation HPC platform, HPC-2.0. The Fuzzball developers refer to their project as a “…cloud-native, cloud-hybrid, neutral, meta-orchestration platform for all performance-intensive workloads and data.” Fuzzball, which will debut later this year, is intended as a complete rethink of the traditional concept of an HPC cluster as a flat, monolithic collection of parallel nodes with shared storage and management through SSH. The Fuzzball environment is API-driven and provides for a configurable HPC infrastructure that is well suited for cloud and cloud hybrid settings.