Exploring AMD’s Ambitious ROCm Initiative
Support for Other Languages
The versatile LLVM/Clang compiler infrastructure means ROCm supports a wide range of programmer preferences within the C/C++ language family, from standard C, to standard C++, to STL parallel extensions, to the GPU-oriented features embodied in C++ AMP. HIP and the HIP conversion options bring CUDA into the mix. Beyond C and C++, the ROCm platform also supports Anaconda, a Python distribution tailored for scientific computing and large-scale data processing.
ROCm also provides native support for the OpenCL (Open Computing Language) framework. OpenCL is an open standard maintained by the nonprofit Khronos Group that was originally envisioned as a framework for heterogeneous computing across CPUs and GPUs in parallel programming environments. In other words, OpenCL shares some goals with ROCm. AMD is a member of the Khronos Group and has invested heavily over the years in OpenCL as a framework for GPU-accelerated programming.
The OpenMP parallel programming API supports offloading to Radeon GPUs through Clang, so developers can run OpenMP target regions on Radeon GPUs without leaving the OpenMP programming model.
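To make the idea of OpenMP offloading more concrete, the following is a minimal sketch rather than code taken from ROCm itself. The function name saxpy and its parameters are illustrative assumptions. It uses only standard OpenMP target directives: the loop is sent to the default offload device when the compiler is invoked with OpenMP offloading enabled, and it runs on the host otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative helper (hypothetical name): computes y = a * x + y.
// With an offload-capable OpenMP compiler the loop body executes on the
// default target device; without OpenMP support the pragma is ignored and
// the loop simply runs serially on the host.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const float* xp = x.data();  // raw pointers so the map clauses can name array sections
    float* yp = y.data();

    // map(to: ...) copies the input to the device; map(tofrom: ...) copies
    // the output buffer to the device and back to the host afterwards.
    #pragma omp target teams distribute parallel for map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

Calling saxpy(2.0f, x, y) updates y in place. The flags needed to target a particular Radeon GPU depend on the Clang version and the GPU architecture, so they are deliberately not shown here.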
HSA-Compliant System Runtime
The foundation for the ROCm environment is the ROCm kernel driver and system runtime stack. The ROCr system runtime API, which resides above the kernel driver, is a language-independent runtime that complies with Heterogeneous System Architecture (HSA) specifications. HSA is an industry standard designed to support the integration of GPUs and CPUs with shared tasks and scheduling.
The modular form of the ROCm system runtime stack means the system runtime could one day support additional programming languages and additional hardware acceleration devices. In the true spirit of heterogeneous computing, the kernel layer below is also implemented as a separate module to facilitate easy porting to other kernel environments.
ROCm also includes a number of APIs and libraries that optimize performance for GPU-based HPC scenarios. For instance, the RCCL (pronounced “rickle”) library, short for ROCm Communication Collectives Library, provides collective communication across multiple GPUs on a single node, in both single- and multi-process configurations, and extends the environment to multi-node communication.
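For readers unfamiliar with collective operations, the sketch below simulates on the host what a sum all-reduce, one of the standard collectives, computes. It does not call RCCL. The function name all_reduce_sum and the use of one vector per "rank" are assumptions made purely for illustration; a real collective library performs the same reduction across GPU buffers and interconnects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-only simulation (hypothetical helper, not an RCCL call).
// Each inner vector stands in for the buffer held by one GPU ("rank").
// After the call, every rank holds the element-wise sum of all buffers.
void all_reduce_sum(std::vector<std::vector<float>>& rank_buffers) {
    if (rank_buffers.empty()) {
        return;
    }
    const std::size_t count = rank_buffers[0].size();

    // Reduce step: accumulate every rank's contribution.
    std::vector<float> total(count, 0.0f);
    for (const auto& buffer : rank_buffers) {
        assert(buffer.size() == count);  // all ranks must contribute equal-sized buffers
        for (std::size_t i = 0; i < count; ++i) {
            total[i] += buffer[i];
        }
    }

    // Broadcast step: every rank receives the reduced result.
    for (auto& buffer : rank_buffers) {
        buffer = total;
    }
}
```

This naive reduce-then-broadcast form is chosen for readability only. Production collective libraries typically use ring or tree algorithms so that no single device becomes a bottleneck.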