High-Performance Python – Distributed Python
Scale Python GPU code to distributed systems and your laptop.
High-Performance Python – GPUs
Python finally has interoperable tools for programming the GPU – with or without CUDA.
High-Performance Python – Compiled Code and Fortran Interface
Fortran functions called from Python bring complex computations to a scriptable language.
High-Performance Python – Compiled Code and C Interface
Although Python is a popular language, in the high-performance world, it is not known for being fast. A number of tactics have been employed to make Python faster. We look at three: Numba, Cython, and ctypes.
OpenMP – Coding Habits and GPUs
In this third and last article on OpenMP, we look at good OpenMP coding habits and present a short introduction to employing OpenMP with GPUs.
Porting Code to OpenACC
OpenACC directives can improve performance if you know how to find where parallel code will make the greatest difference.
OpenACC Directives for Data Movement
The OpenACC data directive and its associated clauses allow you to control the movement of data between the host CPU and the accelerator GPU.
Parallelizing Code – Loops
OpenACC is a great tool for parallelizing applications for a variety of processors. In this article, I look at one of the most powerful directives, loop.