Scale Python GPU code to distributed systems and your laptop.
Python finally has interoperable tools for programming the GPU – with or without CUDA.
Fortran functions called from Python bring complex computations to a scriptable language.
Python is a popular language, but in the high-performance world it is not known for being fast. Of the tactics that have been employed to make Python faster, we look at three: Numba, Cython, and ctypes.
In this third and last article on OpenMP, we look at good OpenMP coding habits and present a short introduction to employing OpenMP with GPUs.
Diving deeper into OpenMP loop directives for parallel code.
The powerful OpenMP parallel do directive creates parallel code for your loops.
OpenACC directives can improve performance if you know how to find where parallel code will make the greatest difference.
The OpenACC data directive and its associated clauses allow you to control the movement of data between the host CPU and the accelerator GPU.
OpenACC is a great tool for parallelizing applications for a variety of processors. In this article, I look at one of the most powerful directives, loop.