Benchmarking Memory Bandwidth
One of the key bottlenecks for HPC application performance is memory bandwidth: literally, how fast you can get data from memory to the processor and back. A convenient microbenchmark named Stream measures the memory bandwidth of nodes and reveals a general trend over the last six years that might surprise you.
New OpenMP 4.0 Spec
The new OpenMP specification abstracts away many of the thorny issues associated with today’s HPC hardware.
Parallel Tools
Even with tons of cores per node today, the traditional tools are still serial-only and use a single core; however, some of the more popular tools have parallel versions that let you use the extra cores either to run the same command in parallel or to perform the same task across multiple nodes.
The RADOS Object Store and Ceph Filesystem – Part 4
We look into some everyday questions that administrators with Ceph clusters tend to ask: What do I do if a fire breaks out or I run out of space in the cluster?
Understanding I/O Patterns with strace, Part I
The language you choose to use affects I/O patterns and performance. We track a simple write I/O pattern with C and look at how to improve performance.
The YARN Invitation
Hadoop version 2 expands Hadoop beyond MapReduce and opens the door to MPI applications operating on large parallel data stores.
Why Isn’t Your Application Scaling?
Your parallel application is running fine, but you want it to run faster. Naturally, you use more and more cores, and everything is great; however, suddenly performance starts decreasing. What just happened?
The Road to End-of-Scale
The quest for exascale performance by the year 2020 is on. Delivering 10^18 FLOPS could be slowed by the speed of light, among other things.
Getting Started with HPC Clusters
Getting started in the HPC world requires learning to write parallel applications and learning to administer and manage clusters. We take a look at some ways to get started.