Embarrassingly parallel computation
Playing Roulette
In a previous article, I examined the basics of parallel programming [1], including key terms such as speedup, the ratio between the runtime of the original, single-threaded implementation and that of the faster parallel version. Amdahl's Law [2] makes the critical observation that only part of an application can be parallelized, which caps the speedup achievable in most programs – and should make you raise an eyebrow in Vulcan fashion when a vendor who has yet to see your code promises double-digit speedups like 10x. Achieving a 10x speedup requires that at least 90 percent of the execution time be parallelizable.
The span of your code's serial section therefore determines how much the application can actually be accelerated: If one minute of a 20-minute process is single threaded, only 95 percent of the algorithm can execute in parallel, and the maximum theoretical speedup is the reciprocal of the 5 percent serial fraction, 1/0.05 = 20x. That limitation led to a search for embarrassingly parallel algorithms, which are characterized by minimal serial sections and for which well-known designs already exist (see the "Embarrassingly Parallel Algorithms" box).
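The arithmetic above can be captured in a few lines. The following sketch (my example, not code from the article) implements Amdahl's Law for a given number of workers and shows the 20x ceiling for the 95-percent-parallel case:

```python
# Sketch: Amdahl's Law as a helper function (illustrative, not article code).
def amdahl_speedup(parallel_fraction, workers):
    """Maximum speedup on `workers` cores when `parallel_fraction`
    of the single-threaded runtime can be parallelized."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / workers)

# The 20-minute job with 1 serial minute: 95% parallel.
print(round(amdahl_speedup(0.95, 8), 2))   # on 8 cores: 5.93
print(round(1 / (1 - 0.95), 2))            # limit as workers grow: 20.0
```

Note how quickly the serial fraction dominates: even with eight cores, the 5 percent serial portion already drags the speedup down to under 6x.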
Embarrassingly Parallel Algorithms
An algorithm is considered "embarrassingly parallel" [3] when no design effort is required to partition the problem into completely separate parts. If no data dependencies exist between the sub-problems, no communication or coordination (and no corresponding computational stall) ever takes place, making the problem trivial to partition. Of course, there is nothing embarrassing about using these algorithms – quite the contrary, doing so is the hallmark of good architecture. Classic examples include Monte Carlo simulation, ray tracing, and brute-force searches, where each worker can process its share of the input independently.
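To make the "no communication" property concrete, here is a minimal sketch (my own example, not the article's code) of an embarrassingly parallel Monte Carlo simulation in the spirit of the article's roulette theme: each worker spins its own wheel with an independent random generator, and the only coordination is summing the results at the very end.

```python
# Sketch: embarrassingly parallel Monte Carlo estimate of the result of a
# straight-up roulette bet. Workers share no state until the final sum.
import random
from multiprocessing import Pool

def simulate_batch(args):
    seed, spins = args
    rng = random.Random(seed)  # independent RNG per worker, no shared state
    winnings = 0
    for _ in range(spins):
        # European wheel: 37 pockets; a single-number bet pays 35:1.
        winnings += 35 if rng.randrange(37) == 17 else -1
    return winnings

if __name__ == "__main__":
    batches = [(seed, 100_000) for seed in range(8)]
    with Pool() as pool:                       # one batch per worker
        total = sum(pool.map(simulate_batch, batches))
    spins = sum(n for _, n in batches)
    print(f"average result per 1-unit bet: {total / spins:+.4f}")
    # should approach the house edge of -1/37, about -0.027
```

Because the batches never exchange data, adding cores scales the spin count almost linearly – the serial section is just the final `sum`.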