Planning Performance Without Running Binaries


Roy Batty's Tears

The power of Amdahl's law is found in its analytical insight. Code is measured in time, not in lines, so some minimal performance testing is still required to establish how much of the runtime is actually serial. If you determine that only 50 percent of an algorithm's critical section can be parallelized, its theoretical speedup can't exceed 2x, as you see in Figure 2. Furthermore, it's not practical to use more than 12 cores to run this code, because it already attains more than 90 percent of the maximum theoretical speedup with 12 cores (a 1.84x speedup). You know this before attempting any optimization, saving you effort if the best possible result is inadequate to achieve your aims.

Figure 2: Maximum theoretical speedup with 50 percent serial code.
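A minimal Python sketch of Amdahl's formula, speedup(n) = 1/((1 - p) + p/n), makes these numbers easy to verify; the helper name amdahl_speedup is arbitrary:

def amdahl_speedup(parallel_fraction, cores):
    # Amdahl's law: the serial part runs unchanged, the parallel part is divided across cores.
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

# 50 percent serial code, as in Figure 2: the ceiling is 1/0.5 = 2x.
for n in (2, 4, 8, 12, 24):
    s = amdahl_speedup(0.50, n)
    print(f"{n:2d} cores: {s:.2f}x ({s / 2:.0%} of the 2x ceiling)")

Twelve cores already land at roughly 92 percent of the ceiling, which is why throwing more hardware at this code quickly stops paying off.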

In an alternative scenario with only five percent serial code in the bottleneck (Figure 3), the asymptote is at 20x speedup. In other words, if you can successfully parallelize 95 percent of the problem, under ideal circumstances the maximum speedup for that problem is 20x. This handy analytical tool lets you quickly determine what can be accomplished by accelerating code for a problem of fixed size.

Figure 3: Theoretical speedup with five percent serial code. The lower curve is from Figure 2 for comparison.
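The same formula applied to the five percent serial case shows the curve creeping toward, but never reaching, its 20x asymptote; again, a minimal, self-contained sketch:

def amdahl_speedup(parallel_fraction, cores):
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

# 5 percent serial code, as in Figure 3: the ceiling is 1/0.05 = 20x.
for n in (8, 32, 128, 512, 4096):
    print(f"{n:4d} cores: {amdahl_speedup(0.95, n):.1f}x")

Even 4,096 cores only reach about 19.9x, so the same analysis also tells you when piling on more cores is pure waste.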

Without invoking the C-beams speech or even going near the Tannhäuser Gate [7], one must point out parallelism's massive overhead, already obvious from the examples here. Execution time can be significantly reduced, yet you accomplish this by throwing resources at the problem – perhaps suboptimally. On the other hand, one could argue that idle CPU cores would not be doing anything productive. All those cycles would be lost … like tears in rain.

The Author

Federico Lucifredi (@0xf2) is the Product Management Director for Ceph Storage at Red Hat and was formerly the Ubuntu Server Product Manager at Canonical and the Linux "Systems Management Czar" at SUSE. He enjoys arcane hardware issues and shell-scripting mysteries and takes his McFlurry shaken, not stirred. You can read more from him in the new O'Reilly title AWS System Administration.
