Fundamentals of I/O benchmarking
Measure for Measure
Profiling
In most cases, you will not be planning systems on a greenfield site but will want to replace or optimize an existing system. The requirements are typically not formulated in technical terms (e.g., latency < 1ms, 2,000 IOPS) but reflect the user's perspective (e.g., make it faster).
Because the storage system is already in use, you can determine its load profile. Excellent tools for this purpose are built into the operating system: iostat and blktrace. Iostat has already been introduced. The blktrace command not only reports the average size of I/O operations, it displays each operation separately with its size. Listing 6 contains an excerpt from the output generated by:

blktrace -d /dev/sda -o - | blkparse -i -

Listing 6: Excerpt from the Output of blktrace

8,0 0 22 0.000440106   0 C WM 23230472 + 16 [0]
8,0 0  0 0.000443398   0 m  N cfq591A / complete rqnoidle 0
8,0 0 23 0.000445173   0 C WM 23230528 + 32 [0]
8,0 0  0 0.000447324   0 m  N cfq591A / complete rqnoidle 0
8,0 0  0 0.000447822   0 m  N cfq schedule dispatch
8,0 0 24 3.376123574 351 A  W 10048816 + 8 <- (8,2) 7783728
8,0 0 25 3.376126942 351 Q  W 10048816 + 8 [btrfs-submit-1]
8,0 0 26 3.376136493 351 G  W 10048816 + 8 [btrfs-submit-1]
Values annotated with a plus sign (+) give you the block size of the operation: in this example, 16, 32, and 8 sectors, corresponding to 8192, 16384, and 4096 bytes. This lets you determine which block sizes are really relevant – and the storage performance for these values can be measured immediately using iops. Blktrace shows much more, because it listens in on the Linux I/O layer. Of particular interest is the opportunity to view the requested I/O block sizes and the distribution of read and write accesses.
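To see which block sizes dominate, you can aggregate the blkparse output yourself. The following sketch counts completed requests by size with awk; the sample lines come from Listing 6, and the field positions assume the default blkparse output format. On a live system, you would pipe the blktrace/blkparse output into the same awk script instead of the here-document.

```shell
# Histogram of completed requests ("C" events) by block size.
# The request size follows the "+" sign and is given in 512-byte sectors.
# Live use: blktrace -d /dev/sda -o - | blkparse -i - | awk '...'
hist=$(awk '$6 == "C" && $9 == "+" { count[$10 * 512]++ }
            END { for (b in count) printf "%d bytes: %d ops\n", b, count[b] }' <<'EOF'
8,0 0 22 0.000440106 0 C WM 23230472 + 16 [0]
8,0 0 23 0.000445173 0 C WM 23230528 + 32 [0]
EOF
)
echo "$hist"
```

For the two sample lines, this reports one operation of 8192 bytes and one of 16384 bytes; on real data, the counts reveal immediately which block sizes to feed into your benchmarks.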
The strace command shows whether the workload is requesting asynchronous or synchronous I/O. Async I/O is based on the principle that a request for a block of data is sent and the program continues working before the answer arrives. When the data block then arrives, the consuming application receives a signal and can respond. As a result, the application is less sensitive to high storage latency and depends more on throughput.
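The distinction shows up in the system calls: Linux-native async I/O appears as io_submit()/io_getevents(), whereas synchronous access uses read()/write() or pread64()/pwrite64(). One way to classify a saved strace log (e.g., written with strace -f -o app.trace -p <PID>) is sketched below; the here-document stands in for a real log file, and its lines are invented for illustration, not actual strace output.

```shell
# Count async (io_submit/io_getevents) vs. synchronous (pread64/pwrite64,
# read, write) calls in an strace log. The sample lines are made up.
summary=$(awk '/^io_submit|^io_getevents/ { async++ }
               /^pread64|^pwrite64|^read\(|^write\(/ { sync++ }
               END { printf "async: %d sync: %d\n", async + 0, sync + 0 }' <<'EOF'
io_submit(..., 1, ...) = 1
io_getevents(..., 1, 32, ...) = 1
pread64(3, ..., 4096, 0) = 4096
EOF
)
echo "$summary"
```

A workload dominated by io_submit()/io_getevents() calls is async and throughput-bound; one dominated by plain reads and writes is latency-sensitive.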
Conclusions
I/O benchmarks have three main goals: identifying the optimal settings to ensure the best possible application performance, evaluating storage systems to be able to purchase or lease the best storage environment, and identifying performance bottlenecks.
Evaluation of storage systems – whether in the cloud or in the data center – should never be isolated from the needs of the application. Depending on the application, storage optimized for throughput, latency, IOPS, or CPU should be chosen. Databases require low latency for their logfiles and many IOPS for their data files; file servers need high throughput. For other applications, the requirements can be determined with programs like iostat, strace, and blktrace.
Once the requirements are known, the exact characteristics of these indicators can be tested using benchmarks. Depending on the question, the performance of the I/O stack with or without the filesystem will be of interest. The iops and iometer tools bypass the filesystem.
Concentrating on the same metrics helps you investigate performance bottlenecks. Often, you will see a correlation between high latency, long request queues (measurable with iostat), and negative user perception. Even if the throughput is far from exhausted, IOPS can cause a bottleneck that slows down storage performance.
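A quick back-of-the-envelope calculation shows why: small random requests exhaust the IOPS budget long before the throughput limit is reached. The figures below are illustrative, not measurements.

```shell
# Illustrative figures: 2,000 IOPS at a 4KiB block size amounts to only
# about 7 MiB/s, far below typical sequential disk throughput.
iops=2000
block=4096                          # bytes per request
mibs=$(( iops * block / 1024 / 1024 ))
echo "$mibs MiB/s"                  # prints: 7 MiB/s
```

So a device can sit at a few percent of its rated bandwidth and still be saturated: the bottleneck is the number of operations, not the number of bytes.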
The write and read caches work in opposite directions: the caching effect causes performance readings to increase within a few cycles for read caches, and to drop after a short time for write caches. Many caches use read-ahead for read operations; for write operations, the effects of I/O elevator scheduling come into play. For both reads and writes, neighboring operations are merged in the cache. Random I/O requests reduce the probability that adjacent blocks are affected – and thus reduce access performance. Random I/O operations are at a greater disadvantage on magnetic hard disks than on SSDs.
Once you are keeping an eye on the key figures and the factors that influence them, you can turn to benchmark tools. In this article, we presented two storage benchmark tools in the form of iometer and iops. For pure filesystem benchmarks, however, fio, bonnie++, or iozone are fine. Hdparm is useful for measuring the physical starting situation.
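The random-versus-sequential effects described above can be pinned down with a fio job file along these lines; all values, including the file path, are illustrative and should be adapted to your system. Run it with fio on the job file.

```ini
; Sequential vs. random read on the same test file (illustrative values)
[global]
filename=/tmp/fio.test
size=256M
; bypass the page cache so the device, not RAM, is measured
direct=1
runtime=30
time_based

[seq-read]
rw=read
bs=1M

[rand-read]
; wait for seq-read to finish before starting
stonewall
rw=randread
bs=4k
```

Comparing the reported bandwidth of the two jobs makes the random I/O penalty of a given device directly visible.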
The blktrace, strace, and iostat tools are useful for monitoring. Iostat gives a good overview of all the indicators, whereas blktrace listens in on individual operations. Strace can indicate whether async I/O is involved, which shifts the focus away from latency toward throughput.
In the benchmark world, it's important to know that complexity causes many pitfalls. Only one thing helps: repeatedly questioning the plausibility of the measurement results.
Infos

- iops: https://github.com/cxcv/iops
- Iometer: http://www.iometer.org
- Fio: https://github.com/axboe/fio
- Bonnie++: http://www.coker.com.au/bonnie++
- Iozone: http://www.iozone.org
- Hdparm: http://sourceforge.net/projects/hdparm