Fundamentals of I/O benchmarking
Measure for Measure
The dd Command
The dd command is a universal copying tool that is also suitable for benchmarks. The advantage of dd is that it works in batch mode, offers the choice between direct I/O and buffered I/O, and lets you select the block size. One disadvantage is that it does not report the number of I/O operations, although you can estimate that number from the block size and the measured throughput. The following example shows how to use dd as a filesystem benchmark,
dd if=file of=/dev/null
or when bypassing the filesystem cache:
dd if=file iflag=direct of=/dev/null
You can even completely bypass the filesystem and access the disk directly:
dd if=/dev/sda of=/dev/null
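Because dd lets you fix the block size, you can also estimate the number of I/O operations from the measured throughput. The following is a minimal sketch; the file name and the numbers in the comments are only illustrative:
# read the file in 1MB blocks, bypassing the page cache
dd if=file iflag=direct of=/dev/null bs=1M
# if dd reports, say, 100MBps at bs=1M, it is issuing roughly
# 100 read operations per second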
The curse and blessing of dd is its universal orientation: It is not a dedicated benchmarking tool, so users are forced to think when running it. To illustrate how easily you can measure something other than what you believed you were measuring, look at the following example. The command
dd if=/dev/sdb
resulted in a value of 44.2MBps on the machine used in our lab. If you conclude on this basis that the read speed of the second SCSI disk reaches this value with the default block size, you will be disappointed by the following command,
dd if=/dev/sdb of=/dev/null
which returns 103MBps – a reproducible difference of more than 100 percent. In reality, the first command measures how quickly the system can output data (e.g., zeros) at the command line. Simply redirecting the output to /dev/null corrects the results.
It's generally a good idea to sanity check all your results. Tests should be run more than once to see whether the results remain more or less constant and to detect caching effects at an early stage. A comparison with a real-world benchmark is always advisable: Users of synthetic benchmarks should copy a large file to check the plausibility of a throughput measurement.
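A minimal sketch of such a sanity check, assuming a sufficiently large test file named file:
# repeat the direct I/O read test to see whether the throughput stays constant
for i in 1 2 3; do
  dd if=file iflag=direct of=/dev/null bs=1M 2>&1 | tail -n 1
done
# real-world plausibility check: copy the file and note the elapsed time
time cp file /tmp/file.copy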
The iostat Command
Admins can best determine the performance limits of a system with benchmarks first and then use monitoring tools to check the extent to which they are utilized. The iostat command is a good choice here; it determines the I/O-specific key figures for the block layer. Listing 5 shows a typical call.
Listing 5
iostat Example
# iostat -xmt 1 /dev/sda
Linux 3.16.7-21-desktop (tweedleburg)   08/07/15   _x86_64_   (8 CPU)

08/07/15 17:25:13
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.40    0.00    0.54    1.66    0.00   96.39

Device:  rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda      393.19    2.17  137.48    2.73    2.11    0.07    31.82     0.32    2.32    2.18    8.94   0.10   1.46
In the listing, rMB/s and wMB/s specify the throughput the system achieves for read and write access, and r/s and w/s are the read and write operations per second (IOPS) passed down to the SCSI layers. The await column is the average time an I/O operation takes to complete; it includes both the time spent waiting in the queue and the service time reported in svctm and thus corresponds to the latency seen by the layers above. It is closely related to avgqu-sz, the average length of the request queue.
The %util column shows the percentage of time the device was busy servicing I/O requests (device utilization). The rrqm/s and wrqm/s columns report how many read and write requests from the overlying layers were merged per second before being issued to the device.
The average size of the requests issued to the device (in sectors) is given by avgrq-sz. You can use this to discover whether or not the desired block size actually reaches storage (more precisely: the SCSI layers). The operating system tends to restrict the block size based on the /sys/block/sdb/queue/max_sectors_kb setting. The block size is limited to 512KB in the following case:
# cat /sys/block/sdb/queue/max_sectors_kb
512
The maximum block size can be changed to 1MB using:
# echo "1024" > /sys/block/sdb/queue/max_sectors_kb
In practice, iostat supports versatile use. If complaints about slow response times correlate with large values of avgqu-sz, which are typically associated with high await latency, the performance bottleneck is probably your storage. Purchasing more CPU power, as is typically possible in cloud-based systems, will not have the desired effect in this case.
Instead, you can explore whether the I/O requests are being split up or are located near the limit set by /sys/block/sdb/queue/max_sectors_kb. If this is the case, increasing the value might help. After all, a few large operations per second provide higher throughput than many small operations. If this doesn't help, the measurement at least delivers a strong argument for investing in more powerful storage or a more powerful storage service.
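To check this, compare avgrq-sz with the current limit. A minimal sketch, assuming the second disk /dev/sdb is the device in question:
# current upper limit for a single request, in KB
cat /sys/block/sdb/queue/max_sectors_kb
# watch avgrq-sz, which is reported in 512-byte sectors; a constant value
# of 1024 sectors corresponds to 512KB, i.e., requests are hitting the limit
iostat -xm 1 /dev/sdb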
The iostat command also supports statements on the extent to which storage is utilized in terms of throughput, IOPS, and latency. This allows predictions in the form of "If the number of users doubles and the users continue to consume the same amount of data, the storage will not be able to cope" or "The storage could achieve better throughput, but the sheer number of IOPS is slowing it down."
Other Tools
Other benchmark tools include fio (flexible I/O tester) [3], bonnie++ [4], and iozone [5]. Hdparm [6] determines hard disk parameters. Among other things, hdparm -tT /dev/sda can measure throughput for reading from the cache, which lets you draw conclusions about the bus width.
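A minimal example, run as root (the device name is only illustrative):
# -T measures reads from the Linux buffer cache, -t measures buffered disk reads
hdparm -tT /dev/sda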
With respect to I/O, iostat can do everything vmstat does, which is why I don't cover vmstat here. Iotop helps you discover which processes are generating I/O load. The cat and cp tools from the Linux toolkit can also be used for real-world benchmarks, such as copying a large file, but they only provide guidance on throughput.
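A short sketch of both approaches, assuming a large test file named bigfile:
# show only processes that are currently generating I/O, three samples in batch mode
iotop -o -b -n 3
# crude real-world throughput check: read a large file and time it
time cat bigfile > /dev/null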