Linux I/O Schedulers
A Schedule to Keep
The Linux kernel is a very complex piece of software used on a variety of computers, including embedded devices that need real-time performance, hand-held devices, laptops, desktops, servers, database servers, video servers, and very large supercomputers, including all of those in the TOP500 [1]. These computers have very different requirements. Some need to stay responsive to user input so that streaming music, video, or other interactive work is not interrupted. At the same time, they need good I/O performance to make sure data is saved properly, and some workloads demand very high I/O throughput. To meet these competing requirements, the kernel uses schedulers.
Schedulers do exactly what the name says: They schedule activities within the kernel so that system activities and resources achieve an overall goal for the system. This goal could be low latency for input (as embedded systems require), better interactivity, faster I/O, or even a combination of goals. Primarily, schedulers are concerned with CPU resources, but they could also consider other system resources (e.g., memory, input devices, and networks).
The focus of this article is the I/O scheduler, including I/O scheduler concepts and the various options that are available for I/O tuning.
Intro to I/O Schedulers
Virtually all applications running on Linux do some sort of I/O. Even surfing the web writes a number of small files to disk. Without an I/O scheduler, every I/O request would send an interrupt to the kernel so that the I/O operation could be performed, moving the disk head around different blocks to satisfy the read and write requests. When the kernel has to address an interrupt, any kind of processing or interactive work is paused, so the system may appear unresponsive or slow. Moreover, the disparity between the performance of disk drives and the rest of the system has grown rapidly over time, making I/O ever more important to overall system performance.
How do you schedule the I/O requests to preserve interactivity while also ensuring good I/O performance? The answer, as with most things, depends on the workload. In some cases, it would be nice to be able to do I/O while doing other things. In other cases, doing I/O as fast as possible is required, such as when a distributed application is creating a checkpoint. To balance these two very different workloads or to ensure that one workload is emphasized, an I/O scheduler is used.
The scheduling of I/O events must address many parts. For example, the scheduler might need to store events for some future execution. How it stores the events, the possible reordering of events, the length of time it stores the events, the execution of stored events when some condition is reached, the length of I/O execution, and so on are all crucial aspects of the scheduler. Exactly how these various aspects of the scheduler are implemented can have a huge effect on the overall I/O performance of the system and the perception of users when interacting with the system.
Defining the function or role of the system is probably the best place to start when considering scheduler design or the tuning of existing schedulers. For example, you should know whether the target system is an embedded device, a hand-held device, a laptop, desktop, server, supercomputer, and so on so that you can define what your goals are for the scheduler.
For example, suppose your target system is a desktop user doing some web surfing, perhaps playing a video or music, and maybe even running a game. Although this is a simple and common scenario, this mix of workloads has enormous implications. If you are watching a video or listening to music or playing a game, you don't want it to be interrupted. There's nothing like a video that pauses – plays – pauses – plays to stretch your patience in a hurry. When gaming, if the system pauses while you are about to blow the head off a mutant zombie, you might find that the zombie has removed your character's head when the system returns control to your game. Although "stuttery" music might be a legitimate genre to some, in general, it's quite annoying. Therefore, a desktop target system that requires as little interruption of interactive programs as possible has a great influence on the design of the scheduler.
Disk I/O can be much slower than other aspects of the system. Because I/O scheduling allows you to store events and possibly reorder them, it's possible to produce contiguous I/O requests to improve performance. Newer filesystems are incorporating some of these concepts, and you can even extend these concepts to make the system better adapt to the properties of SSDs.
I/O schedulers typically use the following techniques:
- Request Merging. Adjacent requests are merged to reduce disk seeking and to increase the size of the I/O requests sent to the device (usually resulting in higher performance).
- Elevator. Requests are ordered on the basis of physical location on the disk so that seeks are in one direction as much as possible. This technique is sometimes referred to as "sorting."
- Prioritization. Requests are prioritized in some way. The details of the ordering are up to the I/O scheduler.
Almost all I/O schedulers also take resource starvation into account so that all requests are serviced eventually.
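To make the first two techniques concrete, the following is a minimal Python sketch of request merging and elevator ordering. It is a toy model, not kernel code: The Request class, the sector units, and the single up-then-down sweep are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Request:
    sector: int  # starting sector (hypothetical units)
    count: int   # number of sectors requested


def merge_adjacent(requests):
    """Sort by start sector, then merge requests that touch or overlap."""
    merged = []
    for req in sorted(requests, key=lambda r: r.sector):
        if merged and req.sector <= merged[-1].sector + merged[-1].count:
            last = merged[-1]
            end = max(last.sector + last.count, req.sector + req.count)
            last.count = end - last.sector  # grow the previous request
        else:
            merged.append(Request(req.sector, req.count))  # copy, keep input intact
    return merged


def elevator_order(requests, head):
    """Serve requests at or above the head position in ascending order,
    then the remaining ones in descending order on the way back."""
    up = sorted((r for r in requests if r.sector >= head),
                key=lambda r: r.sector)
    down = sorted((r for r in requests if r.sector < head),
                  key=lambda r: r.sector, reverse=True)
    return up + down


pending = [Request(100, 8), Request(8, 8), Request(0, 8),
           Request(108, 4), Request(50, 2)]
merged = merge_adjacent(pending)
ordered = elevator_order(merged, head=40)
```

In this example, five pending requests collapse into three, and the head at (assumed) sector 40 sweeps upward before turning around. A real scheduler would also bound how long a request can wait, which is where starvation handling comes in.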
These techniques and others are combined to create an I/O scheduler that pursues a few goals, the top three being:
- Minimize disk seeks
- Ensure optimum disk performance
- Provide fairness among I/O requests
Balancing these goals or trading them against one another is the essence of the art of creating an I/O scheduler.
The techniques used by I/O schedulers as they apply to SSDs are a bit different. SSDs are not spinning media, so merging requests and ordering them might not have much of an effect on I/O. Instead, I/O requests to the same block can be merged, and small I/O writes can either be merged or adjusted to reduce write amplification (i.e., the device physically writing more data than the logical write would imply because of the way write operations take place on SSDs).
Linux I/O Schedulers
Because Linux is an open-source project, developers can submit additions to the kernel for inclusion, and over the years, several I/O schedulers have been proposed. Currently, four are included in the kernel:
- Completely Fair Queuing (CFQ)
- Deadline
- NOOP
- Anticipatory
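On kernels that expose it, the file /sys/block/<device>/queue/scheduler lists the schedulers available for a block device, with the active one in square brackets. The short Python sketch below parses that format. The helper names are made up for this example, the device name sda is only a placeholder, and the list you see depends on your kernel version and configuration.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Return (active, available) from text such as 'noop deadline [cfq]'."""
    active = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # strip the brackets marking the active scheduler
            active = name
        available.append(name)
    return active, available


def read_scheduler(device):
    """Read the scheduler list for a block device (e.g., 'sda', a placeholder)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())


# Parsing a sample line avoids depending on any particular machine:
active, available = parse_scheduler_line("noop deadline [cfq]\n")
```

Writing one of the listed names to the same file as root switches the scheduler for that device until the next reboot.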
NOOP
The NOOP I/O scheduler [2] is fairly simple. All incoming I/O requests for all processes running on the system, regardless of the type of request (e.g., read or write), go into a simple first in, first out (FIFO) queue. The scheduler also does request merging by taking adjacent requests and merging them into a single request to reduce seek time and improve throughput. NOOP assumes that some other device will optimize I/O performance, such as an external RAID controller or a SAN controller.
Potentially, the NOOP scheduler could work well with storage devices that don't have a mechanical component (i.e., a drive head) to read data, because it does not make any attempts to reduce seek time beyond simple request merging (which helps throughput). Therefore, storage devices such as flash drives, SSDs, USB sticks, and the like that have very little seek time could benefit from a NOOP I/O scheduler.
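The behavior described for NOOP can be approximated in a few lines of Python. This is a simplified sketch, not the kernel implementation: The NoopQueue class is hypothetical, and it merges a new request only when it starts exactly where the most recently queued request ends, which is narrower than the merging the kernel performs.

```python
from collections import deque


class NoopQueue:
    """Toy FIFO queue with back-merging of adjacent requests."""

    def __init__(self):
        self._fifo = deque()  # entries are [start_sector, sector_count]

    def submit(self, sector, count):
        # Back-merge: extend the newest queued request if the new one
        # begins exactly where that request ends.
        if self._fifo:
            last = self._fifo[-1]
            if last[0] + last[1] == sector:
                last[1] += count
                return
        self._fifo.append([sector, count])

    def dispatch(self):
        """Hand out the oldest request, or None if the queue is empty."""
        return tuple(self._fifo.popleft()) if self._fifo else None


q = NoopQueue()
q.submit(100, 8)
q.submit(108, 8)   # adjacent to the previous request, so it is merged
q.submit(0, 8)     # not adjacent, so it is queued behind it
q.submit(116, 8)   # not adjacent to the newest entry, so no merge
dispatched = [q.dispatch() for _ in range(4)]
```

Note that requests leave the queue in arrival order, with no sorting by location. That is acceptable when seeks are cheap or when a downstream controller reorders requests itself.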