Lead Image © arsgera, 123RF.com

Linux Storage Stack

Stacking Up

Article from ADMIN 31/2016

By Werner Fischer and Georg Schönberger

Abstraction layers are the alpha and omega in the design of complex architectures. The Linux Storage Stack is an excellent example of well-coordinated layers. Access to storage media is abstracted through a unified interface, without sacrificing functionality.

In the context of the storage stack, the end users are typically ordinary userspace applications. The first component with which Linux programs interact when processing data is the virtual filesystem (VFS). Only through the VFS is it possible to invoke the same system calls for different filesystems on different media. Thanks to the VFS, for example, the cp command can copy a file from an ext4 to an ext3 filesystem, and the difference between the two filesystems remains transparent to the user.
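As a rough illustration of what a tool like cp relies on, the following Python sketch copies a file with nothing but the generic open, read, and write calls. The helper name copy_file and the chunk size are assumptions made for this example; the code never needs to know which filesystem sits underneath the source or the destination path.

```python
import os
import tempfile


def copy_file(src_path, dst_path, chunk_size=64 * 1024):
    """Copy src_path to dst_path with plain open/read/write calls.

    Returns the number of bytes copied. Hypothetical helper for illustration.
    """
    copied = 0
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(src_fd, chunk_size)
                if not chunk:  # an empty result signals end of file
                    break
                # write() may accept fewer bytes than offered, so loop.
                view = memoryview(chunk)
                while view:
                    written = os.write(dst_fd, view)
                    view = view[written:]
                    copied += written
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
    return copied


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "source.txt")
        dst = os.path.join(tmp, "copy.txt")
        with open(src, "wb") as handle:
            handle.write(b"hello storage stack\n")
        print(copy_file(src, dst), "bytes copied")
```

If the two paths happened to live on an ext4 and an ext3 mount, the same code would run unchanged, because the VFS dispatches each call to the matching filesystem implementation.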

The variety of filesystems – block-based, cross-network, pseudo-, and even Filesystems in Userspace (FUSE) – demonstrates the numerous possibilities that VFS encapsulation opens up. The aforementioned system calls are unified functions, such as open, read, or write, no matter what filesystem is hidden underneath. The specific filesystem operations are abstracted by the VFS, and caches – including the directory entry (dentry) cache – speed up file access.
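The uniformity of these calls can be sketched by reading a pseudo-file in exactly the same way as a regular file. The example below assumes a Linux system that exposes /proc/sys/fs/dentry-state, whose first two fields report the number of allocated and unused dentries; the helper names read_all and parse_dentry_state are made up for this illustration.

```python
import os
import tempfile

# Linux-specific pseudo-file; assumed to exist only on Linux systems.
DENTRY_STATE = "/proc/sys/fs/dentry-state"


def read_all(path, chunk_size=4096):
    """Read a whole file with open/read, whatever filesystem backs it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


def parse_dentry_state(raw):
    """Extract the first two counters (nr_dentry, nr_unused) from the text."""
    fields = [int(token) for token in raw.decode("ascii").split()]
    return {"nr_dentry": fields[0], "nr_unused": fields[1]}


if __name__ == "__main__":
    # A regular file on a disk-backed or temporary filesystem ...
    with tempfile.NamedTemporaryFile() as regular:
        regular.write(b"plain file content\n")
        regular.flush()
        print(read_all(regular.name))
    # ... and a pseudo-file generated by the kernel on demand.
    if os.path.exists(DENTRY_STATE):
        print(parse_dentry_state(read_all(DENTRY_STATE)))
```

The same read_all function serves both cases; only the path differs.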

The next layer of the storage stack consists of the individual filesystem implementations. They implement the generic methods that the VFS expects and translate them into specific calls for accessing the device. Each filesystem also performs its primary task: organizing data and metadata for an underlying storage medium. In addition, the Linux kernel speeds up access to these media with a caching mechanism – the Linux page cache.
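One visible consequence of the page cache is that a successful write does not necessarily mean the data has reached the storage medium. The following sketch, with the hypothetical helper write_durably, shows the common pattern of asking the kernel to flush a file's cached data with fsync before closing it.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data to path and ask the kernel to flush it to the device.

    Hypothetical helper for illustration.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # A buffered write normally lands in the page cache first.
            view = view[os.write(fd, view):]
        # fsync blocks until the kernel reports the file's data as written out.
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "journal.log")
        write_durably(target, b"committed record\n")
        with open(target, "rb") as handle:
            print(handle.read())
```

Reading the file back immediately afterward is usually served from the page cache rather than from the device, which is the speedup described above.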

Block I/O

For bookkeeping in the kernel, flexible block I/O structures (BIOs) are used instead of pages. These structures represent block I/O operations or requests that the kernel is currently executing (in-flight BIOs). This applies both to I/O on the page cache and to direct I/O (i.e., access that bypasses the page cache). The advantage of BIOs lies in handling multiple segments involved in the current I/O operation.
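The idea of one I/O operation spanning several memory segments has a loose userspace counterpart in scatter/gather I/O. The sketch below is only an analogy and not the kernel's BIO structure: it uses Python's os.writev and os.readv to move several separate buffers in a single call. The helper names gather_write and scatter_read are assumptions for this example.

```python
import os
import tempfile


def gather_write(path, segments):
    """Write several separate buffers to path with one writev call."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # For small writes to a regular file this normally writes everything;
        # robust code would still check the return value for short writes.
        return os.writev(fd, segments)
    finally:
        os.close(fd)


def scatter_read(path, segment_sizes):
    """Read from path into several separate buffers with one readv call."""
    buffers = [bytearray(size) for size in segment_sizes]
    fd = os.open(path, os.O_RDONLY)
    try:
        total = os.readv(fd, buffers)
    finally:
        os.close(fd)
    # Trim each buffer to the bytes actually filled.
    result = []
    remaining = total
    for buf in buffers:
        take = min(len(buf), remaining)
        result.append(bytes(buf[:take]))
        remaining -= take
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "segments.bin")
        print(gather_write(target, [b"head", b"body", b"tail"]))
        print(scatter_read(target, [4, 4, 4]))
```

Here the list of buffers plays the role that the segment list plays in a BIO: one request, several non-contiguous pieces of memory.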

BIOs consist of a list or a vector of segments that point to different pages in memory. This

...
