Aligning filesystem partitions
Lining Them Up!
Filesystem partitions and page boundaries need to be aligned to native hardware properties to achieve maximum performance. Tools usually do the right thing, but not always, and an informed administrator is the most effective line of defense against this pitfall.
Spinning Media
Hard disk platters are divided into thousands of concentric circular tracks. Tracks are in turn divided into sectors, which represent the minimum data size that can be written to a disk. This has historically been set at 512 bytes, but the newest spinning media drives sport 4KB sectors, a transition necessary to make error correction codes more efficient in storage terms. The term cylinder was used to refer to all tracks of the same diameter located on all platters of a drive. Finally, clusters refer to the minimum file allocation size a filesystem manages and effectively represent the smallest possible disk allocation for a file. (A smaller file would be padded with slack space to that minimum allocation.)
Things are complicated further by the fact that the abstractions used to represent disk structure are woefully obsolete. As spinning disk technology advanced, disks started to lie outright [1] about their geometry, in effect using disk attributes as an abstraction to interface with the operating system rather than as a real description of disk structure, as I discussed in a previous article [2]. This has escalated even further with the introduction of SSDs, which obviously do not possess platters or tracks at all, but use the same disk description to receive and service the system's requests.
Dishonest Drives
The transition from 512-byte to 4KB sectors found in Advanced Format drives has consequences, because permanent storage I/O is optimally performed in multiples of a disk's sector size. Ideally, applications see little impact on read performance, because drives internally map legacy 512-byte requests onto aligned 4KB reads [3]. That is no longer the case when I/O operations are not aligned to 4KB boundaries, because a request that straddles a boundary may require a second sector read. In configurations where new hardware is set to emulate older behavior, a 4,096-byte read must precede every 512-byte write, and the new 512 bytes are then overlaid on the results of the read operation before the full sector is written back.
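The effect of a misaligned request can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a model of any particular drive's firmware; the function names are hypothetical, and the 4,096-byte physical sector size is an assumption matching the Advanced Format case discussed above.

```python
PHYSICAL_SECTOR = 4096  # assumption: Advanced Format physical sector size


def physical_sectors_touched(offset, length, physical=PHYSICAL_SECTOR):
    """Count the physical sectors overlapped by `length` bytes at byte `offset`."""
    if length <= 0:
        return 0
    first = offset // physical
    last = (offset + length - 1) // physical
    return last - first + 1


def needs_read_modify_write(offset, length, physical=PHYSICAL_SECTOR):
    """True if a write covers at least one physical sector only partially."""
    if length <= 0:
        return False
    return offset % physical != 0 or (offset + length) % physical != 0


if __name__ == "__main__":
    # A 4KB request that starts on a boundary stays within one sector...
    print(physical_sectors_touched(0, 4096), needs_read_modify_write(0, 4096))
    # ...but the same request shifted by one legacy sector straddles two.
    print(physical_sectors_touched(512, 4096), needs_read_modify_write(512, 4096))
```

A legacy 512-byte write is the other case worth noting: it touches a single physical sector but still covers it only partially, so the drive has to read the sector first.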
This "compatibility mode" of Advanced Format 512e drives results in a write performance degradation of at least 50 percent because two I/O operations are now necessary for each write, and the actual penalty can be considerably worse, because spinning media devices have to wait a full revolution to write over the same sector they just read [1]. In a worst-case read-modify-write situation, a staggering 11ms penalty may be incurred by write operations on a 5,400RPM disk [4].
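The 11ms figure is consistent with the time a 5,400RPM platter needs for one full revolution, which is simple to verify. The helper below is a hypothetical name introduced for illustration; it ignores seek time and any caching the drive may do.

```python
def revolution_ms(rpm):
    """Time for one full platter revolution, in milliseconds."""
    return 60_000.0 / rpm  # 60,000 ms per minute divided by revolutions per minute


if __name__ == "__main__":
    for rpm in (5400, 7200, 15000):
        print(f"{rpm} RPM: {revolution_ms(rpm):.1f} ms per revolution")
```

Faster spindles shrink the penalty, but a write that has to wait for the sector to come around again still pays roughly one revolution.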
These mismatched combinations of drives and filesystem are most likely to occur when an old system is being upgraded or retrofitted. A particularly extreme example of this is the jumper that some hardware manufacturers have added to their newest drives, often dubbed the "XP Compatibility" facility. Historically, DOS has implicitly aligned to sector 63 [5], which has the misfortune of being off by just one sector from a 4KB boundary. Setting this jumper will adjust sector addressing "in hardware" off by one, mapping address n into physical sector n+1. You want to be extremely careful not to enable this unless it is actually called for [6], or you may see degraded performance for a setup that externally appears perfectly aligned.
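The arithmetic behind the sector 63 problem and the jumper can be checked directly. The sketch below assumes 512-byte logical sectors on a 4,096-byte physical sector drive; the `shift` parameter is a hypothetical stand-in for the jumper's n to n+1 remapping, not a real drive interface.

```python
LOGICAL = 512    # assumption: legacy logical sector size in bytes
PHYSICAL = 4096  # assumption: Advanced Format physical sector size in bytes


def is_aligned(start_sector, shift=0):
    """True if a partition starting at `start_sector` begins on a physical
    sector boundary. `shift` models a jumper that maps address n to n + shift."""
    return ((start_sector + shift) * LOGICAL) % PHYSICAL == 0


if __name__ == "__main__":
    print(is_aligned(63))        # DOS-style start, no jumper
    print(is_aligned(63, 1))     # DOS-style start, jumper set
    print(is_aligned(2048))      # 1MB-aligned start, no jumper
    print(is_aligned(2048, 1))   # 1MB-aligned start, jumper set by mistake
```

The last case is the trap described above: a partition table that looks perfectly aligned to the operating system ends up misaligned on the platters once the jumper shifts every address by one.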
You can check the properties of your first disk by looking up /sys/block/sda/queue/physical_block_size, which should in theory hold the correct sector size. But, given the "disks lie" mantra, you must make certain by identifying the drive (/sys/block/sda/device/model) and looking up the actual specs on the manufacturer's website.
Some drives will also report an alignment offset in /sys/block/sda/alignment_offset, which the kernel expresses in bytes, like the previous value. Drives exhibit best performance when access is aligned to internal sector size, so that should be your first consideration. Without further data, the rule of thumb adopted by Microsoft in Windows 7 of aligning to 1MB (2048x512 and 256x4096, both equal 1,048,576) works for both popular spinning media types and is sure to exert a gravitational pull on SSD manufacturers as well.
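The sysfs lookups described above can be gathered in one place with a short script. This is a minimal sketch: the function name and the `sysfs_root` parameter are introduced here for illustration (the latter so the code can be pointed at a test directory), and queue/logical_block_size is read in addition to the files named in the text. Values are returned exactly as the kernel reports them, and anything missing comes back as None.

```python
from pathlib import Path


def read_block_properties(device="sda", sysfs_root="/sys/block"):
    """Collect the sysfs values relevant to alignment for one block device."""
    base = Path(sysfs_root) / device

    def read_text(relative):
        try:
            return (base / relative).read_text().strip()
        except OSError:
            return None  # file absent or unreadable on this system

    def read_int(relative):
        text = read_text(relative)
        try:
            return int(text) if text is not None else None
        except ValueError:
            return None

    return {
        "model": read_text("device/model"),
        "logical_block_size": read_int("queue/logical_block_size"),
        "physical_block_size": read_int("queue/physical_block_size"),
        "alignment_offset": read_int("alignment_offset"),
    }


if __name__ == "__main__":
    for key, value in read_block_properties("sda").items():
        print(f"{key}: {value}")
```

Whatever the script prints, the caveat from the text still applies: compare the reported model against the manufacturer's published specifications before trusting the sector size.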
RAID
Microsoft has published remarkable data showing a 30 percent improvement in both disk latency and query timing when aligned SQL Server systems are compared with those whose partitions were out of alignment as a result of system upgrades. Newly created partitions in Server 2008 are typically aligned; however, that is not the case for those created in Server 2003, and a system upgrade will not change the preexisting RAID setup [7].
Across a striped disk array, a stripe unit is the element of a RAID stripe stored in a single disk. A single I/O request coming into the array will turn into multiple I/O requests when the request crosses stripe unit boundaries. The effect is cumulative and can contribute to significant performance degradation, as Microsoft's results show.
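How a single request fans out across stripe units can be sketched as follows. The layout is an assumption made for illustration only: plain striping without parity, a 64KB stripe unit, and four disks; real controllers differ, so treat the function (a hypothetical name) as a way to reason about boundary crossings rather than a description of any vendor's array.

```python
def split_across_stripe_units(offset, length, stripe_unit=64 * 1024, disks=4):
    """Split a request into (disk_index, bytes) pieces, one per stripe unit touched.

    Assumes simple striping: stripe unit k lives on disk k % disks.
    """
    pieces = []
    end = offset + length
    while offset < end:
        unit = offset // stripe_unit
        unit_end = (unit + 1) * stripe_unit
        chunk = min(end, unit_end) - offset
        pieces.append((unit % disks, chunk))
        offset += chunk
    return pieces


if __name__ == "__main__":
    # A 64KB request on an aligned partition stays on one disk.
    print(split_across_stripe_units(0, 64 * 1024))
    # The same request on a partition starting at sector 63 hits two disks.
    print(split_across_stripe_units(63 * 512, 64 * 1024))
```

Every request issued through the misaligned partition pays for the extra piece, which is why the effect accumulates.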
Aligning RAID stripes should be done with the assistance of your vendor's documentation, but the process has been documented very well by HP [8] and expanded on by Ian Chard [9] in a quick how-to. Once you have determined the number of sectors you want to align to, you can create the partition with
(parted) mkpart primary 2048s 100%
for a Redmond-styled partition starting at sector 2048. Do not forget the trailing "s" or Parted will prompt you. You can then check the alignment against what the tool knows using:
(parted) align-check optimal 1
1 aligned
Tools won't always agree with your determination, particularly if you are using an older distribution: Parted sometimes generates alignment warnings even when invoked with parted -a optimal. In that case, you want to dig deeper and determine whether your tooling is out of date or you made a mistake. Now you have all the necessary knowledge to do so.
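When you need to double-check a tool's verdict, it helps to compute the start sector yourself. The sketch below rounds a minimum start sector up to the next multiple of an alignment size and formats the matching mkpart sub-command; the function names are hypothetical, 512-byte logical sectors are assumed, and the RAID figures in the usage example (a 64KB stripe unit across three data disks) are made up for illustration.

```python
def aligned_start_sector(min_start_sector, align_bytes=1024 * 1024, logical=512):
    """Round `min_start_sector` up to the next sector on an `align_bytes` boundary."""
    if align_bytes <= 0 or align_bytes % logical != 0:
        raise ValueError("alignment must be a positive multiple of the sector size")
    step = align_bytes // logical  # alignment expressed in logical sectors
    return -(-min_start_sector // step) * step  # ceiling division, then scale


def mkpart_command(start_sector):
    """Format the Parted sub-command, including the trailing 's' unit."""
    return f"mkpart primary {start_sector}s 100%"


if __name__ == "__main__":
    # The 1MB rule of thumb, starting from the old DOS default of sector 63.
    print(mkpart_command(aligned_start_sector(63)))
    # Hypothetical array: 64KB stripe unit x 3 data disks = 192KB full stripe.
    print(mkpart_command(aligned_start_sector(2048, align_bytes=3 * 64 * 1024)))
```

If the number you compute disagrees with what align-check reports, that is the cue to look at the values the kernel exposes for the device and at the age of your Parted build.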
Infos
[1] "Disks from the Perspective of a File System" by Marshall Kirk McKusick, ACM Queue: http://queue.acm.org/detail.cfm?id=2367378
[2] "Heavy Rotation" by Federico Lucifredi, ADMIN, Issue 14, pg. 88
[3] Advanced Format 512e drives: http://en.wikipedia.org/wiki/Advanced_Format#512e
[4] Partition alignment of drives with internal sector size larger than 512 bytes: http://www.novell.com/support/kb/doc.php?id=7007193
[5] Master boot record: http://en.wikipedia.org/wiki/Master_boot_record
[6] "Linux on 4KB-sector disks: Practical advice" by Roderick W. Smith: http://www.ibm.com/developerworks/linux/library/l-4kb-sector-disks/
[7] Disk Partition Alignment Best Practices for SQL Server: http://msdn.microsoft.com/en-us/library/dd758814%28v=sql.100%29.aspx
[8] Parted partition alignment for best performance: http://h10025.www1.hp.com/ewfrf/wc/document?cc=us&lc=en&docname=c03479326
[9] "How to align partitions for best performance using parted" by Ian Chard: http://rainbow.chard.org/2013/01/30/how-to-align-partitions-for-best-performance-using-parted/