SDS configuration and performance

Put to the Test

Bonnie Voyage

The test uses Bonnie++ and IOzone (see the "Benchmarks" box). In the first run, Bonnie++ tests byte-by-byte and then block-by-block writes, using the putc() macro and the write(2) system call, respectively. The test then overwrites the blocks in a rewrite pass and measures the data throughput.

Benchmarks

In this test, I used Bonnie++ 1.97 and IOzone 3.429. The first test is sequential block-by-block and byte-by-byte writing, reading, and overwriting. The software also creates files, both sequentially and randomly, and deletes them again. IOzone is limited to testing writing, reading, and overwriting, but it does so with different block sizes.
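
A Bonnie++ call for such a run might look like the following; the mount point, file size, and label are examples rather than the exact parameters used in the lab:

# Example only: run Bonnie++ against a mounted SDS volume.
# /mnt/sds, the 4GB file size, and the label are placeholders.
bonnie++ -d /mnt/sds -s 4g -n 16 -m sds-client -u root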

The test begins by emptying the caches with

echo 3 > /proc/sys/vm/drop_caches

to test the performance of the filesystem, not that of the client's cache memory. Each test was repeated twice for each filesystem, and I then averaged the results of the respective runs.
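
Scripted, the procedure looks roughly like this; the mount point and Bonnie++ parameters are again placeholders, and the point is simply that the caches are flushed before every run and that each run is repeated:

# Sketch of the test loop: drop the caches before each run,
# repeat each run, and average the results afterward.
for run in 1 2; do
    sync                               # flush dirty pages to disk first
    echo 3 > /proc/sys/vm/drop_caches  # drop page cache, dentries, and inodes
    bonnie++ -d /mnt/sds -s 4g -n 16 -u root -q >> results-$run.csv
done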

Bonnie++ also recorded the CPU load during the test. With a fast network connection or local disks, byte-by-byte writing is therefore more a test of the client's CPU than of the storage.

CephFS and Ceph RADOS wrote significantly faster, at 391 and 386KBps, than Gluster at 12KBps and Lizard at 19KBps (Figure 2). However, the client's CPU load for the two Ceph candidates was well over 90 percent, whereas Bonnie++ measured only 30 percent for Lizard and 20 percent for Gluster.

Figure 2: The two Ceph candidates write data faster, but also place a significantly greater load on the CPU.

Here, an NFS mount between the client and <Fileserver_2> managed 530KBps. The differences were less striking for block-by-block writing. Both Ceph variants and Lizard were very similar, whereas Gluster achieved about half their performance (Figure 3): Ceph RADOS achieved 11.1MBps, CephFS 11.5MBps, Lizard 11.4MBps, and Gluster 5.7MBps. In comparison, the local NFS connection only managed 10.6MBps. Ceph and Lizard presumably achieved higher throughput here thanks to distribution over multiple servers.

Figure 3: The result looks a little more balanced for block-by-block writing. The NFS values are less than those of Ceph and Lizard.

The results were farther apart again for overwriting files. CephFS (6.3MBps) was noticeably faster than Ceph RADOS (5.6MBps). LizardFS (1.7MBps), on the other hand, was significantly slower, and Gluster came in fourth place (311KBps). The field converged again for reading (Figure 4). For byte-by-byte reading, Gluster (2MBps), which otherwise ran at the back of the field, was ahead of Ceph RADOS (1.5MBps), Lizard (1.5MBps), and the well-beaten CephFS (913KBps). NFS landed around the middle at 1.4MBps. Noticeably, CephFS's CPU usage was at 99 percent despite the low performance.

Figure 4: For byte-by-byte reading, GlusterFS moves to the top, and CephFS drops away significantly.

Bonnie++ also tests seek operations, which reveal how quickly the read heads move, as well as the creation and deletion of files.

The margin in the test was massive again for the seeks; Ceph RADOS was the clear winner with an average of 1,737 Input/Output Operations per Second (IOPS), followed by CephFS (1,035 IOPS) and then a big gap before Lizard finally dawdled in (169 IOPS), with Gluster (85 IOPS) behind. NFS ended up around the middle again with 739 IOPS.

The last round of testing involved creating, reading (referring to the stat system call, which reads a file's metadata, such as owner or creation time), and deleting files. In certain applications (e.g., a web cache), such performance data is more important than the raw reading or writing of a file.
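
The principle behind this phase can be illustrated with a few lines of shell; this is only a rough sketch of the idea, not what Bonnie++ executes internally, and the mount point and file count are arbitrary:

# Rough illustration of a metadata benchmark (not Bonnie++'s internal code):
# create, stat, and delete 10,000 empty files, timing each phase.
cd /mnt/sds/metatest || exit 1
time for i in $(seq 1 10000); do : > file-$i; done              # create
time for i in $(seq 1 10000); do stat file-$i > /dev/null; done # read metadata
time for i in $(seq 1 10000); do rm file-$i; done               # delete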

Ceph RADOS won the linear file creation test with a sizable advantage (Figure 5). This unusual outcome was presumably the result of Ceph RADOS running the file operations on the ext4 filesystem of the block device, which probably reported back to Bonnie++ much faster than the files were actually created. The RADOS device managed 7,500 IOPS on average. CephFS still ended up in second place with around 540 IOPS, followed by Lizard with 340 IOPS and Gluster close behind with 320 IOPS. NFS lost this competition with just 41 IOPS. Barely any measurable differences showed up for random creation, except for Ceph RADOS, which worked even faster.

Figure 5: File operation on the ext4 filesystem of the block device could be responsible for the large fluctuations in the Ceph RADOS test.
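
For reference, a RADOS block device with its own ext4 filesystem is typically set up along the following lines; the pool and image names, the size, and the mount point are placeholders, not the exact commands used in the lab:

# Assumed setup for the Ceph RADOS candidate: an RBD image
# formatted with ext4 and mounted on the client (names are examples).
rbd create --size 20480 rbd/bonnie-test   # 20GB image in the "rbd" pool
rbd map rbd/bonnie-test                   # appears as, e.g., /dev/rbd0
mkfs.ext4 /dev/rbd0
mount /dev/rbd0 /mnt/rados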

Reading data structures did not produce a uniform result. With Ceph RADOS, Bonnie++ refused to report a result for linear and random reading; the same happened with random reading on CephFS. POSIX compliance is probably not 100 percent for these filesystems.

Lizard won the linear read test with around 25,500 IOPS, followed by CephFS (16,400 IOPS) and Gluster (15,400 IOPS). For random reading, Lizard and Gluster values were just one order of magnitude lower (1,378 IOPS for Lizard and 1,285 IOPS for Gluster). NFS would have won the tests with around 25,900 IOPS (linear) and 5,200 IOPS (random).

Only deleting files remained. The clear winner here (and, as with creating, probably unrivaled) was Ceph RADOS, with around 12,900 linear and 11,500 random IOPS, although the other three candidates handled random deletion better than linear deletion. A winner could not be determined among them, but CephFS came in last: Lizard (669 IOPS linear, 1,378 IOPS random), Gluster (886 IOPS linear, 1,285 IOPS random), CephFS (567 IOPS linear, 621 IOPS random).

In the Zone

IOzone provided a large amount of test data. Figure 6 shows a sample. Unlike Bonnie++, the test tool is limited to reading and writing but does so in far more detail. It reads and writes files in different block and file sizes; writes are performed with write() and fwrite(), with forward and random writing as well as overwriting. IOzone also reads forward, backward, and randomly using operating system and library calls.
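
A full IOzone run covering these operations might be started roughly as follows; the maximum file size, the temporary file path, and the output file are placeholders:

# Example IOzone run: automatic mode over a range of record and file sizes,
# limited to 4GB files, writing results to an Excel-compatible file.
# Tests: 0 = write/rewrite, 1 = read/reread, 2 = random read/write, 5 = stride read.
iozone -a -g 4g -i 0 -i 1 -i 2 -i 5 -f /mnt/sds/iozone.tmp -Rb iozone-results.xls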

Figure 6: In IOzone's stride read test, LizardFS (yellow) takes the lead with smaller file sizes. CephFS is tan, and Ceph RADOS is green.

In the writing test, IOzone essentially confirmed the results from Bonnie++, but some readings put Ceph RADOS ahead, mostly in operations with small block sizes. For reading, Ceph RADOS was almost always at the front for small block sizes, and CephFS led the middle range. LizardFS did well in some tests with large file and block sizes.

One exception was the stride read test, in which IOzone reads a block, seeks ahead by a fixed stride, reads the next block, and so on. Lizard also won here with small file and block sizes (Figure 6, yellow).
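
This access pattern can also be selected explicitly; the stride factor, record size, and file size below are arbitrary examples:

# Stride read only (test 5); -j sets the stride as a multiple of the
# record size. The values here are examples, not the lab settings.
iozone -i 0 -i 5 -j 10 -r 64k -s 512m -f /mnt/sds/iozone.tmp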

Conclusions

GlusterFS configuration was easiest, followed by LizardFS. Setting up Ceph required a lot more work.

In terms of performance, CephFS (POSIX mounts) and Ceph RADOS (a block device provided with its own filesystem) cut a fine figure in most of the tests in the lab. Ceph RADOS showed some upward outliers but, in reality, only benefited from caching on the client.

If you shy away from the complexity of Ceph, Lizard costs only a little performance but gets you to your goal more quickly thanks to its ease of setup. GlusterFS remained in the shadows in most tests but performed better in byte-by-byte sequential reads.

LizardFS frequently had its nose in front in the IOzone read tests, but not the write tests. The larger the block and file sizes, the more often Lizard won the race – or at least finished just behind CephFS and ahead of Ceph RADOS.

NFS, running as the traditional alternative to SDS, did well but did not exactly run away from the distributed filesystems.

The Author

Konstantin Agouros works at Xantaro Deutschland GmbH as a solutions architect focusing on networking, cloud security, and automation. His book Software Defined Networking, SDN-Praxis mit Controllern und OpenFlow [Software-Defined Networking, SDN practice with controllers and OpenFlow] was published in the fall by De Gruyter.
