SDS configuration and performance
Put to the Test
Bonnie Voyage
The test uses Bonnie++ and IOzone (see the "Benchmarks" box). In the first run, Bonnie++ tests byte-by-byte and then block-by-block writes, for which it uses the putc() macro or the write(2) system call. The test overwrites the blocks by rewriting and measures the data throughput.
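A typical Bonnie++ invocation against one of the mounted filesystems might look like the sketch below; the mountpoint, test file size, and user are assumptions, not the exact parameters used in this test:
# -d test directory (assumed CephFS mountpoint), -s test file size
# (should exceed the client's RAM), -n number of files (x1024) for the
# create/stat/delete phase, -u unprivileged user, -m label for the report
bonnie++ -d /mnt/cephfs -s 8g -n 16 -u nobody -m client01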
Benchmarks
In this test, I used Bonnie++ 1.97 and IOzone 3.429. The first test is sequential block-by-block and byte-by-byte writing, reading, and overwriting. The software also creates files both sequentially and in random order and deletes them again. IOzone is limited to testing writing, reading, and overwriting, but with different block sizes.
The test begins by emptying the caches with
echo 3 > /proc/sys/vm/drop_caches
to test the performance of the filesystem rather than the client's cache memory. Each test was repeated twice for each filesystem, and I then averaged the results of the respective runs.
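This procedure can be wrapped in a small script like the sketch below; the mountpoint and the Bonnie++ parameters are assumptions, and only the two repetitions per filesystem mirror the setup described here:
#!/bin/bash
# Drop the page cache, then run Bonnie++ twice against the given mountpoint
# so the averaged results reflect the filesystem, not the client's cache.
MNT=${1:-/mnt/cephfs}    # assumed mountpoint; pass the real one as an argument
for run in 1 2; do
    sync
    echo 3 > /proc/sys/vm/drop_caches
    bonnie++ -q -d "$MNT" -s 8g -n 16 -u nobody -m "run$run" >> bonnie-results.csv
done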
Bonnie++ also recorded the CPU load during each test. With a fast network connection or local disks, byte-by-byte writing turned out to be more a test of the client's CPU than of the storage.
CephFS and Ceph RADOS wrote significantly more quickly at 391 and 386KBps than Gluster at 12KBps and Lizard at 19KBps (Figure 2). However, the client's CPU load for the two Ceph candidates was well over 90 percent, whereas Bonnie only measured 30 percent (Lizard) and 20 percent (Gluster).
Here, an NFS mount between the client and <Fileserver_2> managed 530KBps. The differences were less striking for block-by-block writing. Both Ceph variants and Lizard were very similar, whereas Gluster achieved about half of the performance (Figure 3): Ceph RADOS achieved 11.1MBps, CephFS 11.5MBps, Lizard 11.4MBps, and Gluster 5.7MBps. In comparison, the local NFS connection only managed 10.6MBps. Ceph and Lizard presumably achieved a higher throughput here thanks to distribution over multiple servers.
The results were farther apart again for overwriting files. CephFS (6.3MBps) was noticeably faster than Ceph RADOS (5.6MBps). LizardFS (1.7MBps), on the other hand, was significantly slower, and Gluster came in fourth place (311KBps). The field converged again for reading (Figure 4). For byte-by-byte reading, Gluster (2MBps), which had otherwise been running at the back of the field, was ahead of Ceph RADOS (1.5MBps), Lizard (1.5MBps), and the well-beaten CephFS (913KBps). NFS was around the middle at 1.4MBps. Noticeably, CephFS CPU usage was at 99 percent despite the low performance.
Bonnie++ also tests seek operations, which reveal the speed of the read heads, as well as the creation and deletion of files.
The margin in the test was massive again for the seeks; Ceph RADOS was the clear winner with an average of 1,737 Input/Output Operations per Second (IOPS), followed by CephFS (1,035 IOPS) and then a big gap before Lizard finally dawdled in (169 IOPS), with Gluster (85 IOPS) behind. NFS ended up around the middle again with 739 IOPS.
The last round of testing involved creating, reading (referring to the stat system call, which reads a file's metadata, such as owner or creation time), and deleting files. In certain applications (e.g., a web cache), such performance data is more important than the raw reading or writing of a file.
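To illustrate what this metadata read involves, the stat(1) command prints the same fields that the stat(2) system call returns; the file path below is just a placeholder:
# Print a file's metadata (owner, size, last status change) without
# touching its contents; this is the kind of lookup measured here.
stat --format='owner=%U size=%s changed=%z' /mnt/cephfs/testfile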
Ceph RADOS won the linear file creation test with a sizable advantage (Figure 5). This unusual outcome was presumably because Ceph RADOS runs file operations on the ext4 filesystem of the block device, which probably reported completion back to Bonnie++ much faster than the files were actually created. The RADOS device managed 7,500 IOPS on average. CephFS still ended up in second place with around 540 IOPS, followed by Lizard with 340 IOPS and Gluster close behind with 320 IOPS. NFS lost this competition with just 41 IOPS. Barely any measurable differences showed up with random creation, except for Ceph RADOS, which was even faster.
Reading data structures did not produce a uniform result. With Ceph RADOS, Bonnie++ refused to report a result for linear and random reading, as it also did for random reading with CephFS. POSIX compliance is probably not 100 percent for these filesystems.
Lizard won the linear read test with around 25,500 IOPS, followed by CephFS (16,400 IOPS) and Gluster (15,400 IOPS). For random reading, Lizard and Gluster values were just one order of magnitude lower (1,378 IOPS for Lizard and 1,285 IOPS for Gluster). NFS would have won the tests with around 25,900 IOPS (linear) and 5,200 IOPS (random).
Only deleting files remained. The clear winner here (and as with creating, probably unrivalled) was Ceph RADOS, with around 12,900 linear and 11,500 random IOPS, although the other three candidates handled random deletion better than linear deletion. A winner could not be determined between them, but CephFS was last: Lizard (669 IOPS linear, 1,378 IOPS random), Gluster (886 IOPS linear, 1,285 IOPS random), CephFS (567 IOPS linear, 621 IOPS random).
In the Zone
IOzone provided a large amount of test data. Figure 6 shows a sample. Unlike Bonnie++, the test tool is limited to reading and writing but does so in far more detail. It reads and writes files in different block and file sizes; writes are performed with write() and fwrite(), with forward and random writing as well as overwriting. IOzone also reads forward, backward, and randomly using operating system and library calls.
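An IOzone run that covers these operations could look like the sketch below; the mountpoint and the size limit are assumptions rather than the exact values used in this test:
# -a  full automatic mode: sweep block and file sizes across the
#     write/rewrite, read/reread, random, backward, and stride tests
# -g  upper limit for the file size
# -f  test file on the mounted SDS volume (assumed path)
# -R -b  write an Excel-compatible report to iozone.xls
iozone -a -g 4g -f /mnt/cephfs/iozone.tmp -R -b iozone.xls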
In the writing test, IOzone essentially confirmed the results from Bonnie++, but some readings put Ceph RADOS ahead, mostly in operations with small block sizes. For reading, Ceph RADOS was almost always at the front for small block sizes, and CephFS led the middle range. LizardFS did well in some tests with large file and block sizes.
One exception was the stride read test, in which IOzone linearly reads every nth block (for example, 64 bytes starting at block 1024, and so on). Here, Lizard also won with small file and block sizes (Figure 6, yellow).
Conclusions
GlusterFS configuration was easiest, followed by LizardFS. Setting up Ceph required a lot more work.
In terms of performance, CephFS (POSIX mounts) and Ceph RADOS (block device provided with its own filesystem) cut fine figures in most of the tests in the lab. Ceph RADOS showed some upward outliers but, in reality, only benefited from caching on the client.
If you shy away from the complexity of Ceph, Lizard might mean only small declines in performance, but it would allow you to achieve your goal more quickly thanks to the ease of setup. GlusterFS remained in the shadow in most tests, but performed better in the byte-by-byte sequential reads.
LizardFS frequently had its nose in front in the IOzone read test, but not in the write test. The larger the block and file sizes became, the more often Lizard won the race, or at least finished just behind CephFS but ahead of Ceph RADOS.
NFS, running as the traditional alternative to SDS, did well but did not exactly run away from the distributed filesystems.
Infos
- GlusterFS: https://www.gluster.org
- Ceph: http://ceph.com
- LizardFS: https://lizardfs.com
- CephFS: http://ceph.com/ceph-storage/file-system/
- Bonnie++: http://www.coker.com.au/bonnie++/
- IOzone: http://www.iozone.org
- EPEL repository: https://fedoraproject.org/wiki/EPEL
- Ceph documentation: http://docs.ceph.com/docs/master/start/
- ceph-deploy pre-flight checklist: http://docs.ceph.com/docs/jewel/rados/deployment/preflight-checklist/