Lead Image © Jakub Jirsak, 123RF.com

Lead Image © Jakub Jirsak, 123RF.com

SDS configuration and performance

Put to the Test

Article from ADMIN 37/2017
By
Software-defined storage promises centrally managed, heterogeneous storage with built-in redundancy. We examine how complicated it is to set up the necessary distributed filesystems. A benchmark shows which operations each system is best at tackling.

Frequently, technologies initially used in large data centers end up at some point in time in smaller companies' networks or even (as in the case of virtualization) on ordinary users' desktops. This process can be observed for software-defined storage (SDS), as well.

SDS basically converts hard drives from multiple servers into a large, redundant storage environment. The idea is that storage users have no need to worry about which specific hard drive their data is on. Equally, if individual components crash, users should be confident that the landscape has consistently saved the data so that it is always accessible.

This technology makes little sense in an environment with only one file server, but it is much more useful in large IT environments, where you can implement the scenario professionally with dedicated servers and combinations of SSDs and traditional hard drives. Usually, the components connect with each other and the clients over a 10Gb network.

However, SDS now also provides added value if several servers with idle disk space are waiting in small or medium-sized businesses. In this case, it can be interesting to combine this space using the distributed filesystems in a redundant array.

Candidate Lineup

Linux admins can immediately access several variants of such highly available, distributed filesystems. Well-known examples include GlusterFS [1] and Ceph [2], whereas LizardFS is relatively unknown [3]. In this article, I analyze the three systems and compare the read and write speeds in the test network in a benchmark.

Speed may be key for filesystems, but a number of other features are of interest, too. For example, depending on your use of the filesystem, sometimes sequential writing, sometimes creating new files quickly, or sometimes even random reading of different data is crucial. Benchmark results can provide information about a system's strengths and weaknesses.

The three candidates in the test follow different approaches. Ceph is a distributed object store that can also be used as a filesystem in the form of CephFS [4]. GlusterFS and LizardFS, on the other hand, are designed as filesystems; however, although just two nodes are enough to operate a Gluster setup, LizardFS needs an additional control node, for which it has a web interface (Figure 1) that informs you about the state of the cluster.

Figure 1: The web interface for LizardFS, a newcomer among distributed filesystems.

First, I take a look at the overhead required to install and configure the filesystems. Second, I analyze data throughput measured with read/write tests.

The Test Setup

The lab setup consisted of a client with Ubuntu Linux 16.04 LTS connected to two storage servers with plenty of hard disks (Table 1). The operating system running on the file servers was CentOS 7, with Ubuntu 16.04 LTS on a fourth admin computer that served CephFS and LizardFS as a monitor or master server.

Table 1

Test Hardware

Server CPU RAM Storage
File server 1 Intel Xeon X5667 with 3GHz and 16 cores 16GB Disk array T6100S with 10 Hitachi drives (7200rpm) configured as RAID 1
File server 2 Intel Core i3-530 with 2.9GHz and four cores 16GB iSCSI array Thecus with two 320GB hard drives configured as RAID 1
Admin server Intel Core 2 E6700 with 2.66GHz and two cores 2GB Not specified
Client Intel Core 2 E6320 with 1.86GHz and two cores 2GB Not specified

File server 1 had a storage array connected via SCSI, whereas file server 2 was connected to another array via iSCSI. Communication between the servers was with Gigabit Ethernet, while the client and the admin server each only had a 100Mbps interface. This setup did not meet high-performance requirements, although it is not uncommon in smaller companies. The setup also allowed the candidates to test on equal terms. Version 1.97 of Bonnie++ [5] and IOzone 3.429 [6] were the test tools. To begin, I launched each tool on the client and then released them on the respective test candidate's mounted filesystem.

GlusterFS

GlusterFS was the easiest to install among the participants. I only needed to install the software on the Ubuntu client and the two storage servers. For the installation on CentOS, I added the EPEL repository [7] and called yum update.

Version 3.8.5 of the software then installed with the centos-release-gluster and glusterfs-server packages. To start the service, I enabled glusterd via systemctl. Once the service was started on both storage servers, I checked for signs of life using:

gluster peer probe <IP/Hostname_of_Peer>
peer probe: success.

The response in the second line indicates that everything is going according to plan. An error message would probably have indicated a lack of name resolution.

The tester generates the filesystem on the existing filesystems of the storage servers to activate it in the second step:

gluster volume create lmtest replica 2 transport tcp <Fileserver_1>:/<Mountpoint> <Fileserver_2>:/<Mountpoint>
gluster volume start lmtest

To use the created volume, I still need the glusterfs-client package, which provides the kernel drivers and tools needed to integrate a volume and is then executed with the command:

mount.glusterfs <Fileserver_1>:/lmtest /mnt/glusterfs

The filesystem is then available on the client for performance tests.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy ADMIN Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • Software-defined storage with LizardFS
    Standard hardware plus LizardFS equals a resilient, flexible, and configurable POSIX-compliant storage pool.
  • Getting Ready for the New Ceph Object Store

    The Ceph object store remains a project in transition: The developers announced a new GUI, a new storage back end, and CephFS stability in the just released Ceph v10.2.x, Jewel.

  • Ceph object store innovations
    The Ceph object store remains a project in transition: The developers announced a new GUI, a new storage back end, and CephFS stability in the just released Ceph c10.2.x, Jewel.
  • CephX Encryption

    We look at the new features in Ceph version 0.56, alias “Bobtail,” talk about who would benefit from CephX Ceph encryption, and show you how a Ceph Cluster can be used as a replacement for classic block storage in virtual environments.

  • What's new in Ceph
    Ceph and its core component RADOS have recently undergone a number of technical and organizational changes. We take a closer look at the benefits that the move to containers, the new setup, and other feature improvements offer.
comments powered by Disqus