Comparing Ceph and GlusterFS
Shared storage systems GlusterFS and Ceph compared
GlusterFS Interfaces
Access to shared storage via a POSIX-compatible interface is not without controversy. Strict conformity with the interface defined by the IEEE would have a significant effect on performance and scalability. In the case of GlusterFS, FUSE can be added as a further point of criticism. Since version 3.4, it's finally possible to access the data on GlusterFS directly via a library. Figure 5 shows the advantages that derive from the use of libgfapi [11].
The developers of Samba [12] and OpenStack [13] have already welcomed this access and incorporated it into their products. This step has not yet fully arrived in the open source cloud, however; there, users will still find a few GlusterFS mounts. In the background, libgfapi uses the same GlusterFS building blocks, such as the previously mentioned "translators." Anyone who works with QEMU [14] can also dispense with the annoying mounts and store data directly on GlusterFS. For this, the emulator and virtualizer must be at least version 1.3:
# ldd /usr/bin/qemu-system-x86_64 | grep libgfapi
libgfapi.so.0 => /lib64/libgfapi.so.0 (0x00007f79dff78000)
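If QEMU was built with the GlusterFS block driver, the detour through a mount point can be skipped entirely, because images are addressed with a gluster:// URL. The following is only a sketch; the server gluster1, the volume gv0, and the image name are placeholders:

# qemu-img create -f qcow2 gluster://gluster1/gv0/vm1.qcow2 10G
# qemu-system-x86_64 -m 1024 \
  -drive file=gluster://gluster1/gv0/vm1.qcow2,if=virtio,format=qcow2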
Python bindings also exist for the library [15]. As with Glupy before, GlusterFS thus opens its doors to the world of this scripting language.
A special feature of GlusterFS is its modular design, and the previously mentioned translators are the basic structural unit. Almost every characteristic of a volume is represented by one of these translators. Depending on the configuration, GlusterFS links the corresponding translators to form a chain (graph), which then determines the overall functionality of the volume. The set of these property blocks is broken down into several groups; Table 1 lists them, together with a short description.
Table 1: Group Membership of the Translators
Group | Function |
---|---|
Storage | Determines the behavior of the data storage on the back-end filesystem |
Debug | Interface for error analysis and other debugging |
Cluster | Basic structure of the storage solution, such as replication or distribution of data |
Encryption | Encryption and decryption of stored data (not yet implemented) |
Protocol | Communication and authentication for client-server and server-server |
Performance | Tuning parameters |
Bindings | Extensions to other languages, such as Python |
Features | Additional features such as locks or quotas |
Scheduler | Distribution of new write operations in the GlusterFS cluster |
System | Interface to the system, especially to filesystem access control |
The posix storage translator, for example, controls whether or not GlusterFS uses the VFS cache of the Linux kernel (O_DIRECT). Here, admins can also set whether files should be deleted in the background. On the encryption side, unfortunately, there's not much to see: A sample implementation using ROT13 [16] can be found in the GlusterFS source code; a version that is useful in practice is planned for the next release. Another special refinement is the Bindings group. These are not actually real translators but rather a framework for writing translators in other languages. The most famous example is Glupy [17], which provides the interface to Python.
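What such a translator chain looks like in practice can be seen in the volume files that glusterd generates for each brick. The following excerpt is only a sketch for a hypothetical volume gv0 with a brick in /data/brick1 on host gluster1; the file path and option names vary between GlusterFS versions:

# cat /var/lib/glusterd/vols/gv0/gv0.gluster1.data-brick1.vol
volume gv0-posix
    type storage/posix
    option directory /data/brick1
    # remove deleted files asynchronously in the background
    option background-unlink on
end-volume

volume gv0-locks
    type features/locks
    subvolumes gv0-posix
end-volume
[...]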
Extensions and Interfaces in Ceph
Unlike GlusterFS, the topic of "extensions via plug-ins" is dealt with quickly for Ceph: Ceph currently offers no option for extending its functionality at run time through modules. Apart from the extensive configuration options available out of the box, Ceph restricts itself to "programming in" external features via the previously mentioned APIs, such as librbd and librados.
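A quick impression of what librados offers can be gained with the rados command-line tool, which uses the library to read and write binary objects directly in a pool. The pool name data in the following example is just a placeholder:

# echo "hello ceph" > /tmp/hello.txt
# rados -p data put hello-object /tmp/hello.txt
# rados -p data ls
hello-object
# rados -p data get hello-object /tmp/hello-copy.txt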
With respect to offering a RESTful API, Ceph has a clear advantage over GlusterFS. Because Ceph has already mastered the art of storing binary objects, only a corresponding front end with a RESTful interface is missing. In Ceph, the RADOS gateway fills this role, providing two APIs for a Ceph cluster: On the one hand, the RADOS gateway speaks the protocol of OpenStack's Swift object store [18]; on the other, it supports the protocol of Amazon's S3 storage, so S3-compatible clients can also be used with the RADOS gateway.
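Setting this up typically boils down to creating a gateway user with radosgw-admin and then pointing a standard Swift or S3 client at the gateway. The host rgw.example.com and the user jdoe are assumptions, and the exact flags differ slightly between Ceph versions; S3 clients use the access and secret keys printed by the first command:

# radosgw-admin user create --uid=jdoe --display-name="John Doe"
# radosgw-admin subuser create --uid=jdoe --subuser=jdoe:swift --access=full
# radosgw-admin key create --subuser=jdoe:swift --key-type=swift
# swift -A http://rgw.example.com/auth -U jdoe:swift -K <swift_secret_key> list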
In the background, the RADOS gateway relies on librados. Moreover, the Ceph developers explicitly point out that the gateway does not support every single feature of Amazon's S3 or OpenStack's Swift. However, the basic features work, as expected, across the board, and the developers are currently focusing on implementing some important additional features.
Recently, seamless integration with OpenStack's Keystone identity service has also found its way into the RADOS gateway source code – for example, for operations with both Swift and Amazon S3 (meaning the protocol in both cases). This is particularly handy if OpenStack is used in combination with Ceph (details on the cooperation between Gluster or Ceph and OpenStack can be found in the box "Storage for OpenStack").
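In ceph.conf, this coupling comes down to a handful of settings in the gateway's section. The section name, URL, and token below are placeholders, and the set of options has shifted between Ceph releases:

[client.radosgw.gateway]
  rgw keystone url = http://keystone.example.com:35357
  rgw keystone admin token = <admin_token>
  rgw keystone accepted roles = Member, admin
  rgw s3 auth use keystone = true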
Storage for OpenStack
As mentioned previously, storage solutions like GlusterFS and Ceph show their true attraction in combination with a cloud environment. The most active representative of this species is currently OpenStack, and both Ceph and GlusterFS can be easily operated alongside OpenStack.
Cinder, the component in OpenStack that supplies VMs with persistent block storage, ships with a GlusterFS driver, which makes it quite easy to operate GlusterFS as a Cinder back end (see the sketch below). Gluster can also be used as a storage back end for the Glance image service via UFO, so seamless integration is possible here, too. Recently, Red Hat has itself invested a fair amount of development work in integrating native GlusterFS support into QEMU, so Gluster is not a problem as a direct back end for VM storage – even if the VM needs to use the volume directly as a hard disk for its root system.
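A sketch of that Cinder setup, assuming the GlusterFS driver included with the OpenStack release in use and a volume gv0 on host gluster1, could look like this:

# /etc/cinder/cinder.conf (excerpt)
volume_driver = cinder.volume.drivers.glusterfs.GlusterfsDriver
glusterfs_shares_config = /etc/cinder/glusterfs_shares

# /etc/cinder/glusterfs_shares
gluster1:/gv0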
Ceph support is equally convenient: Glance has a native back end for Ceph out of the box, and the same applies to Cinder. As previously mentioned, RBD support is already included in libvirt via librbd, so there should be no fear of problems occurring. All told, this is a comprehensive victory – for both Ceph and Gluster. In terms of OpenStack support, the two projects are neck and neck.
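The Ceph side is configured in a similarly compact way. The pool names images and volumes and the CephX user names below are assumptions, and depending on the release, Cinder additionally needs a libvirt secret UUID for authentication:

# /etc/glance/glance-api.conf (excerpt)
default_store = rbd
rbd_store_pool = images
rbd_store_user = glance

# /etc/cinder/cinder.conf (excerpt)
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_user = cinder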
Object Orientation in GlusterFS
Up to and including version 3.2, GlusterFS was a purely file-based storage solution; an object-oriented approach was not available. The popularity of Amazon S3 and other object store solutions "forced" the developers to act: UFO – Unified File and Object (Store) – came with version 3.3. It is based very heavily on OpenStack's well-known Swift API. Integration into the open source cloud solution is easily the most-used application, and it does not look as if this will change in the foreseeable future.
Version 3.4 replaced UFO with G4O (GlusterFS for OpenStack) and brought an improved RESTful API. In this case, improvement refers mainly to compatibility with Swift, which is a pity; a less specialized interface could also attract other software projects. Behind the scenes, GlusterFS still operates on the basis of files. Technically, the back end for UFO/G4O consists of additional directories and hard links (Listing 2).
Listing 2: Behind the Scenes of Gluster Objects
# ls -ali file.txt .glusterfs/0d/19/0d19fa3e-5413-4f6e-abfa-1f344b687ba7
132 -rw-r--r-- 2 root root 6  3. Feb 18:36 file.txt
132 -rw-r--r-- 2 root root 6  3. Feb 18:36 .glusterfs/0d/19/0d19fa3e-5413-4f6e-abfa-1f344b687ba7
#
# ls -alid dir1 .glusterfs/fe/9d/fe9d750b-c0e3-42ba-b2cb-22ff8de3edf0 \
  .glusterfs/00/00/00000000-0000-0000-0000-000000000001/dir1/
4235394 drwxr-xr-x 2 root root  6  3. Feb 18:37 dir1
4235394 drwxr-xr-x 2 root root  6  3. Feb 18:37 .glusterfs/00/00/00000000-0000-0000-0000-000000000001/dir1/
    135 lrwxrwxrwx 1 root root 53  3. Feb 18:37 .glusterfs/fe/9d/fe9d750b-c0e3-42ba-b2cb-22ff8de3edf0 -> ../../00/00/00000000-0000-0000-0000-000000000001/dir1
GlusterFS appears here as a hybrid and assigns an object to each file, and vice versa. The software manages the objects in the .glusterfs directory. The storage software generates a fairly flat directory structure from the GFID (GlusterFS file ID) and creates hard links between object and file. The objects, containers, and accounts known from the world of Swift correspond to files, directories, and volumes on the GlusterFS side.
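The GFID that ties file and object together is stored as an extended attribute on the brick, so the link can be checked directly there; the brick path /data/brick1 in this quick sketch is an assumption:

# getfattr -n trusted.gfid -e hex /data/brick1/file.txt
# file: data/brick1/file.txt
trusted.gfid=0x0d19fa3e54134f6eabfa1f344b687ba7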