Cloud-native storage with OpenEBS
Home in the Clouds
In the beginning, running containers in a stateless mode seemed to be the best and simplest approach because it was possible to scale container-based applications by starting additional containers on additional hosts. If you had problems with a container, it could simply be terminated and restarted on another node – or at least that was the theory.
In practice, users unfortunately achieve less with stateless containers than they would like, which prompted the Kubernetes container platform in its early days to add new features with each release, such as StatefulSets and the storage interface for persistent volumes. Kubernetes also implemented the Container Storage Interface (CSI), which the Cloud Native Computing Foundation (CNCF) established as a standard.
In addition to the persistent volume capabilities, numerous storage vendors in the Kubernetes world cater to container clusters. If Kubernetes is running on one of the major cloud platforms, you can typically find an interface to the corresponding storage service, such as GCE Persistent Disk (Google) or AWS Elastic Block Store (EBS; Amazon). Moreover, you have the option to use classic storage protocols, such as NFS, Internet SCSI (iSCSI), and Fibre Channel, or modern distributed filesystems such as Ceph or GlusterFS.
Cloud-Native Storage
Another group of storage solutions tries to get to the root of the problem with technologies developed from scratch that understand how to deal with the specifics of a container landscape. Examples of these cloud-native storage approaches include Rook [1], Portworx, StorageOS, and OpenEBS [2], which I take a closer look at in this article.
OpenEBS is an open source product from MayaData, which also provides corresponding offerings with support for enterprise customers. Company founder Evan Powell coined the term "container-attached storage" for the technology, the essential difference from so-called legacy systems being that all management components of the storage layer run as pods in the Kubernetes cluster itself. Of course, this is not a unique selling point, because other cloud-native offerings (e.g., Rook) also work in this way; in the case of classic storage providers or cloud providers, however, the storage management runs outside the cluster.
One key difference between OpenEBS and Rook, for example, is how storage is handled behind the scenes. Whereas Rook essentially takes over the task of automating a Ceph cluster running in Kubernetes, OpenEBS has developed its own approach that implements various components for integration into Kubernetes and relies on iSCSI for storage provisioning. The provisioning is completely transparent to the cluster admin, so there is no need to run an iSCSI server, as would be the case with classic Kubernetes-iSCSI integration. You will see exactly how this works in a moment when I take you through installing OpenEBS on a Kubernetes cluster.
In principle, individual worker nodes in the cluster act as storage nodes by making their local storage space available to the cluster. They can be appropriately optimized, dedicated nodes or even normal worker nodes, provided they have the necessary storage. According to the OpenEBS developers, this flexibility even improves the reliability of the storage, because as you add more nodes for more workloads, storage availability also increases. At the same time, according to the manufacturer, the effects of the failure of individual nodes can be minimized, because OpenEBS's internal replication means that the metadata is automatically available on every node.
Installing OpenEBS
To use OpenEBS, you need a Kubernetes cluster (Figure 1) of at least version 1.13 – preferably 1.14 if you want to use some new features like snapshots and volume cloning. Nodes running OpenEBS services must have an iSCSI client ("initiator" in iSCSI-speak) installed, which assumes that you have enough control over the nodes to allow the installation of packages.
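On Debian or Ubuntu nodes, for example, installing and enabling the initiator could look like the following sketch; package and service names differ on other distributions:

sudo apt-get update
sudo apt-get install open-iscsi     # the iSCSI initiator package
sudo systemctl enable --now iscsid  # start the initiator daemon at boot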
To install OpenEBS, you must be in the cluster admin context; in other words, you need maximum permissions for the Kubernetes cluster, in part because various elements of OpenEBS are implemented as custom resource definitions. To install the OpenEBS components in the cluster, use either the Kubernetes package manager Helm (version 2 or 3) or the YAML file of the OpenEBS operator. Instead of a default installation, you can also perform a customized install by downloading and editing the YAML file. In this way you can specify, for example, node selectors that determine which Kubernetes worker nodes run the OpenEBS control plane, the admission controller, and the Node Disk Manager. You can apply the default YAML file directly from the repository:
kubectl apply -f https://openebs.github.io/charts/openebs-operator.yaml
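Alternatively, with Helm 3, an installation along these lines should work. This is only a sketch: The chart repository matches the host of the operator YAML above, but chart names and default values can change between releases:

helm repo add openebs https://openebs.github.io/charts
helm repo update
helm install openebs openebs/openebs --namespace openebs --create-namespace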
After that, a quick look at the list of running pods reveals whether everything is working. Here, you have to specify the openebs namespace, which is where the OpenEBS components are located (Listing 1). A call to
kubectl get storageclass
lists the available storage classes, including four new ones: openebs-device, openebs-local, openebs-jiva-default, and openebs-snapshot-promoter. Now, in principle, you can start using storage from OpenEBS for pods, even if the storage classes set up so far are not yet complete or optimal.
Listing 1
openebs Namespace
kubectl get pods -n openebs
NAME
maya-apiserver-7b4988fcf6-f4q9q
openebs-admission-server-54dd65b4c9-hkpdh
openebs-localpv-provisioner-68bb775959-g7z8q
openebs-ndm-ggndz
openebs-ndm-hdp5s
openebs-ndm-operator-6b678c6f7f-tp9t6
openebs-ndm-qh4zs
openebs-provisioner-67bfd5bff-p45mz
openebs-snapshot-operator-85d8d495c-g7b77
A simple example of a persistent volume claim (PVC) is shown in Listing 2; it uses
kubectl apply -f pvc.yaml
to create a Jiva volume that can then be used in a database's YAML deployment file.
Jiva is one of four available storage engines and, leaving aside the two special cases Local Hostpath and Local Device, the more sophisticated storage type implemented by OpenEBS. Each participating node contributes a directory to the Jiva storage pool. A volume created with it is automatically replicated between the nodes. Without further intervention, the default storage pool for Jiva is already configured; it creates three replicas of the data, which requires three nodes in the cluster.
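To consume the claim from Listing 2, you reference it in the pod template of your deployment. The following minimal sketch mounts the volume into a database container; the image and mount path are arbitrary examples, not prescribed by OpenEBS:

kind: Pod
apiVersion: v1
metadata:
  name: demo-db
spec:
  containers:
  - name: db
    image: mariadb:10.5            # arbitrary example image
    volumeMounts:
    - mountPath: /var/lib/mysql    # mount the Jiva volume into the container
      name: data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: demo-vol1-claim   # the PVC from Listing 2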
Managing Storage
For a quick and more detailed look at the configuration of the storage class, you can enter:
kubectl describe sc openebs-jiva-default
kubectl describe sp default
The first command points you to the associated "default" storage pool, and the second reveals more about it, such as the location (/var/openebs) on each storage node. If you want to have more control over your storage, you can add another disk to each node, install a filesystem, and mount it as a new Jiva storage pool. The associated storage class can then be used to define various parameters, such as the number of replicas.
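A sketch of such a setup follows; the pool path, the names, and the replica count are made-up examples, and the exact resource kinds and annotations depend on your OpenEBS version:

apiVersion: openebs.io/v1alpha1
kind: StoragePool
metadata:
  name: jiva-pool                # hypothetical pool name
  type: hostdir
spec:
  path: "/mnt/jiva-pool"         # mountpoint of the extra disk on each node
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-jiva-2repl       # hypothetical class name
  annotations:
    openebs.io/cas-type: jiva
    cas.openebs.io/config: |
      - name: ReplicaCount
        value: "2"
      - name: StoragePool
        value: jiva-pool
provisioner: openebs.io/provisioner-iscsi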
Basically, data replication happens at the volume level, not the storage pool level. You can see this when you request a persistent volume with the PVC in Listing 2: Four pods then appear in the OpenEBS namespace whose names contain a repl component and a number in addition to the PVC's ID. Each pod runs on a different node and takes care of its part of the replication.
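You can verify this distribution by listing the pods along with their node assignment; the grep pattern here is just an illustrative placeholder that matches the generated volume pods:

kubectl get pods -n openebs -o wide | grep pvc-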
Listing 2
PVC for Jiva Storage
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: demo-vol1-claim
spec:
  storageClassName: openebs-jiva-default
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4G
Two other storage options offered by OpenEBS are Local PV Hostpath and Local PV Device. Whereas the Hostpath engine can make a directory of a node available to the pods running on the same machine, Local PV Device accesses a dedicated block device that is available on a node. One advantage of both approaches over the local volume provisioner, which Kubernetes comes with out of the box, is the ability to provision storage dynamically (i.e., to serve an application's request instead of provisioning a volume in advance in the admin context). Additionally, both storage options, like all other OpenEBS storage engines, support backup and restore with the Kubernetes application Velero.
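For example, a dynamic claim against the openebs-device class created during installation might look like the following sketch; the claim name and size are arbitrary:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: local-device-claim       # hypothetical claim name
spec:
  storageClassName: openebs-device
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10G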