
Cloud-native storage with OpenEBS

Home in the Clouds

Article from ADMIN 62/2021
Software from the open source OpenEBS project provides a cloud-native storage environment that makes block devices available to individual nodes in the Kubernetes cluster.

In the beginning, running containers in a stateless mode seemed to be the best and simplest approach because it was possible to scale container-based applications by starting additional containers on additional hosts. If you had problems with a container, it could simply be terminated and restarted on another node – or at least that was the theory.

In practice, users unfortunately achieve less with stateless containers than they would like, which prompted the Kubernetes container platform in its early days to add new features in each release, such as StatefulSets and the storage interface for persistent volumes. Kubernetes also implemented the Container Storage Interface (CSI), which the Cloud Native Computing Foundation (CNCF) has established as a standard.

In addition to the persistent volume capabilities, numerous storage vendors in the Kubernetes world cater to container clusters. If Kubernetes is running on one of the major cloud platforms, you will typically find an interface to the corresponding storage service, such as GCE Persistent Disk (Google) or AWS Elastic Block Store (EBS; Amazon). Moreover, you have the option of using classic storage protocols, such as NFS, Internet SCSI (iSCSI), and Fibre Channel, or modern distributed filesystems such as Ceph or GlusterFS.

Cloud-Native Storage

Another group of storage solutions tries to get to the root of the problem with technologies developed from scratch that understand how to deal with the specifics of a container landscape. Examples of these cloud-native storage approaches include Rook [1], Portworx, StorageOS, and OpenEBS [2], which I take a closer look at in this article.

OpenEBS is an open source product from MayaData, which also provides corresponding offerings with support for enterprise customers. Company founder Evan Powell coined the term "container-attached storage" for the technology, the essential difference from so-called legacy systems being that all management components of the storage layer run as pods in the Kubernetes cluster itself. Of course, this is not a unique selling point, because other cloud-native offerings (e.g., Rook) work the same way; in the case of classic storage providers or cloud providers, however, the storage management runs outside the cluster.

One key difference between OpenEBS and Rook, for example, is how storage is handled behind the scenes. Whereas Rook essentially takes over the task of automating a Ceph cluster running in Kubernetes, OpenEBS has developed its own approach that implements various components for integration into Kubernetes and relies on iSCSI for storage provisioning. The provisioning is completely transparent to the cluster admin, so there is no need to run an iSCSI server, as would be the case with classic Kubernetes-iSCSI integration. You will see exactly how this works in a moment when I take you through installing OpenEBS on a Kubernetes cluster.

In principle, individual worker nodes in the cluster act as storage nodes by making their local storage space available to the cluster. They can be appropriately optimized, dedicated nodes or even normal worker nodes, provided they have the necessary storage. According to the OpenEBS developers, this flexibility even improves the reliability of the storage, because as you add more nodes for more workloads, storage availability also increases. At the same time, the effects of individual node failures are minimized, because OpenEBS-internal replication means that the metadata is automatically available on every node.

Installing OpenEBS

To use OpenEBS, you need a Kubernetes cluster (Figure 1) running at least version 1.13 (preferably 1.14 if you want to use newer features such as snapshots and volume cloning). Nodes running OpenEBS services must have an iSCSI client ("initiator" in iSCSI-speak) installed, which assumes that you have enough control over the nodes to allow the installation of packages.

Figure 1: A Kubernetes cluster with storage in the worker nodes.
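
On Debian- or Ubuntu-based worker nodes, for example, installing and enabling the initiator looks roughly like this (package and service names vary between distributions; on Red Hat-style systems, for instance, the package is iscsi-initiator-utils):

sudo apt install open-iscsi
sudo systemctl enable --now iscsid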

To install OpenEBS, you must be in the cluster admin context; in other words, you need maximum permissions for the Kubernetes cluster, in part because various elements of OpenEBS are implemented as custom resource definitions. To install the OpenEBS components in the cluster, use either the Kubernetes package manager Helm (version 2 or 3) or the YAML file of the OpenEBS operator. Instead of a default installation, you can also perform a customized install by downloading and editing the YAML file. For example, you can specify node selectors that determine which Kubernetes worker nodes run the OpenEBS control plane, the admission controller, and the Node Disk Manager. You can apply the default YAML file directly from the repository:

kubectl apply -f https://openebs.github.io/charts/openebs-operator.yaml
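
Alternatively, with Helm 3, an installation along these lines should work (the chart lives in the same repository as the operator file above; the release and namespace names are free choices here):

helm repo add openebs https://openebs.github.io/charts
helm repo update
helm install openebs openebs/openebs --namespace openebs --create-namespace

A call to helm show values openebs/openebs lists the available tunables, including the node selectors for the individual components.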

After that, a quick look at the list of running pods reveals whether everything is working. Here, you have to specify the openebs namespace, which is where the OpenEBS components are located (Listing 1). A call to

kubectl get storageclass

lists the available storage classes, including four new ones: openebs-device, openebs-local, openebs-jiva-default, and openebs-snapshot-promoter. Now, in principle, you can start using storage from OpenEBS for pods, even if the storage classes set up so far are not yet complete or optimal.

Listing 1

openebs Namespace

kubectl get pods -n openebs
NAME
maya-apiserver-7b4988fcf6-f4q9q
openebs-admission-server-54dd65b4c9-hkpdh
openebs-localpv-provisioner-68bb775959-g7z8q
openebs-ndm-ggndz
openebs-ndm-hdp5s
openebs-ndm-operator-6b678c6f7f-tp9t6
openebs-ndm-qh4zs
openebs-provisioner-67bfd5bff-p45mz
openebs-snapshot-operator-85d8d495c-g7b77

A simple example of a persistent volume claim (PVC) is shown in Listing 2; it uses

kubectl apply -f pvc.yaml

to create a Jiva volume that can then be used in a database's YAML deployment file.
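
To illustrate, a minimal deployment that mounts the claim from Listing 2 might look like the following sketch (the MariaDB image, the environment variable, and the mount path are placeholders for whatever database you actually deploy):

kind: Deployment
apiVersion: apps/v1
metadata:
  name: demo-db
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-db
  template:
    metadata:
      labels:
        app: demo-db
    spec:
      containers:
        - name: db
          image: mariadb:10.5
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "demo-only"
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: demo-vol1-claim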

Jiva is one of four available storage engines and, the two special cases Local Hostpath and Local Device aside, the more sophisticated storage type implemented by OpenEBS. Each participating node contributes a directory to the Jiva storage pool. A volume created with it is automatically replicated between the nodes. Without further intervention, the default storage pool for Jiva is already configured; it creates three replicas of the data, which requires three nodes in the cluster.

Managing Storage

For a quick and more detailed look at the configuration of the storage class, you can enter:

kubectl describe sc openebs-jiva-default
kubectl describe sp default

The first command points you to the associated "default" storage pool, and the second reveals more about it, such as the location (/var/openebs) on each storage node. If you want to have more control over your storage, you can add another disk to each node, install a filesystem, and mount it as a new Jiva storage pool. The associated storage class can then be used to define various parameters, such as the number of replicas.
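
A custom pool with its own storage class might look like the following sketch. The path, class name, and replica count are example values, and the StoragePool resource and cas.openebs.io/config annotation follow the scheme documented for OpenEBS releases current at the time of writing, so check the project documentation for your version:

apiVersion: openebs.io/v1alpha1
kind: StoragePool
metadata:
  name: jiva-pool
  type: hostdir
spec:
  path: "/mnt/openebs"
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-jiva-2repl
  annotations:
    openebs.io/cas-type: jiva
    cas.openebs.io/config: |
      - name: ReplicaCount
        value: "2"
      - name: StoragePool
        value: jiva-pool
provisioner: openebs.io/provisioner-iscsi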

Basically, data replication happens at the volume level, not the storage pool level. You can see this when you request a persistent volume with the PVC in Listing 2: Four pods then appear in the openebs namespace whose names contain the PVC's ID plus a repl name component and number. Each pod runs on a different node and takes care of its part of the replication (see the command after Listing 2).

Listing 2

PVC for Jiva Storage

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: demo-vol1-claim
spec:
  storageClassName: openebs-jiva-default
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4G
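
Because the generated pod names start with the volume's pvc- ID, a filtered listing is a quick way to watch these replication pods (a sketch; narrow the pattern to your volume's ID if several volumes exist):

kubectl get pods -n openebs -o wide | grep pvc-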

Two other storage options offered by OpenEBS are Local PV Hostpath and Local PV Device. Whereas the Hostpath engine makes a directory of a node available to the pods running on the same machine, Local PV Device accesses a dedicated block device available on a node. One advantage of both approaches over the local volume provisioner that Kubernetes ships out of the box is the ability to provision storage dynamically (i.e., to serve an application's request instead of provisioning a volume in advance in the admin context). Additionally, both storage options, like all other OpenEBS storage engines, support backup and restore with the Kubernetes backup tool Velero.
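
A dynamically provisioned hostpath volume then needs nothing more than a claim against the matching storage class, as in this sketch (the class name openebs-hostpath is an assumption based on current releases; use whatever name your operator created):

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: local-hostpath-claim
spec:
  storageClassName: openebs-hostpath
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2G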
