Kubernetes StatefulSet
Hide and Seek
Many enterprises are looking to migrate legacy monolithic applications to scale-out architectures with containers. Unfortunately, this step turns out to be far more complicated than many IT departments would like. A cloud-native architecture requires that various components of the applications communicate with each other on a message bus to allow for distribution and scaling to different nodes. The big challenge for the developer is: Where does the application back up its data files?
Up to now, SQL databases have mainly handled this task. However, very few SQL implementations can run as clusters in a scale-out architecture. Application developers now have a choice: Either run the existing database technology in a container or switch to a scale-out database. However, this choice is only available to large companies with their own software developers. Small and medium-sized enterprises (SMEs), on the other hand, tend to work with off-the-shelf tools and are forced to use one of the databases that their application supports.
Despite the steadily increasing number of NoSQL databases for scale-out architectures, many web applications still only support one or more classic SQL databases. The most popular candidates include MariaDB (MySQL), PostgreSQL, and Microsoft SQL. In this article, I discuss how to run these SQL classics reliably in container environments with Docker, Podman, or Kubernetes and present a few interesting NoSQL approaches.
Trusted Sources
In a genuine scale-out application, the failure of a single container should not affect the application as such. Therefore, early implementations had no function to provide persistent (i.e., non-volatile) mass storage for a single container. Unfortunately, the SQL classics run the database server in a single container: If the container fails, the database is lost, and it doesn't help that the container platform can restart the container in a fraction of a second.
The platform therefore needs to provide the container a reliable mass storage device on which it can store its data and that will survive the demise of a container. This storage is then usually mounted as an overlay on the container's filesystem at a specified location (e.g., /var/lib/mysql), and the container template for the respective database needs to provide the appropriate logic to correctly evaluate the content of the persistent storage.
If a container starts with empty persistent storage (and only then), the container template's housekeeping system has to create the appropriate directory structure. If the container starts up with populated persistent storage, the application reads the data and configuration from the overlay. In the process, the logic also needs to start update or repair processes in case of doubt. If a database container crashes and corrupts the database files in the process, the container needs to check for consistency on restarting.
Alternatively, you could stop a container with database version x and start a container with version x+1. This container then checks the version of the dataset and carries out an update without corrupting any data.
You have to be very careful where you pick up your container templates and just as careful if you build them yourself. Most database server vendors offer an official container template through one of the popular Docker or Kubernetes registries and on GitHub. The documentation specifies exactly what is included in the template and how it handles update and recovery scenarios. If you want to build your own template, check the GitHub repositories of the official builds to see what their entrypoint.sh scripts do before starting the database.
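The gist of such an entrypoint's housekeeping can be sketched in a few lines of shell. The data directory and the version marker used here are purely illustrative and not taken from any official image:

```shell
#!/bin/sh
# Minimal sketch of a database entrypoint's housekeeping logic.
# DATADIR and the version file are illustrative placeholders.
DATADIR="${DATADIR:-/tmp/demo-datadir}"
NEW_VERSION="10.8"

mkdir -p "$DATADIR"

if [ -z "$(ls -A "$DATADIR")" ]; then
    # Empty persistent storage: create the initial structure.
    echo "$NEW_VERSION" > "$DATADIR/version"
    echo "initialized"
else
    # Populated storage: compare versions, upgrade if needed.
    OLD_VERSION="$(cat "$DATADIR/version")"
    if [ "$OLD_VERSION" != "$NEW_VERSION" ]; then
        echo "upgrade from $OLD_VERSION to $NEW_VERSION"
        echo "$NEW_VERSION" > "$DATADIR/version"
    else
        echo "reusing existing data"
    fi
fi
# A real entrypoint would now exec the database server process.
```

Official images implement the same decision tree, only with the database's own tools (e.g., creating the system tables on first start or triggering the server's upgrade routine).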
Small Steps to the Container
If you run containers on a single node with Docker or Podman, you will not usually assign separate IP addresses – at least not addresses that would be visible to other machines on the network. Port forwarding routes one or more IP ports of the container to ports of the host system. A MariaDB container can forward its internal port 3306 directly to host port 3306 or to any other host port.
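A minimal sketch of this port-forwarding variant with Podman – the host port, host path, and root password here are arbitrary examples:

```shell
# Forward container port 3306 to host port 3306 and keep the
# database files on the host so they survive the container.
podman run -d --name maria \
  -p 3306:3306 \
  -e MYSQL_ROOT_PASSWORD=mysqlroot \
  --volume /var/pods/maria:/var/lib/mysql:Z \
  docker.io/mariadb:latest
```

Clients on the host (or on the network, if the host port is reachable) then talk to port 3306 on the host as if the server ran locally.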
Alternatively, you could provide a bridge adapter to the container environment. The containers can then be managed directly with individual IP addresses like virtual machines (VMs). In such an environment, a "dedicated" MariaDB container could be run as a replacement for a MariaDB VM:
podman run --name maria \
  --volume /var/pods/maria:/var/lib/mysql:Z \
  --net pub_net --ip 192.168.1.10 \
  --mac-address 12:34:56:78:9a:bc \
  docker.io/mariadb:latest
The host system /var/pods/maria directory contains the database files of the container. If it crashes, the data is retained. The whole setup works without port forwarding. The pub_net network defined by the administrator runs on a network bridge. All the systems on the local network can access the MariaDB container by way of its IP address.
The code points to the mariadb:latest image. Each restart of the container can trigger an update of the database – major versions, too. For production environments, you will therefore always want to specify the MariaDB version number and only update after appropriate testing. Image tags let you pin a major release (e.g., mariadb:10) or an exact version (e.g., mariadb:10.8.3). However, a setup like this with a "generic" database server that many clients on the local area network address directly is not used very often. In far more cases, a database server supports only a single application.
The second example shows how to run a MariaDB server on Kubernetes (see the box "Kubernetes Test with Microshift"). The example uses a StatefulSet to ensure that an application with a state always has the appropriate persistent storage available (Figures 1 and 2). To begin, create the mariadb-state.yml file (Listing 1), which starts with the service definition.
Listing 1
mariadb-state.yml

01 ---
02 apiVersion: v1
03 kind: Service
04 metadata:
05   name: mariadb
06   labels:
07     app: mariadb
08 spec:
09   type: NodePort
10   ports:
11   - port: 3306
12     protocol: TCP
13   selector:
14     app: mariadb
15 ---
16 apiVersion: v1
17 kind: ConfigMap
18 metadata:
19   name: mariadb
20   labels:
21     app: mariadb
22 data:
23   MYSQL_ROOT_PASSWORD: mysqlroot
24   MYSQL_DATABASE: db1
25   MYSQL_USER: mysqluser
26   MYSQL_PASSWORD: mysqlpwd
27 ---
28 apiVersion: apps/v1
29 kind: StatefulSet
30 metadata:
31   name: mariadb
32 spec:
33   serviceName: "mariadb"
34   replicas: 1
35   selector:
36     matchLabels:
37       app: mariadb
38   [...]
39   volumeClaimTemplates:
40   - metadata:
41       name: mariadb
42     spec:
43       accessModes: [ "ReadWriteOnce" ]
44       resources:
45         requests:
46           storage: 10Gi
Kubernetes Test with Microshift
Microshift, a lightweight variety of OpenShift-Kubernetes that targets the niche between lean Linux edge devices and OpenShift-Kubernetes edge clusters, is useful as a Kubernetes environment for testing and development. To proceed, simply set up a Red Hat Enterprise Linux 8 (RHEL 8), CentOS Stream 8, or Fedora 35 VM with a minimal setup of two to four virtual CPUs (vCPUs) and 4 to 8GB of RAM and turn off the firewall. Then, you just need to type a few lines:
dnf module enable -y cri-o:1.21
dnf install -y cri-o cri-tools
systemctl enable crio --now
dnf copr enable -y @redhat-et/microshift
dnf install -y microshift
systemctl enable microshift --now
While the service starts and fetches the required container images from the Internet, you can download the oc and kubectl clients:
curl -O https://mirror.openshift.com/pub/openshift-v4/$(uname -m)/clients/ocp/stable/openshift-client-linux.tar.gz
tar -xf openshift-client-linux.tar.gz -C /usr/local/bin oc kubectl
To check whether the back-end services are running, the command crictl ps should return a list of containers. If this is the case, copy the configuration file to your home directory:
mkdir ~/.kube
cat /var/lib/microshift/resources/kubeadmin/kubeconfig > ~/.kube/config
If you want to manage your Kubernetes server from another system (e.g., a Windows workstation), first copy the specified kubeconfig file to the client system, load the file into an editor, and look for the line:
server: https://127.0.0.1:6443
Replace 127.0.0.1 with the external IP address of your VM. The line occurs several times in the kubeconfig file. Next, install the kubectl and oc client tools on your workstation. You can then control your Microshift VM from your workstation.
Lines 1-14 declare the database port as a service. This definition can then be used as an easy way to connect other applications to the database server. Because this example is a single-node cluster and you want to address the MariaDB container directly, you need to create the service as a NodePort. Kubernetes then generates an automatic port mapping (Figure 3).
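To see which port Kubernetes assigned, query the service – a sketch only; the node IP, the assigned port, and the credentials below are placeholders from this example:

```shell
# The PORT(S) column shows the mapping, e.g., 3306:31234/TCP
kubectl -n mysql get service mariadb

# Reach the database from the network via any node IP and the
# assigned NodePort (31234 stands in for your cluster's value)
mysql -h 192.168.1.5 -P 31234 -u mysqluser -pmysqlpwd db1
```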
The file continues with a ConfigMap (lines 16-26). The values specified as data correspond to the environment variable declarations (-e) that you pass in at the Docker or Podman command line. However, the ConfigMap data is stored in plain text in the Kubernetes configuration. In a production environment, you would define the passwords separately as a Secret for encryption and security.
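A minimal sketch of such a Secret, reusing the names from the ConfigMap example (note that Kubernetes stores stringData values only Base64 encoded, so access to Secret objects still needs to be restricted):

```yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: mariadb
type: Opaque
stringData:
  MYSQL_ROOT_PASSWORD: mysqlroot
  MYSQL_PASSWORD: mysqlpwd
```

The pod template would then pull in these values with a secretRef entry under envFrom, next to or instead of the configMapRef.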
Now it's time for the StatefulSet itself (lines 28-37). This section declares the Kubernetes pod with its name and the number of replicas. Because MariaDB in this configuration does not support active-active mirroring, the pod only has one replica, and the StatefulSet makes sure that exactly one pod is running at any given time. If it crashes for any reason, Kubernetes automatically starts a new pod within seconds (Listing 2).
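You can observe this self-healing directly – a quick sketch, assuming the mysql namespace used in this example and the mariadb-0 pod name that the StatefulSet naming scheme produces:

```shell
# Simulate a crash by deleting the pod ...
kubectl -n mysql delete pod mariadb-0

# ... and watch the StatefulSet start a replacement with the
# same name, reattached to the same persistent volume.
kubectl -n mysql get pods -w
```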
Listing 2
MariaDB Pod Reboot
01   template:
02     metadata:
03       labels:
04         app: mariadb
05     spec:
06       containers:
07       - name: mariadb
08         image: mariadb:latest
09         ports:
10         - containerPort: 3306
11           name: mariadb
12         volumeMounts:
13         - name: mariadb
14           mountPath: /var/lib/mysql
15         envFrom:
16         - configMapRef:
17             name: mariadb
A pod can contain one or more containers that always run together and cannot scale separately. Mostly, however, a pod comprises a single container (i.e., mariadb:latest here); alternatively, the version number might be stated. You could optionally specify quota rules at this point (i.e., the RAM size and CPU shares the container will be given) as maximum or minimum values. This pod retrieves its environment variables from the ConfigMap declared earlier.
At this point, the volume mount point for the persistent volume (PV) is important. With volumeClaimTemplates (lines 39-46), the StatefulSet automatically generates a PV claim, which in turn creates the PV and binds it to the pod. To run the StatefulSet, simply create a namespace (a project, in OpenShift terms) and apply the specified YML file:
oc new-project mysql
oc create -f mariadb-state.yml
If you only use the kubectl client, the commands are:

kubectl create namespace mysql
kubectl apply -f mariadb-state.yml -n mysql
PostgreSQL and Microsoft SQL
You can also use exactly the same principle to create stateful sets for PostgreSQL or Microsoft SQL servers in containers. For PostgreSQL, use port 5432 instead of 3306 and the postgres:latest or postgres:14 image, and type the following lines into the ConfigMap:
data:
  POSTGRES_DB: postgresdb
  POSTGRES_USER: admin
  POSTGRES_PASSWORD: test123
Microsoft delivers two different images for SQL Server. The link mcr.microsoft.com/mssql/server picks up the current SQL Server image on an Ubuntu basis from Microsoft's own registry, whereas the link mcr.microsoft.com/mssql/rhel/server gets a SQL Server on an RHEL 8 (UBI8 = universal base image 8) basis. Microsoft (MS) SQL Server uses port 1433. You have to pay attention to one small detail in the config map:
data:
  ACCEPT_EULA: "Y"
  SA_PASSWORD: "<LongPassword>"
  MSSQL_PID: "Developer"
MS SQL Server requires a long password of eight characters or more. If you only specify test123 here, as in the PostgreSQL example, the MS software stops the container immediately after it is started and you end up in a crash loop. The ACCEPT_EULA variable is also mandatory.
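To check these variables quickly outside of Kubernetes, you can start the image directly with Podman – a sketch only; the tag and password are arbitrary examples chosen to satisfy the password policy:

```shell
podman run -d --name mssql -p 1433:1433 \
  -e ACCEPT_EULA=Y \
  -e 'SA_PASSWORD=Str0ng!Passw0rd' \
  mcr.microsoft.com/mssql/server:2019-latest

# If the password is too weak, the container exits right away;
# the reason shows up in its logs.
podman logs mssql
```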