Photo by Alexander Redl on Unsplash

Persistent volumes for Docker containers

Solid Bodies

Article from ADMIN 45/2018
By
Equip containers with persistent storage using Docker's plugin system.

Docker guarantees the same environment on all target systems: If the Docker container runs for the author, it also runs for the user and can even be preconfigured accordingly. Although Docker containers seem like a better alternative to the package management of current distributions (i.e., RPM and dpkg), the design assumptions underlying Docker and the containers distributed by Docker differ fundamentally from classic virtualization. One big difference is that a Docker container does not have persistent storage out of the box: If you delete a container, all data contained in it is lost.

Fortunately, Docker offers a solution to this problem: A volume service can provide a container with persistent storage. The volume service is merely an API that uses functions in the loaded Docker plugins. For many types of storage, plugins allow containers to be connected directly to a specific storage technology. In this article, I first explain the basic idea behind persistent storage in Docker and why a detour through the volume service is necessary. Then, for two types of environments (OpenStack and VMware), I show how persistent storage can be used in Docker with the appropriate plugins.

Planned Without Storage

The reason persistent storage is not automatically included with every Docker container goes back to a time long before Docker itself existed. The cloud is to blame: It made the idea of permanent storage seem obsolete, because storage regularly poses a challenge in classic virtualization setups. If you compare classic virtualization and the cloud, it quickly becomes clear that two worlds collide here. A virtual machine (VM) in a classic environment rightly assumes that it resides on persistent storage, so the entire VM can be moved from one host to another. However, this requires redundant storage in the background on which the data is kept centrally. Local storage (e.g., on the hard drive of the virtualization server) does not allow the flexibility described.

Because of the complicated tinkering in the background, the concept of persistent storage in the cloud was difficult right from the start. Amazon, the industry leader, distinguishes between two storage types in the Amazon cloud: Ephemeral storage is volatile storage that belongs to a specific VM and only exists as long as the VM exists, whereas volumes are persistent and can be connected to and disconnected from VMs at will. The normal case, as Amazon makes clear, is ephemeral storage. Because the basic assumption is that every VM in the cloud is completely automated anyway, in the event of a crash, any VM can be, at least theoretically, restored quickly. The sober assumption is that the application simply has to accommodate this design: Applications need to store their data in a central database instead of locally.

Volume Service for Persistent Storage

Although Docker itself is not used exclusively in the cloud context, it has adopted some of the cloud's design maxims, including those regarding persistent storage. However, the idea of "cloud-ready" applications that do not contain their own data often cannot be reconciled with the everyday reality of the sys admin. Even the greatest supporters of the concept must admit at some point that, in the end, some situations need persistent storage. Data that has to be kept in a database, as mentioned above, must outlast the restart of the container or, if necessary, be attached to another container (e.g., as part of a database update).

The Docker developers solve the problem with their volume service: A volume is by definition a virtual hard drive that can be connected to any container and provides permanent storage. At the heart of the solution is the volume API: If you define persistent storage for a container, calling docker on the command line leads to a corresponding request to the volume API, which provides the desired storage and then connects it directly to the target container.
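As a rough illustration of what such a command-line call looks like, the following Python sketch assembles the arguments for docker volume create and docker run. The helper functions are not part of Docker, and the volume name dbdata, the container name db, the image postgres, and the mount path are hypothetical placeholders. Only the --driver, --opt, and -v options are taken from the Docker command line.

```python
"""Sketch: assemble the docker CLI calls that create and attach a volume.

All names used here (dbdata, db, postgres, the mount path) are hypothetical.
The helpers only build argument lists; nothing is executed.
"""


def volume_create_cmd(name, driver="local", opts=None):
    """Return the argument list for `docker volume create`."""
    cmd = ["docker", "volume", "create", "--driver", driver]
    # Driver-specific options are passed as repeated --opt key=value pairs.
    for key, value in sorted((opts or {}).items()):
        cmd += ["--opt", f"{key}={value}"]
    cmd.append(name)
    return cmd


def run_with_volume_cmd(image, container, volume, mount_path):
    """Return the argument list for `docker run` with a named volume."""
    return [
        "docker", "run", "-d",
        "--name", container,
        "-v", f"{volume}:{mount_path}",  # named volume : path in container
        image,
    ]


if __name__ == "__main__":
    print(" ".join(volume_create_cmd("dbdata")))
    print(" ".join(run_with_volume_cmd(
        "postgres", "db", "dbdata", "/var/lib/postgresql/data")))
```

The resulting lists could be handed to subprocess.run() on a host with Docker installed. Because the volume is created separately from the container, removing the container leaves the volume and its data in place.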

Redundancy Is Essential

When dealing with persistent storage, Docker clearly must solve precisely those problems that have always played an important role in classic virtualization. Without redundancy at the storage level, for example, such a setup cannot operate effectively; otherwise, the failure of a single container node would mean that many customer setups would no longer function properly. The risk that the failure of an individual system hits precisely the critical points of the customer setups, such as the databases, is clearly too great in this constellation.

The Docker developers have found a smart solution to the problem: The service that takes care of volumes for Docker containers can also commission storage locally and connect it to a container. Here, Docker makes it clear that these volumes are not redundant; that is, Docker did not even tackle the problem of redundant volumes itself. Instead, the project points to external solutions: Various approaches now on the market offer persistent storage for clouds and deal with issues such as internal redundancy. One of the best-known representatives is Ceph. To enable the use of such storage services, the Docker volume service is coupled with the existing plugin system, so the corresponding plugin of an external solution can provide redundant volumes for Docker containers.
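To make the coupling between the volume service and a plugin more concrete: Docker talks to a volume plugin through JSON requests to endpoints such as /VolumeDriver.Create and /VolumeDriver.Mount. The following Python sketch mimics that dispatch in memory. It is not a working plugin: The class name ToyVolumePlugin, the base directory /mnt/toy, and the handle() method are assumptions made for illustration. A real plugin would serve these endpoints over a socket and attach actual storage, such as a Ceph block device, in the mount step.

```python
"""Toy model of the request dispatch in a Docker volume plugin.

Assumptions: ToyVolumePlugin, handle(), and /mnt/toy are invented for this
sketch. Only the endpoint names and the JSON field names (Name, Opts,
Mountpoint, Err, Implements) follow the Docker volume plugin protocol.
"""
import json


class ToyVolumePlugin:
    def __init__(self, base_dir="/mnt/toy"):
        self.base_dir = base_dir
        self.volumes = {}  # volume name -> driver options

    def handle(self, endpoint, body="{}"):
        """Dispatch one JSON request and return the JSON response."""
        handlers = {
            "/Plugin.Activate": self._activate,
            "/VolumeDriver.Create": self._create,
            "/VolumeDriver.Mount": self._mount,
            "/VolumeDriver.Remove": self._remove,
        }
        handler = handlers.get(endpoint)
        if handler is None:
            return json.dumps({"Err": f"unsupported endpoint: {endpoint}"})
        return json.dumps(handler(json.loads(body)))

    def _activate(self, req):
        # The handshake tells Docker which plugin interface is offered.
        return {"Implements": ["VolumeDriver"]}

    def _create(self, req):
        # A real plugin would allocate backend storage at this point.
        self.volumes[req["Name"]] = req.get("Opts") or {}
        return {"Err": ""}

    def _mount(self, req):
        name = req["Name"]
        if name not in self.volumes:
            return {"Err": f"no such volume: {name}"}
        # A real plugin would map and mount the backend device here.
        return {"Mountpoint": f"{self.base_dir}/{name}", "Err": ""}

    def _remove(self, req):
        if self.volumes.pop(req["Name"], None) is None:
            return {"Err": f"no such volume: {req['Name']}"}
        return {"Err": ""}


if __name__ == "__main__":
    plugin = ToyVolumePlugin()
    print(plugin.handle("/Plugin.Activate"))
    print(plugin.handle("/VolumeDriver.Create", json.dumps({"Name": "dbdata"})))
    print(plugin.handle("/VolumeDriver.Mount", json.dumps({"Name": "dbdata"})))
```

The point of the design is that Docker only needs to know the endpoints. Whether the mountpoint is backed by a local directory or by redundant storage is entirely up to the plugin.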

The Docker documentation also refers to "storage drivers," which deal with storing data inside the container and concern the container's root filesystem. The developers point out that regular data should reside on volumes so that the container itself is not changed.
