Keeping Docker containers safe
Weak Link
Few would debate that the future of hosting infrastructure lies in running applications across multiple containers. Containers are a genuinely fantastic, highly performant technology ideal for deploying software updates to applications. Whether you're working in an enterprise with a number of critical microservices, tightly coupled with a pipeline that continuously deploys your latest software, or you're running a single LEMP (Linux, Nginx, MySQL, PHP) website that sometimes needs to scale up for busy periods, containers can provide the software dependencies you need, with relative ease, across all stages of your development life cycle.
Containers are far from being a new addition to server hosting. I was using Linux containers (OpenVZ) in production in 2009 and automatically backing up container images of around 250MB to Amazon's S3 storage very effectively. A number of successful container technologies have been used extensively in the past, including LXC, Solaris Zones, and FreeBSD jails, to name but a few.
Suffice it to say, however, that the brand currently synonymous with container technology is Docker. Its ever-evolving technology, hand in hand with clever, targeted marketing and some very nifty networking improvements, has driven Docker to the forefront of techies' minds and helped it ride the crest of the DevOps wave. Docker gives businesses at all levels the ability to approach their infrastructure from a different perspective and, along with other DevOps technology offerings, has genuinely upended the old paradigm and rapidly become the new norm.
As more businesses adopt such technologies, however, teething problems are inevitable. In the case of Docker, the more you use it, the more concerned you become about secure deployment. Not only is Docker's underlying security problematic (although great strides have been made to improve it over time), but users also tend to treat containers as though they are virtual machines (VMs), which they most certainly are not.
To begin, I'll fill you with fear. If I communicate the security issues correctly, you might never want to go near a container again. I want to state explicitly at this stage that these issues do not just affect the Docker model; however, because Docker is undoubtedly the current popular choice for containerization, I'll use Docker as the main example.
To restore some sanity toward the end of the article, I offer some potential solutions to mitigate security issues that you might not previously have considered a threat to your containers.
Fear, Uncertainty, and Doubt
As I've alluded to already, a number of attack vectors exist on a system that runs a Docker daemon, but to my mind, the most critical is the Docker run command and a handful of other powerful commands.

The run command is powerful because it runs as the root user and can download images and mount volumes, among other things. Why is this bad? Well, the run command can legitimately mount the entire filesystem on the host machine with ease. Consider being able to write to any file or directory from the top level of your main disk with this command, which mounts a volume using -v:

$ docker run -v /:/tmp/container-filesystem chrisbinnie/my-web-server

Inside the container's filesystem (under the directory /tmp/container-filesystem), you can see the whole drive for the host system and affect it with root user access.
To put it as simply as I can: If you give any user or process on your system access to common Docker commands, then, without any other rules being enforced, they effectively have superuser access to your entire host machine and not just the container that they're running.
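To make the point tangible, here is a minimal sketch of what that access allows, using the stock alpine image purely for illustration; the host files touched below are simply examples of content normally reserved for root:

# Read a root-only file on the host from inside a throwaway container.
$ docker run --rm -v /:/host alpine cat /host/etc/shadow
# Write a file into the host's /root directory, again with root privileges.
$ docker run --rm -v /:/host alpine touch /host/root/written-from-a-container

Neither command requires sudo on the host; the Docker daemon does the work with its own root privileges.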
The Docker website [1] doesn't hold back in telling users to exercise caution with their Docker daemon:
Running containers (and applications) with Docker implies running the Docker daemon. This daemon currently requires root privileges, and you should therefore be aware of some important details.
First of all, only trusted users should be allowed to control your Docker daemon. This is a direct consequence of some powerful Docker features. Specifically, Docker allows you to share a directory between the Docker host and a guest container; and it allows you to do so without limiting the access rights of the container. This means that you can start a container where the /host directory will be the / directory on your host; and the container will be able to alter your host filesystem without any restriction. This is similar to how virtualization systems allow filesystem resource sharing. Nothing prevents you from sharing your root filesystem (or even your root block device) with a virtual machine.
Now that I've given you cause for alarm, let me elaborate. First, I'll explore the key differences between VMs and containers. I'm going to refer to a Dan Walsh article [2]; Walsh works for Red Hat and was pivotal in the creation of the top-notch security tool SELinux [3]. As the joke goes, Walsh weeps when you switch off SELinux and disable its sophisticated security because you don't know how to configure it [4].
Permit me to paraphrase some of the content from Walsh's aforementioned article as I understand it. Before continuing, here's a super-quick reminder about device nodes, which Linux uses to speak to almost everything on a system, thanks to the Unix design principle of Everything Is A File [5]. For example, on your filesystem, a CD-ROM drive in Linux is usually a file called /dev/cdrom (actually a symlink to /dev/sr0 for a SCSI drive), which streams data from your hardware to the system.
For security reasons, when a hypervisor helps run a system, the device nodes can talk to the VM's kernel but not the host's kernel. As a result, if you want to attack the host machine from inside a VM, you first need to get a process to negotiate the VM's kernel successfully and then find a vulnerability in the hypervisor. After those two challenges have been met, if you're running SELinux, an attacker still has to circumvent the SELinux controls (which are normally locked down on a VM) before finally attacking the host's kernel.
Unfortunately for those users running containers of the popular variety, an attacker has already reached the point of talking to the host's kernel. In other words, very little protective buffering lies between taking control of a container and controlling the whole host running the container's daemon.
Therefore, by giving a low-level user access to the Docker daemon, you break an established security model (sometimes called the "principle of least privilege" [6]) that has been used since the early 1970s: Normal users may run certain commands, admin users may run a few more commands with slightly more risk associated with them, and superusers can run any command on the system. To reinforce the matter at hand, Figure 1 shows the official word from Oracle on giving access to the Docker command.
Well-Intentioned Butlers
At this juncture, let me use the popular automation tool Jenkins [8] as an example, because it's highly likely that Jenkins is being used somewhere on a continuous integration (CI) pipeline near you to run automated jobs. Likely, one or more of these Jenkins jobs affect Docker containers in some way by stopping, starting, deleting, or creating them.
As you might guess, any privileged job (e.g., running a container) would need direct access to the Docker daemon. Even when you follow the documentation and set up a Docker system group (e.g., the docker group), you're sadly not protected and are effectively giving away root access to your host. Being a member of the docker system group means that a human, process, or CI tool such as Jenkins can use SSH keys or passwords to log in to a host running the Docker daemon and act like the root user without actually being the superuser.
Ultimately, this means your Jenkins user (or, equally, any other user or process accessing your Docker daemon) has full control of your host, so how Jenkins logs in and authenticates becomes a superuser-level problem to solve.
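To see why membership in the docker group amounts to root, consider the following sketch; the jenkins account name and the alpine image are only examples, and the host is assumed to provide /bin/sh (as almost every Linux system does):

# Grant the CI user access to the Docker daemon by adding it to the docker group.
$ sudo usermod -aG docker jenkins
# Switch to that user and break out onto the host's own filesystem.
$ su - jenkins
$ docker run --rm -it -v /:/host alpine chroot /host /bin/sh
# The prompt that appears is a root shell on the host, obtained without sudo
# and without knowing the root password.

Anything that can reach the Docker daemon, whether a person or an automated Jenkins job, can do the same.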
From a security perspective that's bad news. Imagine silently losing a host in a cluster, with an attacker sitting quietly for a month or two picking up all your data and activity for that period before exposing their presence – if they ever do. Such attacks of the Advanced Persistent Threat (APT) variety are more common than you might think.
Privilege Separation
A quick reminder about privilege separation and why it's so important before continuing: An application like Nginx, the lightning-fast web server, might start up as the root user so that it can open a privileged port in the TCP range below 1024. By default, these ports are protected from all but the root user to stop the nefarious serving of data from common server ports.
In the case of a web server, of course, the most common ports would usually be TCP ports 80 and 443. Once Nginx has opened these ports, worker processes run as less privileged users (e.g., user nginx) in an attempt to mitigate any exploits on the Nginx daemon running as the root user.
As a result, your network-exposed ports don't have your root user sitting attentively waiting for an attack; instead, a user with much less system access listens. You can find out more about privilege separation online if you're interested [9].
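If Nginx is running on a host you can log in to, you can observe this separation directly. The exact process names vary by distribution, but the ownership pattern is the point:

# List the Nginx processes together with the user that owns each one.
$ ps -o user,pid,cmd -C nginx
# Typically the master process belongs to root (it needed root to bind ports 80
# and 443), whereas the workers handling client traffic run as the unprivileged
# nginx user, which is set by the "user" directive in nginx.conf.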
In Figure 2, you can see which user the Docker container is exposing to the big, bad Internet at large. Thankfully, it's not the root user but the nginx user, as it should be.
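You can run the same check against a container of your own; the container name my-web-server below is only a placeholder. The docker top command lists a container's processes as the host sees them, and the first (UID) column reveals the user each process really runs as:

$ docker top my-web-server
# For a well-behaved web server image, only the master process, if any, should
# show up as root, with the listening worker processes owned by the nginx user.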