Building sustainably safe containers
Build by Number
Among other things, my job involves developing applications in the field of network automation on the basis of the Spring Boot framework, which requires a running Java environment. At the same time, some infrastructure applications are required, such as DNS servers.
Before containers existed, infrastructure services ran in minimal change root environments, containing only the necessary binaries (e.g., chroot/named), configuration files, and libraries. This setup reduced the number of potential attack vectors for exposed services. For example, an attempt by the attacker to call /bin/sh would fail because the environment would not have a shell.
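As a rough illustration of that property, the following sketch checks whether a change root tree contains a shell at all. The function name find_shells and the list of candidate paths are assumptions made for this example, and the list is not exhaustive.

```shell
# List shell binaries found under a change root directory.
# The candidate paths are an illustrative assumption, not a complete list.
find_shells() {
    root=$1
    for candidate in bin/sh bin/bash bin/dash bin/busybox usr/bin/sh usr/bin/bash; do
        # -e follows symlinks; -L also catches dangling symlinks
        if [ -e "$root/$candidate" ] || [ -L "$root/$candidate" ]; then
            printf '%s\n' "/$candidate"
        fi
    done
}
```

Called as find_shells /path/to/chroot, it prints nothing for a tree that holds only the service binary, its configuration files, and its libraries.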
Classical Docker build files, which use FROM ubuntu to include a complete Ubuntu environment, are the exact opposite of the approach just described. The resulting container is easier to debug because, for example, a shell is available. However, it is also far larger and less secure because an attacker could find and use the shell binary.
Manufacturers keep their official containers up to date, which means that when the container is rebuilt, an updated Ubuntu would also be dragged in. However, no mechanism automatically triggers such a rebuild. One of my goals was therefore to rebuild automatically all containers that contain components for which patches are available. At the same time, I wanted the containers to be leaner.
Dockerfiles
Docker can import the compressed tarball of a change root environment, but that build process is hard to maintain. It makes more sense to use a Dockerfile that describes the components of the image and also lets you import single files from other images. A Dockerfile can also call scripts or run entire installations. To create such a container, you would use docker build. To begin, though, copy an archive (usually a .tar.gz) into a folder and create a file named Dockerfile:
FROM dockerrepo.matrix.dev/gentoo-java:latest-amd64
ADD webapp.tar.gz /
ENTRYPOINT ["java", "-jar", "mywebapp.jar"]
EXPOSE 8080/tcp
The first line names the base image whose filesystem the new image builds on. In this case, it's a Gentoo Linux-based image (see the "Why Gentoo?" box) that provides a runnable Java environment. The next line unpacks the contents of webapp.tar.gz into the root directory of the container. The third line ensures that the call java -jar mywebapp.jar is executed automatically when the container is started with docker run; any arguments given to docker run are appended to that call. The last line documents that the application listens on TCP port 8080. EXPOSE on its own does not publish the port on the host, so you still need the -p 8080:8080 option (or -P, which publishes all exposed ports) in the Docker call.
Why Gentoo?
The system presented here would also work with other distributions. I chose Gentoo because the distribution compiles applications locally from source files. Therefore, you can easily archive and document the sources of the binaries for a later audit of each version of each container. Because admins compile the binaries themselves and the compiler sources are also available, the chain of documentation can be traced back to the source code. Only an infection of the build host would offer an attack vector, and the risk can be mitigated by appropriate protection.
The Docker build process is organized hierarchically. The images that supply the binaries for a container build on each other. Starting with a base image, which is initially created as an empty image with FROM scratch, each image can import another one in full, which creates the layers that are downloaded one by one from the registry. If a layer remains unchanged, no download is required, saving time and bandwidth.
The referenced image, gentoo-java, includes the GNU C library (glibc) image and, because the Java binaries require them, the zlib library and some GNU compiler collection (GCC) libraries. However, only the necessary shared libraries are included, not the complete images. Finally, the glibc image uses a base image in its FROM line, which contains a minimal filesystem with the /etc, /dev, and /tmp directories. Thanks to its hierarchical structure, the build system, described later, can update individual layers of the image separately.
The source files for the images are available as tar.gz archives, which are created from cleaned-up file lists of packages. In the container, for example, neither man pages nor sample configurations are needed. Building up with one image per package might sound complex, but it only requires more work in the first step. The application images at the end of the chain can be exported as a single file and integrated into other registries if required.
Practical Implementation
The first step in creating a container image from a package is to collect the files from the operating environment. To help me keep track, I first defined a folder structure. Each container has a folder with a name that follows the <distribution>-<package name> pattern, resulting in folders in the form gentoo-glibc or gentoo-gcc. Each of these folders contains the respective Dockerfile and the tar.gz archive that was collected.
GNU Make is used as the build tool because it makes it relatively easy to map dependencies to files by timestamps. If a package was updated since the last creation date of the tar.gz archive, the timestamp of the files is newer and Make triggers an action.
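The timestamp comparison can be reproduced in a few lines of shell. This is only a sketch of the decision Make takes for a file target, not a replacement for Make; the function name needs_rebuild is made up for the example, and the -nt test operator is assumed to be available, as it is in Bash, Dash, and the BusyBox shell.

```shell
# Return 0 (true) if TARGET is missing or older than any PREREQ,
# which is roughly the decision Make takes for a file target.
needs_rebuild() {
    target=$1
    shift
    # A missing target always has to be built
    [ -e "$target" ] || return 0
    for prereq in "$@"; do
        # -nt: true if the left file has a newer modification time
        if [ "$prereq" -nt "$target" ]; then
            return 0
        fi
    done
    return 1
}
```

A call such as needs_rebuild gentoo-libuv/gentoo-libuv.tar.gz /usr/lib64/libuv.so.1 succeeds exactly when the archive would have to be recreated.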
A list of files is necessary to create the archive. The easiest way for an admin on Gentoo to create this list is to run the q files <package> command. To discard unnecessary files, use grep filters and pass the resulting list into a tar command that reads the list of files to archive from standard input. Most packages that only deliver shared libraries need a Makefile section like the one for the libuv package:
gentoo-libuv/gentoo-libuv.tar.gz: /usr/lib64/libuv.so.1
	q files dev-libs/libuv | grep /usr/lib | tar -c -T - -v -z -f $@
Some packages need more files, in which case suitable grep filters include or exclude the relevant paths. The example also shows the dependency. The archive is only rebuilt if the /usr/lib64/libuv.so.1 file has changed. The manual work for each package now consists of identifying a file that can be used as an indicator for a patch and sorting out which files are necessary in the final archive.
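The filter-and-archive step can be wrapped in a small function that is easy to try outside of Make. In this sketch the file list arrives on standard input (on Gentoo it would come from the package query shown above), and the function name archive_from_list as well as the /doc/ and /man/ exclusions are assumptions for illustration.

```shell
# Read a file list on stdin, keep paths matching PATTERN, drop documentation,
# and pack the remaining files into ARCHIVE.
archive_from_list() {
    archive=$1
    pattern=$2
    # tar -T - reads the names of the files to archive from stdin
    grep -- "$pattern" | grep -v -e /doc/ -e /man/ | tar -c -z -f "$archive" -T -
}
```

Because the exclusions are plain grep patterns, a package that needs more or fewer files only requires a different pattern, not a different script.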
My environment has two Makefiles: one to create the tar.gz archives and one that then triggers the Docker build processes. Listing 1 shows the Makefile for the archives.
Listing 1
Makefile for Archives
all: gentoo-glibc/gentoo-glibc.tar.gz gentoo-gcc/gentoo-gcc.tar.gz gentoo-java/gentoo-java.tar.gz gentoo-gmp/gentoo-gmp.tar.gz gentoo-mpc/gentoo-mpc.tar.gz gentoo-mpfr/gentoo-mpfr.tar.gz

gentoo-glibc/gentoo-glibc.tar.gz: /usr/include/libintl.h
	sh createglibctar.sh

gentoo-gcc/gentoo-gcc.tar.gz: /usr/bin/gcc
	sh creategcctar.sh

gentoo-java/gentoo-java.tar.gz: /usr/lib/jvm/icedtea-bin-8 createjavatar.sh
	sh createjavatar.sh

gentoo-gmp/gentoo-gmp.tar.gz: /usr/lib64/pkgconfig/gmp.pc
	q files dev-libs/gmp |grep usr/lib|tar czvf $@ -T -

gentoo-mpc/gentoo-mpc.tar.gz: /usr/lib64/libmpc.so
	q files dev-libs/mpc |grep lib|grep -v doc|tar czvf $@ -T -

gentoo-mpfr/gentoo-mpfr.tar.gz: /usr/lib64/libmpfr.so
	q files dev-libs/mpfr |grep lib|grep -v doc|tar czvf $@ -T -

gentoo-zlib/gentoo-zlib.tar.gz: /usr/lib64/pkgconfig/zlib.pc
	q files sys-libs/zlib | grep /lib64 | tar cvzf $@ -T -
For GCC and Java, a small shell script handles the task of assembling the archives, because symbolic links that would otherwise be missing still play a role. The base container is not included in the Makefile, because it is generated statically rather than from packages.
After an upgrade, you now just need to call Make to recreate the archives where necessary, and the containers are then built. Immediately after building, they are uploaded to the local registry with the latest tag.
The sticking point here was the modification date. Although it is possible to query the modification date of existing containers in the registry or on the local host with an API call, it is difficult to do in the Makefile, which was what prompted me to cheat and simply add && touch buildtime to the docker build call and then && touch pushtime after docker push. The two files are only created or updated if the respective step was successful, and pushtime serves as the target in the Makefile.
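The stamp-file trick is independent of Docker and can be sketched as a generic shell helper. The name run_and_stamp is hypothetical; the point is only that the stamp is touched when, and only when, the wrapped command exits successfully.

```shell
# Run a command and update STAMP only if the command succeeded,
# mirroring the "docker build ... && touch <stamp>" idiom.
run_and_stamp() {
    stamp=$1
    shift
    # "$@" is the command with its arguments; touch runs only on exit status 0
    "$@" && touch "$stamp"
}
```

A failed build or push therefore leaves the old timestamp in place, and the next Make run tries the step again.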
To map the hierarchy of the containers in the Makefile, the pushtime files of all images that are needed to build a container are also included in its dependencies. The Makefile section in Listing 2 illustrates this.
Listing 2
Managing Dependencies
gentoo-java/pushtime: gentoo-java/gentoo-java.tar.gz gentoo-glibc/pushtime gentoo-zlib/pushtime gentoo-gcc/pushtime
	cd gentoo-java; docker build -t dockerrepo.matrix.dev/gentoo-java:latest-amd64 . && touch buildtime && docker push dockerrepo.matrix.dev/gentoo-java:latest-amd64 && touch pushtime
The Java image is based on the glibc image, but also copies files from zlib and GCC, which means you have to build and upload these images before the Java image can be created. Listing 3 (abridged) shows the call to Make and its screen output after patches for glibc were released, triggering a rebuild of all containers.
Listing 3
Make After glibc Update (Abridged)
# make -f Makefile.docker
cd gentoo-glibc; docker build -t dockerrepo.matrix.dev/gentoo-glibc:latest-amd64 . && touch buildtime && docker push dockerrepo.matrix.dev/gentoo-glibc:latest-amd64 && touch pushtime
Sending build context to Docker daemon 21.12MB
Step 1/2 : FROM dockerrepo.matrix.dev/gentoo-base:latest
 ---> 22fe37b24ebe
Step 2/2 : ADD gentoo-glibc.tar.gz /
 ---> 4e800333acbd
Successfully built 4e800333acbd
Successfully tagged dockerrepo.matrix.dev/gentoo-glibc:latest-amd64
The push refers to repository [dockerrepo.matrix.dev/gentoo-glibc]
22bac475857f: Pushed
636634f1308a: Layer already exists
[...]
Step 2/8 : FROM dockerrepo.matrix.dev/gentoo-glibc:latest-amd64
[...]
Step 8/8 : ADD gentoo-gcc.tar.gz /
 ---> b89e1b4ab2ba
Successfully built b89e1b4ab2ba
Successfully tagged dockerrepo.matrix.dev/gentoo-gcc:latest-amd64
The push refers to repository [dockerrepo.matrix.dev/gentoo-gcc]
794c152bde4c: Pushed
[...]
22bac475857f: Mounted from gentoo-bind
636634f1308a: Layer already exists
latest-amd64: digest: sha256:667609580127bd14d287204eaa00f4844d9a5fd2847118a6025e386969fc88d5 size: 1996
cd gentoo-java; docker build -t dockerrepo.matrix.dev/gentoo-java:latest-amd64 . && touch buildtime && docker push dockerrepo.matrix.dev/gentoo-java:latest-amd64 && touch pushtime
Sending build context to Docker daemon 66.12MB
Step 1/6 : FROM dockerrepo.matrix.dev/gentoo-glibc:latest-amd64
 ---> 4e800333acbd
Step 2/6 : COPY --from=dockerrepo.matrix.dev/gentoo-zlib:latest-amd64 /lib64/* /lib64/
 ---> aaf3f557c027
Step 3/6 : COPY --from=dockerrepo.matrix.dev/gentoo-gcc:latest-amd64 /usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/lib* /lib64/
 ---> 6f7d7264921c
Step 4/6 : ADD gentoo-java.tar.gz /
 ---> afb2d5612109
Step 5/6 : ENV JAVA_HOME /opt/icedtea-bin-3.16.0
[...]
441dec54d0dd: Pushed
22bac475857f: Mounted from gentoo-glibc
636634f1308a: Layer already exists
latest-amd64: digest: sha256:965aeac1b1cd78cde11aec58d6077f69190954ff59f5064900ae12285e170836 size: 1371
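The order in which Make works through the images follows from the dependency edges alone. As a sketch, the standard tsort utility can derive one valid order from the relationships visible in Listings 2 and 3; the function name build_order is an assumption, and only edges that appear in those listings are included.

```shell
# Print one possible build order for the images.
# Each input line reads "prerequisite dependent".
build_order() {
    tsort <<'EOF'
gentoo-base gentoo-glibc
gentoo-glibc gentoo-gcc
gentoo-glibc gentoo-java
gentoo-zlib gentoo-java
gentoo-gcc gentoo-java
EOF
}
```

Several orders are valid (gentoo-zlib can be built at any point before gentoo-java), so tsort output may differ between systems, but gentoo-java always comes last.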
The trickiest task in this approach is that of resolving all the dependencies. Minimizing the container means finding all the necessary shared libraries, and the tool of choice is ldd, which lists the referenced shared libraries of a binary.
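The output of ldd is easy to post-process. The two helpers below are a sketch that assumes the usual glibc ldd line formats ("name => /path (address)", "/path (address)", and "name => not found"); the function names are made up for this example.

```shell
# Read ldd output on stdin and print the resolved library paths, one per line.
# Rule 1 matches "name => /path (addr)"; rule 2 matches the dynamic loader,
# which ldd prints as "/path (addr)".
resolved_libs() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }
         NF == 2 && $1 ~ /^\//   { print $1 }'
}

# Read ldd output on stdin and print the names ldd could not resolve.
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}
```

For example, ldd /path/to/binary | resolved_libs yields a file list that can be fed into the same tar call used for the archives, whereas missing_libs points to libraries that still have to be added.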
Instead of running the binary directly in the container you created, it makes sense to launch it in a change root environment first, which makes it easier to find out which library is missing. A run with strace, which identifies missing configuration files, for example, is also easier to handle in this way. If several binaries are used, the program might launch but throw an error only when a certain function is called.
Developers also need to keep in mind that shared libraries occasionally change their version numbers. If the file used to determine whether the archive needs to be rebuilt is /usr/lib64/libdb-5.3.so, and version 5.4 replaces it after an update, then the indicator file is missing and the Make run fails. This possibility must be taken into account when selecting the indicator files.
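One way to notice this situation before Make stops with an error is a small pre-flight check. The following sketch simply reports which of the given indicator files no longer exist; the function name missing_indicators is hypothetical, and the list of files has to be maintained alongside the Makefile.

```shell
# Print every indicator file from the argument list that no longer exists.
# Returns non-zero if at least one is missing, so it can gate a make call.
missing_indicators() {
    missing=0
    for file in "$@"; do
        if [ ! -e "$file" ]; then
            printf '%s\n' "$file"
            missing=1
        fi
    done
    return "$missing"
}
```

Used as missing_indicators /usr/lib64/libuv.so.1 /usr/lib64/libmpc.so && make, the build only starts if all indicator files are still in place.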
Debugging Containers?
If the container does not work even though all libraries are present, it would normally be possible to find the error by starting a shell in the container; however, this lean approach does not include a shell. Instead, a debug container can be built very easily. In the first step, you need to create a container for the BusyBox package and then the debug container with the Dockerfile:
FROM dockerrepo/applicationcontainer:latest-amd64
COPY --from=dockerrepo/gentoo-busybox:latest-amd64 /bin/ /bin/
In the busybox container, a symbolic link named /bin/sh needs to point to /bin/busybox, which gives developers a version of the container with an interactive shell. However, this is a separate debug container, which means it is less likely to end up in production by mistake.
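The link can be created in the staging directory before the BusyBox archive is packed. This is a sketch under the assumption that such a staging directory exists and already contains bin/busybox; the function name link_busybox_shell is invented for the example.

```shell
# Create the sh -> busybox link inside a staging directory before archiving.
# STAGING is a hypothetical directory that already contains bin/busybox.
link_busybox_shell() {
    staging=$1
    [ -e "$staging/bin/busybox" ] || return 1
    # Relative target, so the link stays valid inside the container
    ln -s -f busybox "$staging/bin/sh"
}
```

The relative link target matters here: an absolute path into the staging directory would dangle once the files are unpacked in the container.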