Verifying your configuration
Checkup
Misconfiguration has long been a major problem in the IT world. Any admin can recall a scenario in which Dev did something, QA did another thing, and other departments did their own thing to pass the software through its stages of delivery. Finally, the product fails because of unknown manual tricks and black magic. Who could forget the time wasted in painful war room sessions, always wondering and struggling to reason about the weird failures? In fact, misconfiguration has been identified as the root cause for many recent public cloud outages – even for the tech giants like Amazon and Google.
How do you solve misconfiguration and parity issues in the modern cloud-driven IT world? Open source solutions come to the rescue. In this article, I describe how to use a tool called goss (short for Golang server spec) [1] [2] to detect drift from a standard configuration baseline. I'll describe how to establish the practice of Acceptance as Code so that the same configuration flows across the whole software development life cycle to maintain much-needed parity. I'll use the Footloose and Docker container tools to bring up container machines for a test setup. Everything was tested on my Ubuntu 18.04 LTS laptop, but the code examples should run on any modern GNU/Linux machine able to run Docker Engine and Golang static binaries.
Server Validation
How do you verify a server configured to run a single service or a set of services? You log in to the server and run a set of commands to verify that the required packages are installed with the proper versions and that the configuration files have the correct values. You ensure that required services are configured to start automatically and that those services are running and listening on the configured ports. You could also add more tests to cover more potential points of misconfiguration (e.g., disk space, partitions and their filesystems, RAM, etc.). In other words, you are asking your server a set of questions and expecting the correct answers according to your specifications. The modern automated way to do this is to bake all the necessary acceptance tests into the server itself, with the server providing an on-demand report showing which tests are passing or failing.
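To make the "questions and answers" idea concrete, the following is a minimal hand-rolled sketch in plain shell. It is not how goss works internally; the check function and the three sample checks are illustrative assumptions that you would replace with your own baseline.

```shell
#!/bin/sh
# Minimal sketch of a hand-rolled acceptance check (not goss itself).
# check DESCRIPTION COMMAND [ARGS...]: run COMMAND quietly, print a
# PASS/FAIL line, and count failures in FAILED.
FAILED=0

check() {
  desc=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: ${desc}"
  else
    echo "FAIL: ${desc}"
    FAILED=$((FAILED + 1))
  fi
}

# Illustrative questions; adapt the commands to your own baseline.
check "sshd user exists"      id sshd
check "root directory exists" test -d /
check "wget is installed"     command -v wget

echo "${FAILED} check(s) failed"
```

A script like this quickly becomes hard to maintain as the number of checks grows, which is the gap a declarative tool such as goss is meant to fill.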
Goss is a lightweight, dependency-free, free and open source software (FOSS) solution designed for machine-level acceptance testing. The solution comprises a Golang static binary (macOS and Windows binaries are currently in alpha), so you can just put it on your servers, define your acceptance tests in YAML files, and feed those tests to the goss binary. That's it.
Getting Started
The first step is to create a base image to invoke a container machine test setup by running the command
docker build -f Dockerfile_ServerBase . -t ubuntu2204basewgoss
in a directory where the Dockerfile (Listing 1) and setup script (Listing 2) are located.
Listing 1
Dockerfile_ServerBase
FROM ubuntu:22.04

ENV container docker

# Don't start any optional services except for the few we need.
RUN find /etc/systemd/system \
    /lib/systemd/system \
    -path '*.wants/*' \
    -not -name '*journald*' \
    -not -name '*systemd-tmpfiles*' \
    -not -name '*systemd-user-sessions*' \
    -exec rm \{} \;

RUN apt-get update && \
    apt-get install -y \
    dbus systemd openssh-server net-tools iproute2 iputils-ping curl wget vim-tiny sudo && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

RUN >/etc/machine-id
RUN >/var/lib/dbus/machine-id

EXPOSE 22

RUN systemctl set-default multi-user.target
RUN systemctl mask \
    dev-hugepages.mount \
    sys-fs-fuse-connections.mount \
    systemd-update-utmp.service \
    systemd-tmpfiles-setup.service \
    console-getty.service
RUN systemctl disable \
    networkd-dispatcher.service

# This container image doesn't have locales installed. Disable forwarding the
# user locale env variables or we get warnings such as:
# bash: warning: setlocale: LC_ALL: cannot change locale
RUN sed -i -e 's/^AcceptEnv LANG LC_\*$/#AcceptEnv LANG LC_*/' /etc/ssh/sshd_config

COPY setup_goss.sh /usr/local/bin/setup.sh
RUN setup.sh && rm /usr/local/bin/setup.sh

# https://www.freedesktop.org/wiki/Software/systemd/ContainerInterface/
STOPSIGNAL SIGRTMIN+3

CMD ["/bin/bash"]
Listing 2
setup_goss.sh
#! /bin/bash

set -uo pipefail

GOSSVER='0.3.18'
GOSSCDIR='/etc/goss'
RQRDCMNDS="chmod
echo
sha256sum
tee
wget"

preReq() {

  for c in ${RQRDCMNDS}
  do
    if ! command -v "${c}" > /dev/null 2>&1
    then
      echo " Error: required command ${c} not found, exiting ..."
      exit 1
    fi
  done

}

instlGoss() {

  if ! wget -P /tmp "https://github.com/aelsabbahy/goss/releases/download/v${GOSSVER}/goss-linux-amd64"
  then
    echo "wget -P /tmp https://github.com/aelsabbahy/goss/releases/download/v${GOSSVER}/goss-linux-amd64 failed, exiting ..."
    exit 1
  fi

  if ! wget -P /tmp "https://github.com/aelsabbahy/goss/releases/download/v${GOSSVER}/goss-linux-amd64.sha256"
  then
    echo "wget -P /tmp https://github.com/aelsabbahy/goss/releases/download/v${GOSSVER}/goss-linux-amd64.sha256 failed, continuing ..."
  else
    pushd /tmp
    if ! sha256sum -c goss-linux-amd64.sha256
    then
      echo 'sha256sum -c goss-linux-amd64.sha256 failed, continuing ...'
    fi
    rm goss-linux-amd64.sha256
    popd
  fi

  if chmod +x /tmp/goss-linux-amd64
  then
    mv /tmp/goss-linux-amd64 /usr/local/bin/goss
  else
    echo 'chmod +x /tmp/goss-linux-amd64 failed, exiting ...'
    exit 1
  fi

}

cnfgrGoss() {

  mkdir "${GOSSCDIR}"
  tee "${GOSSCDIR}/goss.yaml" <<EOF
kernel-param:
  kernel.ostype:
    value: Linux
mount:
  /:
    exists: true
    filesystem: overlay
    usage:
      lt: 90
port:
  tcp:22:
    listening: true
    ip:
    - 0.0.0.0
  tcp6:22:
    listening: true
    ip:
    - '::'
user:
  sshd:
    exists: true
package:
  docker-ce:
    installed: true
service:
  sshd:
    enabled: true
    running: true
  docker:
    enabled: true
    running: true
process:
  sshd:
    running: true
  containerd:
    running: true
dns:
  localhost:
    resolvable: true
    addrs:
      consist-of: ["127.0.0.1","::1"]
    timeout: 500 # in milliseconds
EOF

}

main() {

  preReq
  instlGoss
  cnfgrGoss

}

main 2>&1
Now you have a base image with the goss binary and its test configuration baked in. The Footloose configuration shown in Listing 3 brings up your test container machine with the command:
footloose create
Listing 3
footloose.yaml
cluster:
  name: cluster
  privateKey: cluster-key
machines:
- count: 1
  spec:
    backend: docker
    image: ubuntu2204basewgoss
    name: node%d
    privileged: true
    portMappings:
    - containerPort: 22
The command should run in the directory where the Footloose config file is located.
To get into the created test node and initiate a goss acceptance test run, use the commands:
footloose ssh root@node0
goss -g /etc/goss/goss.yaml v -f tap
Congratulations, you have run your first Acceptance as Code for your test server, and you should see some red or green output (Figure 1).
The goss configuration is built around a set of resources and their properties, creating a set of test criteria. The goss run checks the current state of those resources and declares them pass (green) or fail (red). The default test configuration file is goss.yaml, but you can change the file name with the -g global option.
Goss requires an action argument, which is v (validate) in this case. Typing only goss on the command line dumps all the possible actions, as well as various global options. Every goss action takes a set of options that is dumped when you append the -h (--help) option. The -f (--format) option to validate dumps the test report in various formats, including documentation, json, junit, nagios, and others. A silent format doesn't dump anything but does indicate overall pass or fail for tests. You could use environment variables to set various options, as dumped in various help screens.
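Because the tap format is line oriented, a wrapper script could summarize a report with standard text tools. The following sketch counts failed tests in TAP text; the count_tap_failures function is a hypothetical helper, and the sample input is illustrative rather than captured from a real goss run.

```shell
#!/bin/sh
# count_tap_failures: read TAP text on stdin and print the number of
# failed tests (lines that start with "not ok").
count_tap_failures() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c '^not ok' || true
}

# Illustrative TAP input, not captured from a real goss run.
sample_tap='1..3
ok 1 - first test
not ok 2 - second test
ok 3 - third test'

failures=$(printf '%s\n' "${sample_tap}" | count_tap_failures)
echo "failed tests: ${failures}"

# On a machine with goss installed, the same function could be fed
# the real report:
#   goss -g /etc/goss/goss.yaml v -f tap | count_tap_failures
```

For anything beyond a quick count, the json or junit formats are probably a better fit for machine consumption.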
To summarize, I want to test for a GNU/Linux-based test server with at least working SSH login functionality, with a root filesystem of a particular type that is less than 90 percent full, and with Docker Engine components properly set up and running. The goss test run compares these requirements with reality and reports the result, so you catch a misconfiguration early and save yourself a lot of headaches later. If you run the command chain:
wget "https://get.docker.com" -O get-docker.sh && sh get-docker.sh && systemctl enable --now docker
to install and set up Docker Engine, then the goss run will show all green (you could turn off the colors with the --no-color validation option) to indicate that everything is in place on the test server (Figure 2).
Goss provides additional resources to build tests covering almost every aspect of a modern running machine. You can browse through goss resources [3] [4] to discover all the tests it currently provides. Also, when you're done with the test server, don't forget to clean it up with the command:
footloose delete
You don't always need to create the goss test configuration from scratch. You can use the add command to append a test for a resource. If a named resource is absent, the added test ensures that it does not exist on the system. For that reason, it's recommended that you use the add command on a fully configured system matching your desired end state. An autoadd command automatically adds many, but not all, existing resources in your server by matching the provided argument. You should delve into the goss documentation to try your hand at these commands.
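As a rough illustration, a small wrapper could seed a baseline file by calling add once per resource. In this sketch, GOSS_BIN, BASELINE_FILE, DRY_RUN, and add_resource are hypothetical names of the wrapper, not goss settings; with DRY_RUN=1 (the default here) the commands are only printed, so you can review them first.

```shell
#!/bin/sh
# Sketch: seed a test file with "goss add" for a list of resources.
# GOSS_BIN, BASELINE_FILE, and DRY_RUN are hypothetical variables of
# this wrapper, not goss settings.
GOSS_BIN=${GOSS_BIN:-goss}
BASELINE_FILE=${BASELINE_FILE:-/tmp/baseline.yaml}
DRY_RUN=${DRY_RUN:-1}

# add_resource TYPE NAME: print (dry run) or execute
# "goss -g FILE add TYPE NAME".
add_resource() {
  if [ "${DRY_RUN}" = "1" ]; then
    echo "${GOSS_BIN} -g ${BASELINE_FILE} add $1 $2"
  else
    "${GOSS_BIN}" -g "${BASELINE_FILE}" add "$1" "$2"
  fi
}

add_resource user sshd
add_resource service sshd
add_resource package docker-ce
```

Setting DRY_RUN=0 on a fully configured machine would run the commands for real and record the state goss finds there.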
More Power to goss
So far I have used goss at the command line, which is good for basic uses; however, you can also serve the test reports continuously on an HTTP endpoint by running goss as a system service, which turns the machine into a self-acceptance-testing server.
To set up this scenario, I'll create another base image to test more goodies provided by goss in a directory where the Dockerfile shown in Listing 1 and the setup script shown in Listing 4 are located:
docker build -f Dockerfile_ServerBase . -t ubuntu2204wgossrvc
Listing 4
setup_goss.sh Diff Additions
5a6
> GOSSVFLE='/lib/systemd/system/goss.service'
9,10c10,12
< sha256sum
< tee
---
> sha256sum
> systemctl
> tee
81a84,91
>   tcp:58080:
>     listening: true
>     ip:
>     - 0.0.0.0
>   tcp6:58080:
>     listening: true
>     ip:
>     - '::'
97a108,110
>   goss:
>     enabled: true
>     running: true
103a117,118
>   goss:
>     running: true
114a130,156
> setupGosSrvc() {
>
>   tee "${GOSSVFLE}" <<'EOF'
> [Unit]
> Description=GOSS - Quick and Easy server validation
> After=network.target
> Documentation=https://github.com/aelsabbahy/goss/blob/master/docs/manual.md
>
> [Service]
> ExecStart=/usr/local/bin/goss -g /etc/goss/goss.yaml s -l :58080 -f documentation -e /status
> ExecStop=/bin/kill -s QUIT ${MAINPID}
> Restart=on-abnormal
> StandardOutput=journal
> StandardError=journal
>
> [Install]
> WantedBy=multi-user.target
> EOF
>
>   if ! systemctl enable goss
>   then
>     echo ' systemctl enable goss failed, exiting ...'
>     exit 1
>   fi
>
> }
>
119a162
>   setupGosSrvc
Note that Listing 4 shows only the additions made to Listing 2 that were revealed by a diff.
Now bring up your new test node by changing footloose.yaml, as shown in Figure 3, and executing the command footloose create. You should now see the test node port 58080 exposed on a local port when you run the footloose show command. If you hit that HTTP port with the command
curl localhost:<localhost exposed 58080 port>/status
you'll see a dump of the goss test report. Now that you have a working self-acceptance test running on the machine, you could further integrate the goss HTTP endpoint with your monitoring and alerting system to roll out a configuration drift detection system.
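One way to wire the endpoint into a monitoring system is a small probe that classifies the HTTP status code. The sketch below assumes the endpoint answers 200 when all tests pass and a non-200 code (such as 503) otherwise; verify that behavior on your goss version before relying on it. The probe and status_to_exit functions are hypothetical helpers that follow the common Nagios-style exit code convention.

```shell
#!/bin/sh
# Sketch of a monitoring probe for a goss HTTP endpoint.
# Assumption: the endpoint answers 200 when all tests pass and a
# non-200 code (such as 503) otherwise; verify this on your goss version.

# status_to_exit HTTP_CODE: print a Nagios-style state and return its
# conventional exit code (0 OK, 2 CRITICAL, 3 UNKNOWN).
status_to_exit() {
  case "$1" in
    200)    echo "OK"; return 0 ;;
    000|"") echo "UNKNOWN"; return 3 ;;   # no HTTP response at all
    *)      echo "CRITICAL"; return 2 ;;
  esac
}

# probe URL: fetch only the HTTP status code with curl and classify it.
probe() {
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1" 2>/dev/null)
  status_to_exit "${code}"
}

# Example call; replace 58080 with the local port reported by
# "footloose show":
#   probe "http://localhost:58080/status"
```

Keeping the classification separate from the curl call makes the mapping easy to adjust if your monitoring system uses different states.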
Until now, I have hard-coded properties and put the entire test configuration in a single file. This approach could work for a small fleet of servers, but you need more dynamic behavior and modularity in configuration to work with server automation for a medium to large fleet of servers. Goss provides a gossfile resource to render the final test configuration by aggregating multiple separate configurations. For example, you could create separate test configuration files for each resource in your overall test suite and manage a large number of tests easily.
I use this functionality to derive the final test configuration, at runtime, from the tests common to each server along with server-specific tests. Listing 5 creates the necessary resource-specific test files.
Listing 5
Aggregation Test Files
tee /tmp/kernel-param.yaml <<EOF
kernel-param:
  kernel.ostype:
    value: Linux
EOF

tee /tmp/mount.yaml <<EOF
mount:
  /:
    exists: true
    filesystem: overlay
    usage:
      lt: 90
EOF

tee /tmp/port.yaml <<EOF
port:
  tcp:22:
    listening: true
    ip:
    - 0.0.0.0
  tcp6:22:
    listening: true
    ip:
    - '::'
EOF

tee /tmp/user.yaml <<EOF
user:
  sshd:
    exists: true
EOF

tee /tmp/package.yaml <<EOF
package:
  docker-ce:
    installed: true
EOF

tee /tmp/service.yaml <<EOF
service:
  sshd:
    enabled: true
    running: true
  docker:
    enabled: true
    running: true
EOF

tee /tmp/process.yaml <<EOF
process:
  sshd:
    running: true
  containerd:
    running: true
EOF

tee /tmp/dns.yaml <<EOF
dns:
  localhost:
    resolvable: true
    addrs:
      consist-of: ["127.0.0.1","::1"]
    timeout: 500 # in milliseconds
EOF

tee /tmp/goss.yaml <<EOF
gossfile:
  /tmp/kernel-param.yaml: {}
  /tmp/mount.yaml: {}
  /tmp/port.yaml: {}
  /tmp/user.yaml: {}
  /tmp/package.yaml: {}
  /tmp/service.yaml: {}
  /tmp/process.yaml: {}
  /tmp/dns.yaml: {}
EOF
To render a valid goss test configuration, type the command
goss -g /tmp/goss.yaml r
in your terminal, and you should see a configuration dumped to your screen similar to that used in the earlier section. Now you can validate your server with the command:
goss -g /tmp/goss.yaml v -f tap
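When the number of per-resource files grows, the aggregate file could be generated instead of written by hand. The following sketch prints a gossfile document for every *.yaml file in a directory; make_gossfile is a hypothetical helper, not part of goss, and the demo uses empty placeholder files in a scratch directory.

```shell
#!/bin/sh
# Sketch: build an aggregate gossfile from every *.yaml file in a
# directory. make_gossfile is a hypothetical helper, not part of goss.

# make_gossfile DIR: print a "gossfile:" document that lists each
# *.yaml file in DIR, in the order of the shell's glob expansion.
make_gossfile() {
  echo "gossfile:"
  for f in "$1"/*.yaml; do
    [ -e "$f" ] || continue   # skip the literal pattern if nothing matches
    echo "  ${f}: {}"
  done
}

# Demo with empty placeholder files in a scratch directory.
demo_dir=$(mktemp -d)
touch "${demo_dir}/mount.yaml" "${demo_dir}/port.yaml"
make_gossfile "${demo_dir}"
rm -rf "${demo_dir}"
```

Redirect the output to a file outside the scanned directory, so that the aggregate file does not end up listing itself on the next run.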
Goss provides templating functionality to generate tests dynamically and interpolate values for various properties. Value interpolation could use environment variables or values provided in YAML or JSON files with the --vars option. The test snippet in Listing 6 is a quick example demonstrating the loop functionality to ensure required interconnectivity in a three-node Kafka cluster.
Listing 6
Local Looping Template
addr:
{{- range mkSlice "kafka0" "kafka1" "kafka2"}}
  tcp://{{.}}:2181:
    reachable: true
    timeout: 500
  tcp://{{.}}:9092:
    reachable: true
    timeout: 500
{{end}}
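The same loop could take its host list from a vars file instead of a hard-coded slice. The sketch below writes a vars file and a template that reads it through .Vars, then renders the result only if goss is installed. The file names and the kafka_hosts key are illustrative assumptions, so check the template section of the goss manual for the exact accessors your version supports.

```shell
#!/bin/sh
# Sketch: drive the loop from a vars file instead of a hard-coded list.
# The file names and the kafka_hosts key are illustrative assumptions.
work_dir=$(mktemp -d)

cat > "${work_dir}/vars.yaml" <<'EOF'
kafka_hosts:
  - kafka0
  - kafka1
  - kafka2
EOF

cat > "${work_dir}/kafka.yaml" <<'EOF'
addr:
{{- range .Vars.kafka_hosts}}
  tcp://{{.}}:9092:
    reachable: true
    timeout: 500
{{- end}}
EOF

# Render only when goss is available on this machine.
if command -v goss >/dev/null 2>&1; then
  goss --vars "${work_dir}/vars.yaml" -g "${work_dir}/kafka.yaml" render
else
  echo "goss not found; template written to ${work_dir}/kafka.yaml"
fi
```

Keeping the host names in a vars file lets the same template serve several clusters, with only the vars file changing per environment.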