Harden services with systemd
A Hard Nut to Crack
One of the most important goals in the development of systemd is securing Linux. Of course, you can only improve what can be measured, which is why Galileo Galilei advised: "Measure what is measurable, and make measurable what is not." Following this maxim, systemd now makes system security under Linux measurable and improvable.
More specifically, it is the systemd-analyze security command that allows this measurement. When executed, it returns a table like that shown in Figure 1, listing each service managed by systemd (UNIT); a numerical value for the degree of protection (EXPOSURE, where 10 is both the highest and worst value); a verbal translation of this value (PREDICATE); and another version of the rating (HAPPY) in the form of an emoji.
Additionally, systemd-analyze can reveal how it arrives at its assessment: To see this, call systemd-analyze security with the name of a service unit. As shown in Figure 2, it lists all the factors that have been checked, along with a checkmark for passed or an X for failed.
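If you want to process the overview table in a script, for example to list the most exposed services first, a minimal Python sketch could look like the following. The sample text, the function names, and the assumption that each row consists of whitespace-separated columns are illustrative; real output may differ in layout between systemd versions.

```python
import re

# Illustrative sample only: the unit name and score echo the article's example;
# the exact column layout of real output is an assumption.
SAMPLE = """\
UNIT                 EXPOSURE PREDICATE HAPPY
helloworld.service        9.6 UNSAFE    😨
"""

# A data row: unit name, numeric exposure, uppercase predicate (emoji ignored).
ROW = re.compile(
    r"^(?P<unit>\S+)\s+(?P<exposure>\d+(?:\.\d+)?)\s+(?P<predicate>[A-Z]+)\b"
)


def parse_security_table(text):
    """Return one dict per unit row; the header and other lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append({
                "unit": match["unit"],
                "exposure": float(match["exposure"]),
                "predicate": match["predicate"],
            })
    return rows


def worst_first(rows):
    """Sort rows so that the most exposed services come first."""
    return sorted(rows, key=lambda row: row["exposure"], reverse=True)


if __name__ == "__main__":
    for row in worst_first(parse_security_table(SAMPLE)):
        print(f'{row["unit"]}: {row["exposure"]} ({row["predicate"]})')
```

The header line is skipped automatically because its second column is not a number.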
Not a Tough Cookie
After that, the user knows systemd's opinion on the security status of the services it checked, but what can be done to improve the bad scores? To find out, you can build a minimal service, whose security you then elevate step by step. As an example, first create a minimal HTML page in an empty directory (e.g., /home/$USER/Python/sectest/, which will serve later as the document root of a small web server) (Listing 1).
Listing 1
Minimal HTML Page
<!doctype html>
<html lang=en>
<head>
<meta charset=utf-8>
<title>Hello World</title>
</head>
<body>
<p><h1>HELLO WORLD!</h1></p>
</body>
</html>
The easiest approach is to borrow the web server itself from Python, which already includes a simple module, http.server, that can be used with virtually no configuration. Next, wrap the server start in a systemd unit file, again keeping it as simple as possible (Listing 2).
Listing 2
Unit File
[Unit]
Description=Simple HTTP Server
Documentation=https://docs.python.org/3/library/http.server.html
[Service]
Type=simple
WorkingDirectory=/home/jcb/Python/sectest
ExecStart=/usr/bin/python3 -m http.server 8080
ExecStop=/bin/kill -9 $MAINPID
[Install]
WantedBy=multi-user.target
Now save the unit file as /lib/systemd/system/helloworld.service and the HTML page as index.html in the document root directory. After typing
systemctl start helloworld.service
enter localhost:8080 in the address bar of a web browser to bring up the plain Hello World page.
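To see roughly what the python3 -m http.server call in the unit file does, the following sketch sets up a comparable static file server with the standard library. The helper name make_server is hypothetical, and the sketch deliberately binds to 127.0.0.1, whereas the command-line module listens on all interfaces by default. The self-check at the end uses a throwaway directory and a free port rather than the article's document root.

```python
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading


def make_server(doc_root, port=8080, host="127.0.0.1"):
    """Create (but do not start) a static file server rooted at doc_root."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(doc_root)
    )
    return http.server.ThreadingHTTPServer((host, port), handler)


if __name__ == "__main__":
    # Self-check in a temporary directory; port 0 lets the OS pick a free port.
    with tempfile.TemporaryDirectory() as tmp:
        page = pathlib.Path(tmp, "index.html")
        page.write_text("<h1>HELLO WORLD!</h1>", encoding="utf-8")
        server = make_server(tmp, port=0)
        threading.Thread(target=server.serve_forever, daemon=True).start()
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5
        )
        conn.request("GET", "/")  # "/" resolves to index.html if it exists
        response = conn.getresponse()
        print(response.status, response.read().decode("utf-8"))
        conn.close()
        server.shutdown()
        server.server_close()
```

For the service itself, the one-line module call from Listing 2 remains the simpler choice; the sketch is only meant to show what happens behind it.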
In this state, without any precautions, the service is completely unprotected. In the output of systemd-analyze security, it appears with a high score of 9.6 as UNSAFE and a shocked emoji (Figure 3).
Fundamentals
In the first step, add a line reading
NoNewPrivileges=true
to the [Service] section of the unit file to prevent the process from escalating its privileges later (e.g., with setuid or setgid bits). After this (as for all subsequent additions to the unit file), you need to reload all unit files and restart the service:
systemctl daemon-reload
systemctl restart helloworld.service
If you now look at the output of systemd-analyze security, the exposure value of helloworld.service has already dropped slightly, from 9.6 to 9.4. Admittedly, this still counts as unsafe.
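Because the same measurement is repeated after every hardening step, it can help to extract the score automatically. The sketch below assumes that the per-unit report contains a summary line of the form "Overall exposure level for UNIT: SCORE PREDICATE"; this wording, like the function names, is an assumption and should be checked against the output of your systemd version.

```python
import re
import subprocess

# Assumed format of the summary line in the per-unit report.
OVERALL = re.compile(
    r"Overall exposure level for (?P<unit>\S+): "
    r"(?P<score>\d+(?:\.\d+)?) (?P<predicate>[A-Z]+)"
)


def parse_overall(report):
    """Extract (score, predicate) from the text of a per-unit report."""
    match = OVERALL.search(report)
    if match is None:
        raise ValueError("no overall exposure line found")
    return float(match["score"]), match["predicate"]


def measure(unit):
    """Run systemd-analyze security for one unit and parse its summary."""
    result = subprocess.run(
        ["systemd-analyze", "security", "--no-pager", unit],
        capture_output=True, text=True, check=False,
    )
    return parse_overall(result.stdout)


if __name__ == "__main__":
    # Illustrative line using the value reported in the text above.
    sample = "Overall exposure level for helloworld.service: 9.4 UNSAFE"
    print(parse_overall(sample))
```

Calling measure("helloworld.service") before and after a change then gives you two values to compare, provided systemd-analyze is available on the system.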
On with the task: A whole class of attacks can be rendered impossible by adding
PrivateTmp=yes
to the unit file, which causes systemd to create a new, exclusive filesystem namespace for the process and to mount private instances of /tmp and /var/tmp there. Therefore, the temporary files are no longer shared with other processes and are removed when the service stops. Attacks based on swapping or manipulating temporary files now come to nothing. The exposure value drops to 9.0, but the rating remains unsafe.
The next step is to add
RestrictNamespaces=uts ipc pid user cgroup
to the unit file, which restricts the process's access to Linux namespaces. Note that systemd reads a plain list as an allow list: The process may still create or join namespaces of the listed types, whereas all unlisted types (such as net and mnt) are prohibited. Prefixing the list with a tilde (~) inverts this behavior and blocks exactly the listed types instead. After this action, the exposure value drops below 9 (to 8.8) for the first time, and the rating is no longer unsafe, only EXPOSED. The emoji's expression changes from horrified to merely unhappy.
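The list syntax of RestrictNamespaces= is easy to misread, so the following sketch models how the systemd.exec documentation describes it: A Boolean switches the restriction on or off for all types, a plain list is an allow list, and a leading tilde turns the list into a deny list. The function name and the set of known types are assumptions for illustration; newer systemd versions may know additional types.

```python
# Namespace types assumed for this sketch; newer systemd versions may add more.
KNOWN_TYPES = frozenset({"cgroup", "ipc", "net", "mnt", "pid", "user", "uts"})


def prohibited_namespaces(value, known=KNOWN_TYPES):
    """Return the namespace types a RestrictNamespaces= value prohibits."""
    value = value.strip()
    if value.lower() in {"", "no", "false", "off", "0"}:
        return set()                      # no restriction at all
    if value.lower() in {"yes", "true", "on", "1"}:
        return set(known)                 # every type is prohibited
    inverted = value.startswith("~")      # "~" turns the list into a deny list
    listed = set(value.lstrip("~").split())
    unknown = listed - known
    if unknown:
        raise ValueError(f"unknown namespace types: {sorted(unknown)}")
    return listed if inverted else set(known) - listed


if __name__ == "__main__":
    for setting in ("uts ipc pid user cgroup", "~uts ipc pid user cgroup"):
        print(setting, "->", sorted(prohibited_namespaces(setting)))
```

With the value used in the unit file, the model reports net and mnt as prohibited; with the tilde variant, it reports the five listed types.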
Kernel and Control Groups
The next step is to enable additional protections in the unit file:
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectControlGroups=yes
The kernel variables, which users can access via /proc/sys/, /sys/, /proc/sysrq-trigger, /proc/latency_stats, /proc/acpi, /proc/timer_stats, /proc/fs, and /proc/irq, are now read-only and therefore no longer editable for the process. In any case, the system should only have write access to these variables during booting, so you are not losing any functionality.
Because the web server does not need any special kernel modules, ProtectKernelModules=yes stops the web server process from loading and unloading kernel modules. ProtectControlGroups=yes makes the control group hierarchy read-only for the process, so it can no longer modify control groups. Although container administration software might need this access, a web server does not. This step pushes the exposure value down to 8.1.
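If you want to verify the effect rather than rely on the score alone, a small probe can report which of the kernel interfaces named above are writable for the current process. The sketch below is only meaningful when it runs under the same restrictions as the service (for instance, as a temporary ExecStartPre= command, which is an assumption about your setup); the function name and path tuple are illustrative.

```python
import os

# Paths named in the text above; not every entry exists on every kernel.
KERNEL_PATHS = (
    "/proc/sys",
    "/sys",
    "/proc/sysrq-trigger",
    "/proc/latency_stats",
    "/proc/acpi",
    "/proc/timer_stats",
    "/proc/fs",
    "/proc/irq",
    "/sys/fs/cgroup",
)


def probe_writable(paths=KERNEL_PATHS):
    """Map each path to True if the current process may write to it.

    os.access() returns False for missing paths and for paths on
    read-only mounts, so False means "not writable or not present".
    """
    return {path: os.access(path, os.W_OK) for path in paths}


if __name__ == "__main__":
    for path, writable in probe_writable().items():
        print(f"{path}: {'writable' if writable else 'not writable'}")
```

Inside the hardened service, every line should read "not writable"; run as root outside the sandbox, several entries will typically be writable.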
Finally, you can set:
ProtectSystem=strict
PrivateUsers=yes
The first line mounts the entire filesystem hierarchy in read-only mode for all processes that this unit starts, with the exception of /dev, /proc, and /sys; the weaker setting ProtectSystem=yes would cover only /usr and the bootloader directories /boot and /efi. The second line, which takes a Boolean value, configures a minimal user and group mapping for the process that maps root and the user that starts the unit's main process to themselves but maps all other users or groups to nobody. The system's user and group database is thus decoupled from the process running in its own sandbox. The exposure value now drops below 8 (more precisely, to 7.8).
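To keep track of everything added so far, the following sketch collects the directives from all steps and renders them as a single unit file. The render_unit helper is hypothetical, and PrivateUsers=yes uses the Boolean form, which is an assumption based on the systemd.exec documentation; all other values are taken from the steps above.

```python
def render_unit(sections):
    """Render {section: {key: value}} as systemd unit file text."""
    lines = []
    for name, options in sections.items():
        lines.append(f"[{name}]")
        lines.extend(f"{key}={value}" for key, value in options.items())
        lines.append("")  # blank line between sections
    return "\n".join(lines)


HARDENED = {
    "Unit": {
        "Description": "Simple HTTP Server",
        "Documentation": "https://docs.python.org/3/library/http.server.html",
    },
    "Service": {
        "Type": "simple",
        "WorkingDirectory": "/home/jcb/Python/sectest",
        "ExecStart": "/usr/bin/python3 -m http.server 8080",
        "ExecStop": "/bin/kill -9 $MAINPID",
        # Hardening directives in the order they were introduced.
        "NoNewPrivileges": "true",
        "PrivateTmp": "yes",
        "RestrictNamespaces": "uts ipc pid user cgroup",
        "ProtectKernelTunables": "yes",
        "ProtectKernelModules": "yes",
        "ProtectControlGroups": "yes",
        "ProtectSystem": "strict",
        "PrivateUsers": "yes",  # assumption: Boolean form
    },
    "Install": {
        "WantedBy": "multi-user.target",
    },
}

if __name__ == "__main__":
    print(render_unit(HARDENED))
```

Keeping the directives in a dictionary makes it easy to remove one entry, regenerate the file, and measure how much that single setting contributes to the exposure value.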