Deployment on Kubernetes
The SPIRE quickstart for Kubernetes [5] is easy to deploy, and its example commands demonstrate the successful delivery and content of an SVID to a simple client workload (also created in the quickstart guide). Don't be put off by the comment that it has been tested on Kubernetes versions 1.13.1, 1.12.4, and 1.10.12, which are all at least five years out of date; I tried it on Kubernetes v1.24 and had no problems, because all of the Kubernetes objects required are completely standard. Simply clone the repo [6] and deploy the YAML files in the quickstart directory.
Your Kubernetes cluster must have a default storage class, because, as you might expect, the spire-server pod stores its SQLite data store on a persistent volume. You'll see that a new namespace called spire is created. Inside that namespace, spire-server is deployed as a statefulset, and spire-agent is deployed as a daemonset, with a pod on each worker node in the cluster. The Unix socket through which pods will access the SPIRE Workload API can be seen at /run/spire/sockets/agent.sock on each Kubernetes node.
However, the quickstart is of limited use because it doesn't provide a reliable way to make the Workload API Unix socket available to your application pods. In a SPIFFE domain, a pod with no access to a Workload API socket is like Jason Bourne with no fisherman to rescue him! Therefore, I recommend bypassing the quickstart and starting with the SPIFFE Container Storage Interface (CSI) driver example [7] straightaway. The SPIFFE CSI example deploys spire-server and spire-agent just like the official quickstart does, but it additionally creates a CSI driver, csi.spiffe.io, implemented by means of another daemonset, spiffe-csi-driver.
The CSI driver connects each node's Workload API Unix socket to any pod that requires it in the form of a volume. If you're familiar with Kubernetes, it might seem more straightforward simply to mount the Unix socket into the pod as a hostPath volume; however, the security policies in many Kubernetes clusters prevent non-privileged pods from hostPath mounting. The CSI method, although more expensive in terms of cluster resources, is at least reliable.
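For example, a workload pod might request the socket from the driver like this. The pod and image names here are hypothetical sketches; only the driver name, csi.spiffe.io, comes from the example deployment:

apiVersion: v1
kind: Pod
metadata:
  name: my-workload                              # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example.com/my-app:latest    # hypothetical image
    volumeMounts:
    # The Workload API socket appears under this path inside the container.
    - name: spiffe-workload-api
      mountPath: /spiffe-workload-api
      readOnly: true
  volumes:
  # The csi.spiffe.io driver surfaces the node's agent socket as a volume,
  # so the pod needs neither a hostPath mount nor privileges.
  - name: spiffe-workload-api
    csi:
      driver: csi.spiffe.io
      readOnly: true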
To deploy the SPIRE CSI example onto a Kubernetes cluster, take these steps:
- Clone the SPIFFE CSI repo to the host you use for performing Kubernetes admin tasks:
git clone https://github.com/spiffe/spiffe-csi
- Amend spiffe-csi/example/config/spire-server.yaml to add a persistent volume and a corresponding volumeMount for the /run/spire/data mountpoint (optional, but recommended for even the most trivial production use). Unlike the quickstart example for Kubernetes, the SPIFFE CSI example does not specify a persistent volume for the SPIRE Server.
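A minimal sketch of such an amendment, assuming the server is deployed as a statefulset as in the quickstart and that the claim will be satisfied by your cluster's default storage class (names and sizes are illustrative):

apiVersion: apps/v1
kind: StatefulSet
# ...existing spire-server metadata and spec...
spec:
  template:
    spec:
      containers:
      - name: spire-server
        # ...existing container definition...
        volumeMounts:
        # Persist the SQLite data store across pod restarts.
        - name: spire-data
          mountPath: /run/spire/data
  volumeClaimTemplates:
  - metadata:
      name: spire-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi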
- Execute the deployment script:
spiffe-csi/example/deploy-spire-and-csi-driver.sh
This action applies all of the YAML files under spiffe-csi/example/config with kubectl. Check the output for any Kubernetes-related errors. At this stage, the most likely cause of any problems is that your Kubernetes context does not have sufficient permissions to create all of the required objects.
- Check the content of the spire namespace (Figure 4):
kubectl get all -n spire
At this stage, it is also interesting to review the contents of the configmaps that were created by the deployment script. To do this, run the command:
kubectl -n spire get cm spire-agent spire-server -o yaml
You can clearly identify important configuration options, such as the name of the trust domain, the subject of the built-in CA, and the TTL of the SVIDs. To change them, edit the configmaps, for example, with:
kubectl -n spire edit cm spire-server
and restart the relevant pods to make your changes take effect.
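One way to do that, assuming the object names used in this example, is a rollout restart:

kubectl -n spire rollout restart statefulset spire-server
kubectl -n spire rollout restart daemonset spire-agent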
If no spiffe-csi-driver pods are running, check the status of the spiffe-csi-driver daemonset:
kubectl -n spire describe ds spiffe-csi-driver
The spiffe-csi-driver pods will not be scheduled if the pod security policies in place on your cluster prevent the creation of privileged pods in the spire namespace. That's because they use a hostPath volume mount to connect to the Workload API's Unix socket at /run/spire/agent-sockets on each worker host, and they need to be privileged to do so.
- Check that the Workload API's Unix socket has been created on the Kubernetes workers (it won't exist on the masters):
ls /run/spire/agent-sockets/spire-agent.sock
- Check that the spire-agent pods are connected to spire-server and have successfully performed node attestation:
SPIRE_SERVER_POD=$(kubectl -n spire get po -l app=spire-server -o jsonpath="{.items[0].metadata.name}")
kubectl -n spire logs $SPIRE_SERVER_POD | grep -B1 attestation
You will see that the SPIRE Server issued an SVID to the node agent in the form spiffe://example.org/spire/agent/k8s_psat/<cluster name>/<kubernetes node uid>. If you run
kubectl get nodes -o yaml | grep uid:
you'll see that the SPIFFE IDs issued to the nodes do indeed match the nodes' Kubernetes universally unique identifiers (UUIDs). This match is made possible by SPIRE's k8s_psat node attestation plugin (Figure 5), which enables the SPIRE Server to verify the identity of the attesting node by querying the Kubernetes API. More information about node attestation with projected service account tokens (PSATs) is given in the SPIRE docs [8].
- Create registration entries for the spire-agents. Later, you'll specify the spire-agent SPIFFE ID as the parent ID for each of the workload registrations you create; this is how you control which node(s) are allowable for each workload:
kubectl -n spire exec $SPIRE_SERVER_POD -- /opt/spire/bin/spire-server entry create -spiffeID spiffe://example.org/ns/spire/sa/spire-agent -parentID spiffe://example.org/spire/server -selector k8s_psat:agent_ns:spire -selector k8s_psat:agent_sa:spire-agent -selector k8s_psat:cluster:example-cluster
This generic registration will be matched by spire-agent on each Kubernetes worker; therefore, each spire-agent pod will receive the complete bundle of SVIDs for all the workloads whose registrations specify this spire-agent registration as the parent. If workloads were tied to particular nodes, you could use pod affinity to create multiple spire-agent registrations with node-specific k8s_psat selectors (e.g., -selector k8s_psat:agent_node_name) and set those as the workloads' parent IDs, so the workloads could only attest successfully when running on the correct nodes. At this point, the SPIRE infrastructure is ready for use, and workloads can be deployed by the sequence shown in Figure 3.
Register and Deploy Workloads
The example application to be deployed secures communication between a client workload and a server workload with SVIDs generated by the infrastructure just installed. This simple example is written in Go and uses SPIRE's Go library. The server waits for an incoming connection; when it receives one, it requests its own SVID and then checks that the TLS certificate of the inbound client request contains the SPIFFE ID it's been configured to expect. Meanwhile, the client repeatedly requests its own SVID, then sends an HTTPS GET request to the server, checking that the server presents a TLS certificate matching the expected SPIFFE ID.
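What follows is a minimal sketch of the client side of this pattern, assuming the go-spiffe v2 library; the socket path matches the CSI driver mount sketched earlier, and the server address and SPIFFE ID are hypothetical placeholders, so the repository's actual code will differ in its details:

// Minimal client-side sketch, assuming go-spiffe v2; not the repo's code.
package main

import (
	"context"
	"io"
	"log"
	"net/http"
	"time"

	"github.com/spiffe/go-spiffe/v2/spiffeid"
	"github.com/spiffe/go-spiffe/v2/spiffetls/tlsconfig"
	"github.com/spiffe/go-spiffe/v2/workloadapi"
)

func main() {
	ctx := context.Background()

	// Obtain (and automatically renew) this workload's X509-SVID from the
	// Workload API socket mounted into the pod by the CSI driver.
	source, err := workloadapi.NewX509Source(ctx, workloadapi.WithClientOptions(
		workloadapi.WithAddr("unix:///spiffe-workload-api/spire-agent.sock")))
	if err != nil {
		log.Fatalf("could not create X509Source: %v", err)
	}
	defer source.Close()

	// Trust only a server presenting this SPIFFE ID (cf. SERVER_SPIFFE_ID).
	serverID := spiffeid.RequireFromString("spiffe://example.org/spire-https-server")

	client := &http.Client{Transport: &http.Transport{
		// mTLS: present our SVID, verify the server's SVID against serverID.
		TLSClientConfig: tlsconfig.MTLSClientConfig(source, source, tlsconfig.AuthorizeID(serverID)),
	}}

	for {
		resp, err := client.Get("https://spire-https-server:8443/") // hypothetical address
		if err != nil {
			log.Printf("request failed: %v", err)
		} else {
			body, _ := io.ReadAll(resp.Body)
			resp.Body.Close()
			log.Printf("received: %s", body)
		}
		time.Sleep(5 * time.Second)
	}
}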
To deploy the example, take these steps:
- Clone the repository containing the example:
git clone https://github.com/datadoc24/golang-spire-examples.git
This repository contains the Golang source code and Dockerfile in the example directory and a YAML file for Kubernetes deployment in the k8s directory.
- Apply the YAML file to your default namespace:
kubectl apply -f golang-spire-examples/k8s/spire-https.yaml
- Register the workloads with the SPIRE Server. Suitable registration commands for the two workload pods, along with the spire-agent registration used in the infrastructure example, are in golang-spire-examples/k8s/reg.sh.
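reg.sh is the definitive source; purely to illustrate the shape of such a registration, a hypothetical workload entry might look like this (the SPIFFE ID and selectors are assumptions, not copied from reg.sh, but the parent ID is the spire-agent SPIFFE ID registered earlier):

kubectl -n spire exec $SPIRE_SERVER_POD -- /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://example.org/spire-https-client \
  -parentID spiffe://example.org/ns/spire/sa/spire-agent \
  -selector k8s:ns:default \
  -selector k8s:pod-label:app:spire-https-client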
- Check that the spire-https-client and spire-https-server pods are running in your default namespace, and see on which nodes Kubernetes deployed them. Tail the logs of the spire-agent pod on one of those nodes, and see the workload attestation process (Figure 6).
- Tail the logs of the client pod with
kubectl logs -l app=spire-https-client -f
You'll see that it is sending requests to the server and that the data specified in the server pod spec's DATA_TO_SEND environment variable is received by the client. The logs also print out the .pem content of the client pod's SVID, which was shown in Figure 2.
- The acid test: You want to be sure that communication between the client and the server will break down if either side's SVID expires and is not renewed. The following tests are a good way to prove that:
* Stop the SPIRE Server. Communication should break when the SPIRE Agents' SVIDs expire.
* List all registrations and delete one of the workload registrations with:
spire-server entry show
spire-server entry delete -entryID <ID of registration to be deleted>
* Finally, edit the spire-https-client or spire-https-server deployment to change the expected SPIFFE ID of the other workload. For example, run
kubectl edit deploy spire-https-client
and change the value of the SERVER_SPIFFE_ID environment variable. Saving the edited deployment will automatically recreate the client workload using the updated value. Tailing the logs of the recreated client workload pod will show you that the mTLS connection is now failing.
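In the deployment spec, the fragment you're editing resembles the following; the replacement value is a deliberately bogus, hypothetical ID that the server will never present:

        env:
        - name: SERVER_SPIFFE_ID
          value: spiffe://example.org/not-the-real-server   # hypothetical wrong ID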
Troubleshooting
The logs of the spire-agent pods are an excellent source of debugging information and visibility into the information provided by the attestor plugins. In these, you can see whether node attestation was successful and whether the agents themselves successfully received a bundle from the SPIRE Server, which tells you if the spire-agent registrations were created correctly.
You can also see whether workloads are contacting the spire-agent through the Workload API and are receiving the correct SVID. Search the spire-agent pod logs for the message PID attested to have selectors. Absence of these messages suggests that communication on the Workload API socket is not set up correctly. When an SVID is delivered to a workload, the logs will show the SVID's SPIFFE ID. Check for these points:
- The hostPath volume for the Workload API socket must be identical between the spire-agent and spiffe-csi-driver pods. Check the volume in both daemonsets to make sure it matches (a quick way to compare them appears after this list).
- Pod processes (spire-agent and user workloads) must use a local Workload API socket path that matches the volumeMounts path, which maps the hostPath socket into the pod.
- The registrations created via the Registration API must have a sufficiently specific combination of selectors to ensure that each workload is correctly identified by the attestation process. If not, your workload might have received an SVID intended for another workload and will therefore not be trusted by other workloads that are set up to check SPIFFE IDs when establishing TLS connections.
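For the first check, one quick way to compare the two daemonsets' hostPath volumes, assuming the object names used in this article, is:

kubectl -n spire get ds spire-agent spiffe-csi-driver -o yaml | grep -B2 -A3 hostPath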