Monitoring container clusters with Prometheus
Perfect Fit
Kubernetes [1] makes it much easier for admins to distribute container-based infrastructures. In principle, you no longer have to worry about where applications run or if sufficient resources are available. However, if you want to ensure the best performance, you usually cannot avoid monitoring the applications, the containers in which they run, and Kubernetes itself.
You can read how Prometheus works in a previous ADMIN article [2]; here, I shed light on the collaboration between Prometheus and Kubernetes. Thanks to its service discovery, Prometheus independently retrieves information about the container platform, the running containers, services, and applications through the Kubernetes API. You do not have to change the Prometheus configuration when pods launch or die or when new nodes appear in the cluster: Prometheus detects all of this on its own.
Uplifting
In addition to the usual information, such as CPU usage, memory usage, and hard disk performance, the metrics of containers, pods, deployments, and running applications are of interest in a Kubernetes environment. In this article, I show you how to collect and visualize information about your Kubernetes installation with Prometheus and Grafana. A demo environment gives an impression of the insights Prometheus delivers into a Kubernetes installation.
The Prometheus configuration is based on the official example [3]. To query metrics from the Kubernetes API, the excerpt in Listing 1 is sufficient. Thanks to service discovery in Prometheus, many metrics can be retrieved, as shown in Figure 1.
Listing 1
02-prometheus-configmap.yml (Extract 1)
[...]
- job_name: 'kubernetes-apiservers'
  kubernetes_sd_configs:
  - role: endpoints
  scheme: https
  tls_config:
    ca_file: /var/run/[...]/ca.crt
    insecure_skip_verify: true
  bearer_token_file: /var/run/[...]/token
[...]
Labeled
The biggest advantage of the interaction between Prometheus and Kubernetes has to be the support for labels. Labels are the primary way to select or identify specific pods, services, and other objects in Kubernetes. An important task for Prometheus, therefore, is to identify and maintain these labels. The software's service discovery stores this information temporarily in meta labels. With the use of relabeling rules, Prometheus converts the meta labels into valid Prometheus labels and discards the meta labels as soon as it has generated the monitoring targets.
A blog post [4] describes the relabeling process in detail. The rules could look something like Listing 2. In the end, Prometheus knows the labels that Kubernetes assigns to its nodes, applications, and services.
Listing 2
02-prometheus-configmap.yml (Extract 2)
[...]
- action: labelmap
  regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
  action: replace
  target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
  action: replace
  target_label: kubernetes_name
[...]
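To make the effect of the rules in Listing 2 more tangible, the following Python sketch imitates the two relabeling actions on a single discovered target. It is a simplified model for illustration, not the Prometheus implementation: it only supports the default replacement, it skips details such as deriving the instance label from the address, and the address, namespace, and service name in the sample target are made-up values.

```python
import re


def labelmap(labels, regex):
    """Copy each label whose name fully matches regex to a new label
    named after the first capture group (Prometheus anchors its regexes)."""
    pattern = re.compile(regex)
    result = dict(labels)
    for name, value in labels.items():
        match = pattern.fullmatch(name)
        if match:
            result[match.group(1)] = value
    return result


def replace(labels, source_labels, target_label, separator=";"):
    """Join the source label values and store them in target_label.
    Simplification: models only the default regex and replacement."""
    result = dict(labels)
    result[target_label] = separator.join(labels.get(n, "") for n in source_labels)
    return result


def drop_internal(labels):
    """Labels starting with a double underscore do not survive relabeling."""
    return {k: v for k, v in labels.items() if not k.startswith("__")}


def relabel(discovered):
    """Apply the three rules from Listing 2 in order."""
    labels = labelmap(discovered, r"__meta_kubernetes_service_label_(.+)")
    labels = replace(labels, ["__meta_kubernetes_namespace"], "kubernetes_namespace")
    labels = replace(labels, ["__meta_kubernetes_service_name"], "kubernetes_name")
    return drop_internal(labels)


# Hypothetical target as service discovery might hand it to relabeling.
discovered = {
    "__address__": "10.0.0.7:8080",
    "__meta_kubernetes_namespace": "default",
    "__meta_kubernetes_service_name": "myapp-svc",
    "__meta_kubernetes_service_label_app": "myapp",
    "__meta_kubernetes_service_label_mylabel": "myvalue",
}

target_labels = relabel(discovered)
print(target_labels)
```

The Kubernetes service labels app and mylabel end up as ordinary target labels, joined by kubernetes_namespace and kubernetes_name, while every meta label is gone.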
Prom Night
You can define graphs or alarms based on these labels with the powerful PromQL [5] query language. Labels defined in Kubernetes as shown in Listing 3 are more or less inherited by the resulting Prometheus metric:
my_app_metric{app="<myapp>",mylabel="<myvalue>",[...]}
Listing 3
Label Example
metadata:
  labels:
    app: <myapp>
    mylabel: <myvalue>
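As a rough illustration of how such labels feed into PromQL, the Python sketch below turns a set of Kubernetes labels into a label selector and wraps it in a URL for the query endpoint of the Prometheus HTTP API. The server address http://localhost:9090 and the label values are assumptions for this example, the helper names are hypothetical, and the aggregation assumes that my_app_metric is a gauge-like value that makes sense to sum.

```python
from urllib.parse import urlencode


def selector(metric, labels):
    """Build a PromQL selector with one equality matcher per label.
    Simplification: label values are not escaped."""
    matchers = ",".join(f'{name}="{value}"' for name, value in sorted(labels.items()))
    return f"{metric}{{{matchers}}}"


def query_url(base_url, promql):
    """Return the URL of an instant query against the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


# Labels as they might appear in the metadata of a Kubernetes object.
k8s_labels = {"app": "myapp", "mylabel": "myvalue"}

expr = f"sum by (app) ({selector('my_app_metric', k8s_labels)})"
url = query_url("http://localhost:9090", expr)

print(expr)
print(url)
```

The same expression could just as well be pasted into a Grafana panel or an alerting rule; the URL form is only one way to send it to the server.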
Each label adds another dimension to my_app_metric, and Prometheus stores every distinct combination of label values as a separate time series. The software can already cope with millions of time series, yet version 2.0 [6] should cover more extreme Kubernetes environments with thousands of nodes.
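How quickly the number of time series grows with each label can be estimated with a few lines of Python. The sketch below enumerates every combination of label values for one metric; the label names and values are invented for this example, and the result is an upper bound, because Prometheus only stores combinations that actually occur.

```python
from itertools import product

# Hypothetical label values observed for a single metric.
label_values = {
    "app": ["frontend", "backend"],
    "kubernetes_namespace": ["default", "staging", "prod"],
    "mylabel": ["a", "b"],
}


def series_for(metric, label_values):
    """Return one series identifier per combination of label values."""
    names = sorted(label_values)
    return [
        metric + "{" + ",".join(f'{n}="{v}"' for n, v in zip(names, combo)) + "}"
        for combo in product(*(label_values[n] for n in names))
    ]


series = series_for("my_app_metric", label_values)
print(len(series))  # 2 * 3 * 2 = 12
print(series[0])
```

Adding a fourth label with ten possible values would raise the upper bound from 12 to 120 series for this one metric, which is why labels with many distinct values deserve some caution.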