Make better use of Prometheus with Grafana, Telegraf, and Alerta
Makeover
Prometheus is designed specifically for monitoring large, scalable setups. The solution comprises several components: The Prometheus server scrapes metrics from its targets and stores them in its time series database. The Prometheus Node Exporter reads basic system values on the target systems and exposes them for scraping. Pushgateway accepts values that Prometheus cannot scrape directly from the hosts, such as metrics from short-lived jobs. Alerting is split in two: Alerting rules compare the incoming metric data with freely definable limit values, and if a value gets out of control, the Alertmanager takes over the resulting alarm and sends out the notifications.
In theory, these components can be used to build a complete monitoring, alerting, and trending (MAT) system that easily monitors large environments. In practice, this set of components lacks some elementary functions.
In this article, I look at complementary projects for Prometheus that make admin life easier, including ready-made dashboards for data visualization, various metric data exporters, and two tools that display alarms graphically and coherently.
Beautiful Is Not Enough
The measurement data in Prometheus is good, but you need a way to visualize it. For a long time, the Prometheus project developed its own dashboard; today, the official recommendation is to use Grafana, which supports Prometheus as a data source.
If you want to display measured values from Prometheus in Grafana, however, you need suitable dashboards. To make matters worse, the metrics provided by Prometheus Node Exporter are not sufficient in many setups. Node Exporter covers only general system values; services such as MySQL, RabbitMQ, or Open vSwitch need exporters of their own.
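To give an idea of what any exporter ultimately delivers, the following minimal sketch renders a gauge in the Prometheus text exposition format, which is what Prometheus reads when it scrapes a target. The metric name myapp_queue_depth and its labels are hypothetical, and the sketch does not escape special characters in label values, so treat it as an illustration rather than a replacement for a real exporter or client library.

```python
# Minimal sketch: render one gauge in the Prometheus text exposition format.
# The metric name and labels used in the demo are hypothetical examples.

def render_gauge(name, help_text, samples):
    """Return exposition-format text for one gauge.

    samples maps a tuple of (label_name, label_value) pairs to a reading.
    Label values are NOT escaped here, so keep them simple.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples.items():
        if labels:
            label_str = ",".join(f'{key}="{val}"' for key, val in labels)
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_gauge(
        "myapp_queue_depth",
        "Messages waiting in a queue (hypothetical metric).",
        {(("queue", "orders"),): 42.0, (("queue", "billing"),): 7.0},
    ), end="")
```

Served over HTTP on a path that Prometheus scrapes, text like this is all a target needs to offer; the dedicated exporters do the work of collecting the values from the service in question.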
The Alertmanager also causes trouble. Most conventional monitoring systems offer a clearly arranged web page on which the current alarms are listed. However, the Prometheus Alertmanager only has a rudimentary GUI that does not meet modern requirements.
Grafana
If you use Prometheus as a classical monitoring system, you do not even come into direct contact with a large part of the measurement data collected by the tool, because only the alerting rules evaluate the incoming values, and Alertmanager raises the alarm if necessary. However, this leaves one of Prometheus' core functions unleveraged: trending.
Trending requires that the metric data stored in Prometheus be prepared graphically in a way that is comprehensible to the admin. A table listing the current RAM usage figures for 5,000 hosts doesn't help you much. A graph that shows how RAM usage develops over time provides much more information.
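The data for such a graph comes from a range query. As a rough sketch, the following Python code builds a request URL for the Prometheus HTTP API endpoint /api/v1/query_range and converts a response into one list of points per host. The server address and the metric name node_memory_MemAvailable_bytes are assumptions; the exact metric name depends on your Node Exporter version.

```python
from urllib.parse import urlencode


def range_query_url(base_url, promql, start, end, step_seconds):
    """Build a URL for the Prometheus /api/v1/query_range endpoint.

    start and end are Unix timestamps; step_seconds is the resolution.
    """
    params = urlencode(
        {"query": promql, "start": start, "end": end, "step": step_seconds}
    )
    return base_url.rstrip("/") + "/api/v1/query_range?" + params


def series_by_instance(payload):
    """Turn a query_range response into {instance: [(timestamp, value), ...]}."""
    series = {}
    for item in payload["data"]["result"]:
        instance = item["metric"].get("instance", "unknown")
        # Prometheus returns sample values as strings, so convert them.
        series[instance] = [(float(ts), float(val)) for ts, val in item["values"]]
    return series


if __name__ == "__main__":
    # Hypothetical server address; one hour of data at 60-second resolution.
    print(range_query_url(
        "http://localhost:9090",
        "node_memory_MemAvailable_bytes",
        1700000000, 1700003600, 60,
    ))
```

Grafana issues queries of this kind for every panel, which is why you normally never have to write them by hand.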
Grafana offers exactly this possibility: The program specializes in the graphical display of values from various tools and now supports a large number of data sources. One of the great strengths of Grafana is undoubtedly its modularity: You define what you want displayed, how you want it displayed, and which graphs you want to combine. The result is a Grafana dashboard. You can define any number of dashboards and visualize any number of metrics in them.
The problem is that the Grafana dashboard list is empty out of the box. If you build a brand new Prometheus setup, you first have to construct your own dashboards laboriously. Grafana expects the dashboards in JSON format, and the names of the individual metrics are not intuitive enough for you to guess them without help.
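To show what such a JSON file boils down to, here is a minimal sketch that generates a dashboard model with a single panel. The field names follow the dashboard JSON model as I know it, but the schema varies between Grafana versions; in particular, the panel type "timeseries" is an assumption, and older releases use "graph" instead. Because the panel names no data source, Grafana should fall back to the default one.

```python
import json


def single_panel_dashboard(title, panel_title, expr, panel_type="timeseries"):
    """Return a minimal Grafana dashboard model with one panel.

    panel_type is an assumption: older Grafana releases use "graph".
    """
    return {
        "id": None,  # None (null in JSON) lets Grafana assign an ID
        "title": title,
        "time": {"from": "now-6h", "to": "now"},
        "panels": [
            {
                "id": 1,
                "type": panel_type,
                "title": panel_title,
                "gridPos": {"x": 0, "y": 0, "w": 24, "h": 8},
                "targets": [
                    # One PromQL query; the legend shows the instance label.
                    {"refId": "A", "expr": expr, "legendFormat": "{{instance}}"}
                ],
            }
        ],
    }


if __name__ == "__main__":
    dashboard = single_panel_dashboard(
        "RAM overview",
        "Available RAM",
        "node_memory_MemAvailable_bytes",  # metric name depends on exporter version
    )
    print(json.dumps(dashboard, indent=2))
```

Even this small example shows where the effort lies: Every panel needs a query, and every query needs the exact metric name.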
Although you can find tools that let admins create JSON files in a GUI, you still need the catalog of metrics, and the work is still time consuming (Figure 1).
Not Reinventing the Wheel
The good news is that others have also wanted to wed Prometheus and Grafana, so you can find several ready-made setups. Grafana offers a marketplace on its website [1], where developers can make their DIY dashboards available to the public. Admins can download these dashboards and simply import them into Grafana by hitting the Import button. If necessary, you can also adapt the templates.
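If you would rather automate the import than click through the GUI, Grafana's HTTP API accepts dashboards as well. The following sketch prepares, but does not send, a POST request to the /api/dashboards/db endpoint. The Grafana URL and the API token are placeholders you would have to supply. Dashboards exported for sharing can contain data source placeholders that the Import dialog resolves interactively; this sketch does not handle them.

```python
import json
import urllib.request


def build_import_request(grafana_url, api_token, dashboard, overwrite=False):
    """Prepare (but do not send) a POST to Grafana's /api/dashboards/db."""
    model = dict(dashboard)  # shallow copy, so the caller's dict stays untouched
    model["id"] = None  # a null ID tells Grafana to create a new dashboard
    body = json.dumps({"dashboard": model, "overwrite": overwrite}).encode("utf-8")
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/dashboards/db",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_token,
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical URL and token; a downloaded dashboard would be loaded
    # with json.load() instead of this stand-in dictionary.
    request = build_import_request(
        "http://localhost:3000", "REPLACE_ME", {"id": 17, "title": "Demo"}
    )
    print(request.get_method(), request.full_url)
```

Passing the prepared request to urllib.request.urlopen() would perform the actual upload.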
If you want to use prebuilt dashboards in Grafana from the web, you will find different filters on the left side of the selection page. These filters are important, because how the dashboard is put together depends primarily on the exporter you use to collect your metrics. For example, if you use Prometheus Node Exporter, you need a matching dashboard. If you decide on Telegraf, the Grafana dashboards for Node Exporter will not work.
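The reason is that a dashboard's queries refer to metrics by name, and each exporter uses its own names. As a crude check, the following sketch walks through a dashboard's panels and reports whether any query mentions a metric with the node_ prefix that Node Exporter uses. The sample expressions are illustrative; system_load1 is what I would expect from Telegraf, but check the metrics page of your own exporter.

```python
import re


def iter_exprs(dashboard):
    """Yield every query expression found in the dashboard's panels."""
    stack = list(dashboard.get("panels", []))
    while stack:
        panel = stack.pop()
        stack.extend(panel.get("panels", []))  # row panels can nest panels
        for target in panel.get("targets", []):
            if "expr" in target:
                yield target["expr"]


def looks_like_node_exporter(dashboard):
    """Heuristic: does any query mention a name starting with node_?"""
    return any(re.search(r"\bnode_", expr) for expr in iter_exprs(dashboard))


if __name__ == "__main__":
    node_dash = {"panels": [{"targets": [{"expr": "node_load1"}]}]}
    other_dash = {"panels": [{"targets": [{"expr": "system_load1"}]}]}
    print(looks_like_node_exporter(node_dash))
    print(looks_like_node_exporter(other_dash))
```

A check like this only tells you which exporter a dashboard was written for; adapting the queries to a different exporter remains manual work.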
Several dashboards from the Grafana collection stand out. Knut Ytterhaug's Node Exporter Server Metrics, for example, lets you compare the values for Node Exporter metrics on different servers. As an alternative, Thomas Cheronneau's Prometheus System dashboard represents the values of a single server (Figure 2).
Grafana dashboards that display data prepared by Docker are particularly popular: Docker Host & Container Overview, for example (Figure 3), shows which containers run on a host and the resources they consume.
Ready-made dashboards are also available for more pedestrian software: The Apache dashboard, for example, displays the metrics Prometheus can collect from the Apache web server. If you also use other exporters, the scope of the dashboards available for Grafana expands again. If you collect data from an OpenStack setup with Telegraf, for example, you can display it with the appropriate dashboard for Grafana.
All in all, it is clear: If you combine Grafana with the dashboards from its marketplace, you can get started far faster without losing flexibility. A detailed look at the Grafana website is therefore recommended to every prospective Prometheus admin.