Lead Image © zlajo, 123RF.com

Central logging for Kubernetes users

Shape Shifter

Article from ADMIN 55/2020
By
Grafana's Loki is a good replacement candidate for the Elasticsearch, Logstash, and Kibana combination in Kubernetes environments.

In conventional setups of the past, admins had to troubleshoot fewer nodes, technologies, and protocols than is the case today in the cloud, with its many layers for software-defined networking, software-defined storage, and solutions like OpenStack. In the worst case, network nodes also need to be checked separately. If you are searching for errors in this kind of environment, you cannot piece the required logfiles together manually.

The Elasticsearch, Logstash, and Kibana (ELK) team has demonstrated its ability to collect logs continuously from affected systems, store them centrally, index the results, and thus make them searchable. However, ELK and its variations prove to be complex beasts. Getting ELK up and running is no mean achievement, and once it is finally running, operations and maintenance are just as demanding. A full-grown ELK cluster can also consume massive resources.

Unfortunately, you don't have a lot of alternatives. In the case of the popular competitor Splunk, a mere glance at the price list is bad for your blood pressure. However, the Grafana developers are sending Loki [1] into battle as a lean solution for central logging, aimed primarily at Kubernetes users who are already using Prometheus [2].

Loki claims to avoid much of the overhead that is a fixed part of ELK. In terms of functionality, the product can't keep up with ELK, but most admins don't need many features that bloat ELK in the first place. Unfortunately, ELK does not allow you to sacrifice part of the feature set for reduced complexity. Loki from Grafana opens up this door. In this article, I go into detail about Loki and describe which functions are available and which are missing.

The Roots of Loki: Prometheus and Cortex

If you follow Loki back to its roots, you will come across some interesting details: Loki is not a completely new development; the Grafana developers oriented their work on Prometheus – but not directly. Loki was inspired by a Prometheus fork named Cortex [3], which extends the original Prometheus, adding the horizontal scalability admins often missed.

Prometheus itself has no scale-out story. Instead, the developers recommend running many instances in parallel and sharding the systems to be monitored. Sending the incoming metric data to several Prometheus instances is intended to provide redundancy in such a setup, but this construct forces you to tie different Prometheus instances to a single instance of the graphics drawing tool Grafana, often with unsatisfactory results.
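The sharding idea can be illustrated with a small sketch. It only shows how monitored targets might be split deterministically across several instances; the function names, the target addresses, and the use of SHA-256 are assumptions made for illustration and do not reflect how Prometheus itself is configured.

```python
import hashlib

def assign_shard(target: str, shard_count: int) -> int:
    """Map a target to a shard index using a stable hash (illustrative only)."""
    digest = hashlib.sha256(target.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

def shard_targets(targets, shard_count):
    """Group targets by the instance (shard) that would monitor them."""
    shards = {index: [] for index in range(shard_count)}
    for target in targets:
        shards[assign_shard(target, shard_count)].append(target)
    return shards

# Hypothetical target list, split across three instances
targets = ["node01:9100", "node02:9100", "node03:9100", "node04:9100"]
shards = shard_targets(targets, 3)
```

Each target always lands on the same instance, but a query that spans all targets still has to be assembled from every shard, which is the limitation described above.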

Cortex removes this Prometheus design limitation but has not yet achieved the widespread distribution level and popularity of its ancestor. Clearly, it was well enough known to the Grafana developers, because in their search for a suitable tool for their project they used Cortex as a starting point, which also explains the slogan the Loki developers use to advertise their product: Loki is "like Prometheus, but for logs."

Log Entries as Metric Data

Both Prometheus and its derivative Cortex are tools for monitoring, alerting, and trending (MAT). However, they cannot be compared with the well-known monitoring tools such as Icinga 2 or Nagios, which primarily focus on event-based monitoring. MAT systems, on the other hand, are designed to collect as many performance metrics as possible from the computers to be monitored.

From this data, the applied load can be read off and the future load can be estimated; monitoring is more or less a by-product. If you know how many instances of the httpd process are running on a system, you can use a suitable component to raise an alert as soon as that value drops below a certain threshold. Loki's radical approach consists of treating the log data of the target systems exactly as if they were regular metric data.
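The httpd example boils down to comparing a collected value against a threshold. The following is a minimal sketch of that comparison only; the function name and the message format are hypothetical, and a real setup would delegate this step to a dedicated alerting component.

```python
from typing import Optional

def check_min_processes(name: str, running: int, minimum: int) -> Optional[str]:
    """Return an alert message if fewer than `minimum` processes are running."""
    if running < minimum:
        return f"ALERT: {name} has {running} instances, expected at least {minimum}"
    return None

# Hypothetical sample: 2 httpd processes observed, 4 required
alert = check_min_processes("httpd", running=2, minimum=4)
```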

If you have already set up a complete Prometheus for an environment, you will have dealt with labels, which are useful in Prometheus to distinguish between metrics. Admins typically use labels for certain values: An http_return_codes metric could have a value label, which in turn takes values such as 200, 403, 404, and so on. Ultimately, labels help admins keep the total number of metrics reasonably manageable, limiting the overhead needed for storage and processing.
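To make the label idea concrete, the following sketch models a single metric name that fans out into one series per label combination. It is a plain in-memory illustration, not the Prometheus client library; the class LabeledCounter and its methods are invented for this example.

```python
from collections import Counter

class LabeledCounter:
    """One metric name, many series: one counter per label combination."""

    def __init__(self, name: str):
        self.name = name
        self._series = Counter()

    def inc(self, **labels: str) -> None:
        # Sort the labels so that argument order does not create new series
        self._series[tuple(sorted(labels.items()))] += 1

    def value(self, **labels: str) -> int:
        return self._series[tuple(sorted(labels.items()))]

    def series_count(self) -> int:
        return len(self._series)

codes = LabeledCounter("http_return_codes")
for status in ["200", "200", "404", "403", "200"]:
    codes.inc(value=status)
```

A single metric name is enough here; the value label separates the 200, 403, and 404 series, so no additional metric has to be defined per return code.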

Different from ELK

Loki attaches itself to these labels and uses them to index the incoming log messages, which marks the biggest architectural difference from ELK. For this very reason, Loki is far more efficient and lightweight: It does not meticulously evaluate incoming log messages and store them on the basis of defined rules and keywords; rather, it works on the basis of the labels attached to them.
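The difference can be sketched in a few lines: Only the label set is used as an index key, while the log lines themselves are stored unparsed and filtered at query time. This is a simplified in-memory model under that assumption, not Loki's actual storage engine; the class and method names are hypothetical.

```python
from collections import defaultdict

class LabelIndexedStore:
    """Index log lines by label set only; line content is never indexed."""

    def __init__(self):
        self._streams = defaultdict(list)

    def push(self, labels: dict, line: str) -> None:
        # The frozen label set is the only index key
        self._streams[frozenset(labels.items())].append(line)

    def query(self, selector: dict, contains: str = "") -> list:
        wanted = set(selector.items())
        matches = []
        for stream_labels, lines in self._streams.items():
            if wanted <= stream_labels:  # every selector label must match
                # Content filter is a plain scan over the selected streams
                matches.extend(line for line in lines if contains in line)
        return matches

store = LabelIndexedStore()
store.push({"app": "consul", "env": "prod"}, "agent: synced service")
store.push({"app": "consul", "env": "dev"}, "agent: error contacting server")
store.push({"app": "nginx", "env": "prod"}, "GET /index.html 200")
errors = store.query({"app": "consul"}, contains="error")
```

The index grows with the number of label combinations, not with the volume or vocabulary of the log text, which is where the resource savings over a full-text index come from.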

What sounds complicated in theory is simple and comprehensible in practice. Suppose, for example, an instance of the Cluster Consensus Manager Consul is running in a Kubernetes environment and produces log messages. If you rely on Prometheus for monitoring, you will use this tool to monitor Consul on the hosts.

One metric that Prometheus uses for Consul is consul_service_health_status. If you are running both a development instance and a production instance of the environment, you could define an Env label that can assume the value dev or prod. With Grafana linked to Prometheus, different graphs could then be drawn by label. Loki does something very similar by classifying the stored log entries by label so you can display the log entries for prod and dev separately.
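As a rough illustration of how log lines and labels travel together, the following sketch builds a request body in the JSON layout accepted by Loki's HTTP push endpoint (/loki/api/v1/push), which this article does not describe. The label names app and env (lowercase here), the sample lines, and the helper function are assumptions; nothing is sent over the network.

```python
import json
import time

def build_push_payload(labels: dict, lines: list, now_ns: int = None) -> dict:
    """Build a push body: one stream (label set) with timestamped lines."""
    if now_ns is None:
        now_ns = time.time_ns()
    # Each entry is [timestamp in nanoseconds as a string, log line]
    values = [[str(now_ns + offset), line] for offset, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}

payload = build_push_payload(
    {"app": "consul", "env": "prod"},
    ["agent: synced service", "agent: new leader elected"],
    now_ns=1_600_000_000_000_000_000,
)
body = json.dumps(payload)  # what an HTTP client would send
```

With entries labeled this way, a label selector such as {env="prod"} would return only the production stream, mirroring the per-label graphs described for Prometheus.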

Although not as convenient as the full-text search feature to which ELK users are accustomed, the Loki solution is far more frugal in terms of resources. Because Prometheus and its Cortex fork are easy to configure dynamically, Loki is far better suited for operation in containers, as well.
