Filtering

Logs deliver an abundance of raw data, but it's the processed information that holds the key to meaningful insights. This transformation of raw data into actionable information is where Fluent Bit's filter plugins shine. Acting as intermediaries between input and output plugins, filters modify, enrich, or eliminate specific records, ensuring the logs are tailored to your needs.

Consider an environment flooded with logs in which you're only interested in entries that contain the term ERROR. In this typical scenario, the grep filter becomes invaluable. By configuring it, you can instruct Fluent Bit to forward only logs that match a specific pattern:

[FILTER]
    Name   grep
    Match  *
    Regex  log ERROR

This configuration allows only logs containing ERROR to pass through, filtering out the rest.
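
The reason the rule reads log ERROR is that inputs such as tail place each raw line under a log key by default, so grep tests the ERROR pattern against that field. As a hypothetical illustration, a record like

{"log": "2023-08-14 14:45:31 ERROR database connection refused"}

passes through, whereas a record like

{"log": "2023-08-14 14:45:32 INFO user login successful"}

is discarded.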

In a Kubernetes environment, pods generate helpful logs, but without context they might be difficult to decipher. Fluent Bit's kubernetes filter steps in to associate logs with Kubernetes metadata (e.g., pod names, namespaces, and container IDs):

[FILTER]
    Name      kubernetes
    Match     kube.*
    Kube_URL  https://kubernetes.default.svc:443
    Merge_Log On
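
Once the filter has done its work, each record carries a nested kubernetes object alongside the original log line. An abridged sketch of what an enriched record might look like (the pod and namespace values here are hypothetical):

{
  "log": "ERROR database connection refused",
  "kubernetes": {
    "pod_name": "payment-api-5d7f9c6b8-x2kqv",
    "namespace_name": "production",
    "container_name": "payment-api",
    "labels": {"app": "payment-api"}
  }
}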

In some instances, a single filter might not suffice. Fluent Bit supports chaining multiple filters, allowing for a series of transformations on the data. This chaining process is sequential; the order in which filters are listed matters.

For example, suppose you're interested in logs with ERROR, but from those, you want to exclude entries mentioning timeout. In this case you can set up a sequence of grep filters (Listing 8). The first filter captures logs containing ERROR. The next filter excludes entries with timeout, ensuring the final set of logs is precisely what you want.

Listing 8

Chaining Filter Plugins

[FILTER]
    Name   grep
    Match  *
    Regex  log ERROR
[FILTER]
    Name   grep
    Match  *
    Exclude log timeout

Parsing Logs

Fluent Bit's input plugins fetch data and its filters mold it, but the ability to interpret and transform raw data formats is just as important in any logging solution. Parsers allow Fluent Bit to understand and restructure logs, converting them into structured formats suitable for further analysis.

Imagine you have a log in the format

2023-08-14 14:45:32, INFO, User login successful

Without understanding its structure, Fluent Bit sees this log as a single string. However, with a defined parser, the platform can discern the timestamp, log level, and message (Listing 9). With this parser, the log is divided into three parts: The time field captures the timestamp, level identifies the log level (INFO in this case), and message collects the log message. The Time_Key and Time_Format further instruct Fluent Bit how to interpret the log's timestamp.

Listing 9

Parser Plugin

[PARSER]
    Name        custom_log_format
    Format      regex
    Regex       ^(?<time>[^,]+),\s(?<level>[^,]+),\s(?<message>.+)$
    Time_Key    time
    Time_Format %Y-%m-%d %H:%M:%S
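
Applied to the sample line above, this parser would turn the flat string into a structured record along the lines of

{"level": "INFO", "message": "User login successful"}

with the record's timestamp set to 2023-08-14 14:45:32, because Time_Key tells Fluent Bit to consume the time field as the event time.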

Fluent Bit comes packed with a range of predefined parsers for common log formats, such as JSON, Apache, and syslog. However, the ability to define custom parsers, as in the example above, ensures that Fluent Bit remains adaptable to unique logging scenarios.

To employ a specific parser, you can easily associate it with an input plugin:

[INPUT]
    Name        tail
    Path        /var/log/custom_app.log
    Parser      custom_log_format

In this configuration, the logs from custom_app.log are interpreted by the custom_log_format parser defined earlier.
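
One detail worth noting: [PARSER] sections live in a dedicated parsers file rather than in the main configuration, and that file is referenced from the [SERVICE] section. A minimal sketch, assuming the parser above was saved as /etc/fluent-bit/custom_parsers.conf:

[SERVICE]
    Parsers_File  /etc/fluent-bit/custom_parsers.conf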

Certain applications, especially those generating stack traces or multiline logs, can pose parsing challenges. Fluent Bit's multiline parser functionality provides a solution. For instance, if an application generates logs with Java stack traces, a multiline parser can consolidate these multiple lines into a single log entry for clearer analysis.
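
Fluent Bit ships with built-in multiline parsers for common cases (java, python, and go among them), and the tail input can apply them directly. A minimal sketch for the Java stack trace scenario, assuming the application writes to /var/log/java_app.log:

[INPUT]
    Name              tail
    Path              /var/log/java_app.log
    multiline.parser  java

With this in place, a stack trace's continuation lines are folded into the record of the line that started them.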

A Few Tips

The Fluent Bit architecture is inherently lightweight and designed for minimal resource usage. Yet specific scenarios, like sudden spikes in log volume, can strain the system. To handle this, Fluent Bit offers buffering options. By default, the in-memory buffering mechanism manages data, ensuring fast processing. However, when dealing with potential data overflow scenarios, you can enable on-disk buffering as a fallback:

[SERVICE]
    storage.path               /var/fluentbit/storage/
    storage.backlog.mem_limit  50MB

In this way, data that cannot be held in memory can spill over to the /var/fluentbit/storage/ directory instead of being lost, and storage.backlog.mem_limit caps at 50MB the memory used to load backlogged chunks from disk, ensuring smoother operation during high-volume periods. Note that filesystem buffering is enabled per input, so each input that should use it needs storage.type set, as in the sketch below.
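
A minimal sketch of an input opting in to filesystem buffering, reusing the tail input from earlier:

[INPUT]
    Name          tail
    Path          /var/log/custom_app.log
    storage.type  filesystem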

Fluent Bit provides an internal monitoring interface that exposes metrics about its operations. By enabling the service's HTTP monitoring endpoint, users can access these metrics:

[SERVICE]
    http_server  On
    http_listen  0.0.0.0
    http_port    2020

With the server enabled, the HTTP endpoints under http://<your-server-ip>:2020 offer a range of metrics, from input plugin data rates to buffer utilization. This information is invaluable for identifying bottlenecks, diagnosing issues, or simply understanding Fluent Bit's operational state. Moreover, you can configure Prometheus to scrape metrics from this endpoint by adding Fluent Bit as a target in the Prometheus configuration:

scrape_configs:
  - job_name: 'fluent-bit'
    # Fluent Bit serves Prometheus-format metrics at this path, not the default /metrics
    metrics_path: '/api/v1/metrics/prometheus'
    static_configs:
      - targets: ['<your-server-ip>:2020']

This setup enables Prometheus to collect Fluent Bit's operational metrics, making it easier to visualize performance, identify bottlenecks, and set up alerts for potential issues.
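
To verify the endpoint is reachable before wiring up Prometheus, you can query it directly; /api/v1/metrics returns the same counters in JSON form:

curl -s http://<your-server-ip>:2020/api/v1/metrics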

Fluent Bit's inherent resilience is highlighted by its approach to failures. If, for instance, an output destination becomes temporarily unavailable, Fluent Bit doesn't just drop the data. Instead, it retains and retries, ensuring data integrity. The frequency and count of these retries can be fine-tuned, allowing you to strike a balance between persistence and resource utilization:

[OUTPUT]
...
    Retry_Limit  5

In this setup, Fluent Bit retries a failed delivery up to five times before discarding the data. Setting Retry_Limit to False removes the cap entirely, and no_retries disables retries altogether.

Although you might be tempted to collect everything everywhere, discernment in data collection aids in efficient processing and storage. Prioritize logs that provide actionable insights. For instance, while debugging, verbose logs are invaluable, but in a stable production environment, these might just clutter your storage.

Data storage isn't infinite. You should regularly review and adjust your log retention settings on the basis of data relevance. If you are using an output like Elasticsearch or AWS S3, make use of their built-in retention policies to automatically clear older logs that no longer serve a purpose.

Misconfigurations can be the root of many issues. A command such as

fluent-bit -c /path/to/fluentbit.conf --dry-run

allows you to validate a configuration without actually running Fluent Bit, helping you identify any syntax errors or misconfigurations upfront.
