Detect anomalies in metrics data
Jerk Detector
PAD in Practical Use
As mentioned earlier, PAD is designed to run in OpenShift, but the container runs quite well without Red Hat's orchestrator. The following example assumes that you use Podman to manage your containers. To begin, pull the prebuilt PAD container image; the code is also available on GitHub [2]:
podman pull quay.io/aicoe/prometheus-anomaly-detector:latest
Next, you need to build the command line to start the container. PAD gets the metrics for which it is supposed to detect anomalies with Fourier and Prophet from environment variables. These, in turn, can be easily passed in to the container. The command
podman run --name pad -p 8080:8080 --network host --env FLT_PROM_URL=http://pad.local.lan:9090 --env FLT_RETRAINING_INTERVAL_MINUTES=15 --env FLT_METRICS_LIST='up' --env APP_FILE=app.py --env FLT_DATA_START_TIME=3d --env FLT_ROLLING_TRAINING_WINDOW_SIZE=15d quay.io/aicoe/prometheus-anomaly-detector:latest
starts the container with up as the metric in FLT_METRICS_LIST (i.e., you want to know whether or not the systems are running). Instead of up, enter the names of the metrics for which you want to detect anomalies. If you enter the Prometheus Node Exporter's node_filesystem_avail_bytes metric here, for example, you are telling PAD to monitor changes in the free space on the filesystems of individual hosts, because a sudden drop in free space can be an indicator of undesirable processes that justify a closer look.
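If you switch metrics often, it can help to assemble the call from a table of settings instead of editing one long line. The following Python sketch builds the same Podman argument list from the environment variables shown above; the function name build_pad_command and its parameters are made up for this illustration and are not part of PAD.

```python
import shlex

IMAGE = "quay.io/aicoe/prometheus-anomaly-detector:latest"


def build_pad_command(prom_url, metric, retrain_minutes=15,
                      data_start="3d", window="15d"):
    """Return the argument list that starts PAD with Podman.

    Hypothetical helper: it only mirrors the variables used in the
    command line from the article.
    """
    env = {
        "FLT_PROM_URL": prom_url,
        "FLT_RETRAINING_INTERVAL_MINUTES": str(retrain_minutes),
        "FLT_METRICS_LIST": metric,
        "APP_FILE": "app.py",
        "FLT_DATA_START_TIME": data_start,
        "FLT_ROLLING_TRAINING_WINDOW_SIZE": window,
    }
    cmd = ["podman", "run", "--name", "pad",
           "-p", "8080:8080", "--network", "host"]
    # One --env flag per variable, in the order defined above
    for key, value in env.items():
        cmd += ["--env", f"{key}={value}"]
    cmd.append(IMAGE)
    return cmd


if __name__ == "__main__":
    cmd = build_pad_command("http://pad.local.lan:9090",
                            "node_filesystem_avail_bytes")
    # Print a shell-quoted version for copy and paste
    print(shlex.join(cmd))
```

The sketch only prints the command. To start the container from Python, you could pass the returned list to subprocess.run().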
PAD on its own is usually of limited help, because Prometheus offers only basic graphing of the data. Grafana is the tool you want to use, and the PAD developers make it easy to do just that, because PAD exports the calculated metrics data in Prometheus format. An existing Prometheus instance can read the data from PAD just as from any other exporter. For this step, the developers rely on Flask.
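To give an idea of what Prometheus reads when it scrapes an exporter, the following minimal sketch renders a few values in the Prometheus text exposition format using only the Python standard library. The metric name up_prophet, the value_type label, and the numbers are hypothetical placeholders, not output copied from PAD, and the sketch does not reproduce PAD's actual implementation.

```python
def render_gauge(name, help_text, samples):
    """Render one gauge family in the Prometheus text exposition format.

    samples is a list of (labels, value) pairs, where labels is a dict.
    Simplification: label values are not escaped, so they must not
    contain quotes, backslashes, or newlines.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{val}"' for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {float(value)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical forecast values for the up metric
    samples = [
        ({"value_type": "yhat"}, 1.0),
        ({"value_type": "yhat_lower"}, 0.9),
        ({"value_type": "yhat_upper"}, 1.1),
    ]
    print(render_gauge("up_prophet", "Hypothetical forecast for up", samples))
```

Each sample becomes one line of the form name{label="value"} number, which is all a scraping Prometheus server needs to store the forecast as a regular time series.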
The rest is plain sailing. Once the metrics data and the predictions are available in Prometheus, dashboards can be created in Grafana in the usual way. What's more, the predictions from Fourier and Prophet can be integrated into the same dashboards and superimposed on the measured values, if required (Figure 3). If you want to set up alerts based on the predictions, you can do so with Alertmanager. A Red Hat talk from 2019 [3] provides some details of the configuration.
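Whether you compare prediction and measurement by eye in a dashboard or in an alert rule, the underlying check is simple: a measured value is suspicious if it falls outside the range the model expected. The sketch below shows one plausible form of that rule. The field names yhat, yhat_lower, and yhat_upper follow the column names of Prophet's forecast output; the Forecast class, the is_anomaly function, and the sample numbers are assumptions for illustration and not necessarily the exact rule PAD applies.

```python
from dataclasses import dataclass


@dataclass
class Forecast:
    yhat: float        # predicted value
    yhat_lower: float  # lower bound of the expected range
    yhat_upper: float  # upper bound of the expected range


def is_anomaly(measured, forecast):
    """Return True if the measured value lies outside the expected range."""
    return not (forecast.yhat_lower <= measured <= forecast.yhat_upper)


if __name__ == "__main__":
    # Hypothetical forecast of free filesystem space in bytes
    expected = Forecast(yhat=50e9, yhat_lower=45e9, yhat_upper=55e9)
    for measured in (52e9, 30e9):
        print(measured, is_anomaly(measured, expected))
```

In this example, 52e9 bytes of free space lies inside the expected range, whereas a drop to 30e9 bytes falls below the lower bound and would be flagged.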
Conclusions
Administrators often turn up their noses when vendors present their AI solutions for attack detection; in fact, it's not uncommon to hear them referred to as hocus pocus. However, with PAD, Red Hat has come up with a very concrete and immediately usable approach to generating added value with AI in everyday operations. The more metric data Fourier and Prophet have available, the better they can train their models and the more reliable the predictions become. Therefore, you do need to allow for a start-up period with false positives at the beginning. However, the extra work will pay off when you track down an attacker because you noticed even earlier than usual that something was wrong.
Infos
[1] Cooley, J. W., and J. W. Tukey. An Algorithm for the Machine Calculation of Complex Fourier Series. Mathematics of Computation, 1965;19:297-301: https://www.ams.org/journals/mcom/1965-19-090/S0025-5718-1965-0178586-1/S0025-5718-1965-0178586-1.pdf
[2] Prometheus Anomaly Detector: https://github.com/AICoE/prometheus-anomaly-detector
[3] AIOps: Anomaly detection with Prometheus, by Marcel Hild, Linux Foundation: https://events19.linuxfoundation.org/wp-content/uploads/2017/12/AIOps-Anomaly-Detection-with-Prometheus-Marcel-Hild-Red-Hat.pdf