OpenStack observability with Sovereign Cloud Stack

Guard Duty

Components and Tools

SCS relies on the Open Source Infrastructure and Service Manager (OSISM) [2] for deployment and day 2 operations of OpenStack. OSISM itself relies heavily on Kolla Ansible [3], a collection of Ansible playbooks for deploying OpenStack. One of the main focuses of OSISM is to simplify the operation of OpenStack-based systems and, in particular, upgrades from one OpenStack version to the next. The goal is to be able to install updates on a system at any time.

Kolla Ansible comes with a Prometheus-based [4] monitoring stack out of the box, which fit well with the Monitoring SIG's preference for an OpenMetrics-based approach from the outset. Initially, the SIG considered traditional monitoring software, such as Zabbix or Icinga, for service state monitoring. However, the discussions showed relatively quickly that Prometheus' Blackbox exporter could cover these scenarios just as easily. To reduce complexity, it makes sense to rely on the Blackbox exporter instead of a completely independent software solution. These changes were incorporated into OSISM; Zabbix, which had previously been included, was dropped.
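To make the service state check more concrete, the following minimal Python sketch asks a Blackbox exporter to probe a target and evaluates the probe_success metric it returns. The exporter address, the target URL, and the http_2xx module name are assumptions for illustration; the module must exist in the exporter's own configuration. In a real setup, Prometheus scrapes the /probe endpoint itself, so the script only shows what happens on the wire.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def probe_url(exporter: str, target: str, module: str = "http_2xx") -> str:
    """Build the URL of the Blackbox exporter's /probe endpoint."""
    query = urlencode({"target": target, "module": module})
    return f"{exporter.rstrip('/')}/probe?{query}"


def parse_samples(text: str) -> dict[str, float]:
    """Read unlabelled samples from the Prometheus text exposition format."""
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, HELP/TYPE comments, and labelled series.
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        samples[parts[0]] = float(parts[1])
    return samples


def service_is_up(text: str) -> bool:
    """A probe counts as successful when probe_success is 1."""
    return parse_samples(text).get("probe_success") == 1.0


def check(exporter: str, target: str, module: str = "http_2xx",
          timeout: float = 10.0) -> bool:
    """Trigger one probe through the exporter and report the result."""
    with urlopen(probe_url(exporter, target, module), timeout=timeout) as resp:
        return service_is_up(resp.read().decode("utf-8"))
```

A call such as check("http://localhost:9115", "https://horizon.example.com") would then return True or False, assuming an exporter is listening on that address; both addresses are placeholders.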

As a first step, additional dashboards were provided for Grafana (Figure 4) and integrated into the Kolla Ansible project. Additionally, various exporters for Prometheus that are currently not part of Kolla Ansible are to be included.

Figure 4: The Grafana dashboard of the OpenStack Health Monitor.

Alerting

Alerting is an important component in any monitoring setup. It quickly became clear in the Monitoring SIG that every CSP, unless it is only just starting to commission an environment, already has an alerting system in operation. Ideally, the monitoring supplied with SCS would connect to that existing system.

Therefore, the SIG decided to use the Prometheus Alertmanager, which is already integrated in Kolla Ansible, and to document best practices [5] for connecting to external alerting systems. The open source Alerta [6] software is an alternative for aggregating alerts at this point. Initially, the SIG considered integrating it directly; however, for the time being, Alertmanager was deemed sufficient.
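One common way to connect Alertmanager to an existing alerting system is its webhook receiver, which sends grouped alerts as JSON in an HTTP POST. The following Python sketch shows what a small bridge on the receiving end might look like. The flat event format, the port, and the print call that stands in for the external system are assumptions for illustration; they are not taken from the SCS best practices.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize(payload: dict) -> list[dict]:
    """Flatten an Alertmanager webhook payload into one event per alert."""
    events = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        events.append({
            "name": labels.get("alertname", "unknown"),
            "severity": labels.get("severity", "none"),
            "state": alert.get("status", "unknown"),  # "firing" or "resolved"
            "instance": labels.get("instance", ""),
            "summary": annotations.get("summary", ""),
            "since": alert.get("startsAt", ""),
        })
    return events


class WebhookHandler(BaseHTTPRequestHandler):
    """Accept POST requests from Alertmanager and hand the events on."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_response(400)
            self.end_headers()
            return
        for event in summarize(payload):
            # Placeholder: forward to the external alerting system here.
            print(json.dumps(event))
        self.send_response(200)
        self.end_headers()


def serve(port: int = 9099) -> None:
    """Run the bridge until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling serve() starts the bridge; an Alertmanager receiver with a webhook configuration pointing at that address would then deliver its notifications there.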

Alert rules are an important part of Prometheus monitoring. To create a good starting point, several rule sets have been adopted from the Awesome alert rules [7] project, and they are now also making their way into Kolla Ansible.

There's Monitoring and Then There's Monitoring

When the talk turns to monitoring, people tend to talk first about simple process monitoring. Does the Foo process exist, and does the Bar service respond on port 42? Often, instead of simply checking whether a service responds on a port, it is a good idea to use test scenarios that carry out a functional check of the service.

For example, in an environment like OpenStack, it's helpful to know whether the Horizon web front end is being delivered correctly to the browser, or to use an API to check whether VMs can actually be started. However, checking a network component such as Open Virtual Network (OVN) for correct functionality can become complex.
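A functional check of this kind can be thought of as a short scenario whose steps are timed and reported as metrics. The following Python sketch runs a list of named steps, stops at the first failure, and renders the results in the Prometheus text exposition format. The step names, the functional_check metric prefix, and the idea of passing steps as plain callables are assumptions for illustration; the actual calls against Horizon or the OpenStack APIs are deliberately left out.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class StepResult:
    name: str
    ok: bool
    seconds: float


def run_scenario(steps: Iterable[tuple[str, Callable[[], None]]],
                 clock: Callable[[], float] = time.monotonic) -> list[StepResult]:
    """Run the steps in order; a step fails if it raises an exception."""
    results = []
    for name, action in steps:
        start = clock()
        try:
            action()
            ok = True
        except Exception:
            ok = False
        results.append(StepResult(name, ok, clock() - start))
        if not ok:
            break  # later steps usually depend on earlier ones
    return results


def to_exposition(results: list[StepResult],
                  prefix: str = "functional_check") -> str:
    """Render the results as gauges in the Prometheus text format."""
    lines = [f"# TYPE {prefix}_success gauge"]
    for r in results:
        lines.append(f'{prefix}_success{{step="{r.name}"}} {int(r.ok)}')
    lines.append(f"# TYPE {prefix}_duration_seconds gauge")
    for r in results:
        lines.append(f'{prefix}_duration_seconds{{step="{r.name}"}} {r.seconds:.3f}')
    return "\n".join(lines) + "\n"
```

In practice, each step would wrap one real operation, such as fetching the Horizon login page or creating and deleting a test VM, and the output could be written to a file for a textfile collector or pushed to a gateway.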

To monitor OVN efficiently, the SIG is currently working on integrating the OVN exporter [8] upstream to provide various OVN metrics for Prometheus. Figure 5 illustrates where the exporter needs to reside to capture data from the redundant components and detect failures.

Figure 5: Open Virtual Network setup with Prometheus exporters.
