Lead Image © Maksym Chornii, 123rf.com

Monitoring in the Google Cloud Platform

Cloud Gazer

Article from ADMIN 69/2022
We introduce monitoring in the Google Cloud Platform by monitoring virtual machines, setting up alerts, observing important metrics in dashboards, and defining service-level objectives.

When you are responsible for infrastructure and applications, monitoring is a must-have for gaining insights into the status of the components involved, not only for on-premises environments but also for the public cloud. The Google Cloud Platform (GCP) discussed in this article, along with AWS and Microsoft Azure, is one of the three major public clouds.

In recent years, Google has seen considerable growth, especially in the areas of machine learning and big data. However, GCP is also very much in the running in the classic Infrastructure-as-a-Service (IaaS) arena. Monitoring is essential to the admin's ability to operate a cloud infrastructure effectively. At Google, the Cloud Operations Suite, which addresses the topics of monitoring, logging, tracing, debugging, profiling, and auditing, was named Stackdriver until two years ago. Google no longer uses this name and simply refers to it as the Operations Suite.

To ensure that monitoring works during operations, you need to collect data from the source systems in the form of signals. In the monitoring world, this data takes the form of metrics, which can come from IaaS components such as virtual machines, from added-value services such as managed databases, from platforms such as Kubernetes, from microservices, and from the applications themselves. Keeping track of incidents is essential, whether in the form of alerts, error reports, or even service-level objectives. The Operations Suite lets you consolidate and view the data in more detail, visualize the results, and use them for troubleshooting.

Access to monitoring data on GCP can be both centralized and decentralized. In the Google Cloud, all resources are assigned to projects. In Operations Suite, each project can collect and analyze data on its own. However, if you want to keep track of all systems across the entire organization, you need to add multiple projects to a monitoring workspace.

Virtual Machine Monitoring

Once logged in to GCP, switch to the monitoring tool by typing Monitoring in the search bar at the top center and selecting the first item in the results. Alternatively, click Operations | Monitoring | Overview in the menubar on the left. You can then select additional projects under Metrics scope at the top left and add them to a workspace.

Another way to view monitoring data is to select the resource directly in the GCP GUI. For virtual machines, simply click on the individual VM and then switch to the Observability tab (Figure 1), which will take you to a dashboard that displays the most important metrics. For VMs, these are CPU utilization, various network traffic values such as the number of packets received and sent, packets blocked by the firewall, and hard disk metrics such as input/output operations per second (IOPS) or data throughput.

Figure 1: For virtual machines, the Observability dashboard clearly displays numerous metrics.

To view more extensive metrics on memory and disk utilization, you first need to install the Ops Agent on the VMs. Fortunately, the GUI shows the installation instructions for this step right away and prompts you to execute the commands shown in Listing 1.

Listing 1

Installing the Ops Agent

:> agents_to_install.csv && \
echo '"projects/playground-gs/zones/europe-west3-c/instances/test-vm1","[{""type"":""ops-agent""}]"' > agents_to_install.csv && \
curl -sSO https://dl.google.com/cloudagents/mass-provision-google-cloud-ops-agents.py && \
python3 mass-provision-google-cloud-ops-agents.py --file agents_to_install.csv

By the way, the Ops Agent is based on the OpenTelemetry standard, a project by the Cloud Native Computing Foundation. Because it does not use a proprietary standard, the Ops Agent will work with all monitoring tools that can handle the standard.
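
The CSV file in Listing 1 names a single VM. If you want to provision several VMs in one pass, a small helper can generate one row per instance in the same format. The following is a minimal sketch, not part of Google's instructions: the function name agent_csv_line and the second VM test-vm2 are hypothetical, and the project and zone are the placeholders from Listing 1.

```shell
#!/usr/bin/env bash
# Sketch: build agents_to_install.csv for several VMs.
# PROJECT and ZONE are the placeholder values from Listing 1.

# Print one CSV row in the format used in Listing 1.
agent_csv_line() {
  local project="$1" zone="$2" instance="$3"
  printf '"projects/%s/zones/%s/instances/%s","[{""type"":""ops-agent""}]"\n' \
    "$project" "$zone" "$instance"
}

PROJECT="playground-gs"
ZONE="europe-west3-c"

# Truncate the file, then append one row per VM.
: > agents_to_install.csv
for vm in test-vm1 test-vm2; do   # test-vm2 is a hypothetical second VM
  agent_csv_line "$PROJECT" "$ZONE" "$vm" >> agents_to_install.csv
done

cat agents_to_install.csv
```

You can then pass the generated file to the provisioning script with --file agents_to_install.csv, exactly as in Listing 1.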

Uptime Checks

Besides querying basic metrics, defining uptime checks is one of the basic tasks in monitoring. A classic method of implementation is to call an HTTP endpoint and check the HTTP response status. In the case of a virtual machine, you can easily try this implementation with an Apache web server. For example, if you are using Debian, first install the Apache server:

sudo apt-get update
sudo apt-get install apache2 php7.0
sudo service apache2 restart
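
Before configuring the uptime check, it can help to confirm by hand that the web server answers. The sketch below mimics the idea of an HTTP status check with curl. It is not how the Operations Suite implements its checks: the function names, the 10-second timeout, and the rule that any 2xx status counts as healthy are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch of a manual HTTP status check (illustrative only).

# Succeed only for 2xx status codes (assumed definition of "up").
is_up() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *)           return 1 ;;
  esac
}

# Fetch only the status code; curl prints 000 if no HTTP response arrives.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Demonstrate the classification on a few sample codes, without network access.
for code in 200 404 000; do
  if is_up "$code"; then
    echo "$code: up"
  else
    echo "$code: down"
  fi
done

# On the VM itself you could run:
#   status=$(http_status http://localhost/); is_up "$status" && echo "up ($status)"
```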

The uptime check is then configured in the Operations Suite by selecting Operations | Monitoring | Uptime Checks from the menu on the left and then clicking on + CREATE UPTIME CHECK to start the configuration. The wizard then appears on the left side of the screen. The first step is simple: Enter a name for the uptime check and then click Next . Now you need to specify the target to be checked. Besides URLs, you can also select Kubernetes load balancers, App Engine instances, VM instances, and AWS Elastic Load Balancers. In this case, the configuration is:

- Protocol: HTTP
- Instance: <name of VM>
- Check Frequency: 1 minute

At this point, do not forget that you need to set up appropriate firewall rules for the VM to make sure port 80 is also accessible. Leave the other configuration values as they are and go to the next step. In the Response Validation menu item, keep the value of 10 seconds for Response Timeout and check the Log check failures box to enable logging in the event of an error.
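
The firewall rule mentioned above can also be created from the command line. The following sketch assumes the gcloud CLI is installed and authenticated; the rule name and the network tag are hypothetical, and the VM must carry the same tag for the rule to apply. Because no source range is given, the rule is expected to admit traffic from any address, so restrict it if that is too broad for your environment. By default the script only prints the command; set DRY_RUN=0 to execute it.

```shell
#!/usr/bin/env bash
# Sketch: open TCP port 80 for the VM that the uptime check targets.
# RULE_NAME and TARGET_TAG are hypothetical names chosen for illustration.

RULE_NAME="allow-http-uptime"
TARGET_TAG="http-server"

# Print the command unless DRY_RUN=0 is set, in which case it is executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run gcloud compute firewall-rules create "$RULE_NAME" \
  --direction=INGRESS \
  --allow=tcp:80 \
  --target-tags="$TARGET_TAG"
```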

Click Next one more time to get to the last step in the wizard: configuring alerts and notifications. Leave the Create an alert checkbox enabled and change the name of the alert, if necessary. You can leave the Notification Channel empty for now or select a channel to which alerts will be sent. Google supports mobile devices such as smartphones, PagerDuty Services, PagerDuty Sync, Slack, webhooks, email, SMS, and the Pub/Sub messaging service. After completing the wizard, it takes a few minutes for the first data to show up in the Operations Suite.

Setting Alerts

Operations Suite also lets you define alerts linked to conditions and can send notifications if required. In GCP, you set up these alerts by selecting the Operations | Monitoring | Alerting menu item in the navigation bar. You can start creating an alert by pressing +CREATE POLICY at the top.

To begin, define a condition and press ADD CONDITION . A window opens in which you need to select a name for the condition and a corresponding metric. For this scenario, I specified VM Instance as the resource type and network traffic as the metric. Typing Network brings up a list of possible metrics, from which I selected agent.googleapis.com/interface/traffic . Next, I set the following configuration values,

- Condition: is above
- Threshold: 500
- For: 1 minute

then clicked Add and Next . You can now configure the notifications. In the Notification Channels section, select Manage Notification Channels and enter your email address in the window that opens. After doing so, you can select the defined email address in the original window and click OK .

Clicking Next takes you to the last step, where you enter an alert name (e.g., Inbound Traffic Alert ). Optionally, you can also enter text that you want to add to the email. Clicking Save completes the process. After a few minutes, an alert should appear, which you can display on the dashboard.
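
The same alert can be described as a policy file, which is easier to review and reuse than settings clicked together in the wizard. The sketch below writes the values from this section (metric, is above 500, for one minute, and the name Inbound Traffic Alert) into JSON. The field names follow the Cloud Monitoring AlertPolicy resource, but treat the details as assumptions: the rate aligner, the condition name, and the output file name are not taken from the wizard, and no notification channel is attached.

```shell
#!/usr/bin/env bash
# Sketch: the traffic alert from this section as a policy file.
# Aligner, condition name, and file name are assumptions for illustration.

policy_json() {
  cat <<'EOF'
{
  "displayName": "Inbound Traffic Alert",
  "combiner": "OR",
  "conditions": [
    {
      "displayName": "VM network traffic above threshold",
      "conditionThreshold": {
        "filter": "resource.type = \"gce_instance\" AND metric.type = \"agent.googleapis.com/interface/traffic\"",
        "comparison": "COMPARISON_GT",
        "thresholdValue": 500,
        "duration": "60s",
        "aggregations": [
          { "alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_RATE" }
        ]
      }
    }
  ]
}
EOF
}

policy_json > traffic-alert-policy.json
echo "Wrote traffic-alert-policy.json"

# To create the policy (assumes the gcloud alpha component is available):
#   gcloud alpha monitoring policies create --policy-from-file=traffic-alert-policy.json
```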
