System logging for data-based answers
Log Everything
Oh, I'm a lumberjack, and I'm okay,
I sleep all night and I work all day.
– from "Lumberjack" by Monty Python
Can't you just imagine yourself in the wilds of British Columbia swinging your ax, breathing fresh air, sleeping under the stars?!!! I can't either, but Monty Python's "Lumberjack" song has a strong message for admins, particularly HPC admins – Log Everything.
Why log everything? Doesn't that require a great deal of work and storage? The simple answer to both questions is yes. In fact, you might need to start thinking about a small logging cluster running alongside the HPC computational cluster. Such a setup gives you the data to answer questions.
Answering questions is the cornerstone of running HPC systems. These questions include those from users, such as, "Why is my application not running?" or "Why is my application running slowly?" or "Why did I run out of space?" Logging also answers system administrator questions such as, "What commands did the user run?" or "Which nodes were allocated to the user during their run?" or "Is the user storing a bunch of Taylor Swift videos?"
If you haven't read about the principle of Managing Up [1], you should. One of the keys to this dynamic is anticipating questions your manager might ask, whether something seemingly as simple as "How's the cluster running?" or something with a little more meat to it, such as "Why isn't Ms. Johnson's application running?" or perhaps the targeted question, "How could you screw up so badly?" Implicit in these questions are questions from your manager's manager, and on up the chain. Managing up means anticipating the questions or situations that might be encountered up the management chain (answering the Bobs' question about what you actually do). More than likely, management is not being abusive; rather, several people have taken responsibility for spending a great deal of money on your fancy cluster, and they want to know how it's being utilized and whether it's worth the investment.
The way to answer these questions is to have data. Data-based answers are always better than guesses or suppositions. What's the best way to have data? Be a lumberjack and log everything.
Logging
Regardless of what you monitor, you need to be a lumberjack and log it. HPC systems can be running a few nodes or tens of thousands of nodes. The metrics for each node need to be monitored and logged.
The first step in logging is deciding how to write the logs. For example, you could write the logs as a non-root user to a file located somewhere on a common cluster filesystem. A simple way to accomplish this is to create a special user, perhaps lumberjack, and have this user write logs to their /home directory, which is mounted across the cluster.
The logs written by this user should have node-specific file names, which allows you to determine the source of the messages. You should also put a time stamp on each log entry so that you can reconstruct a time history of events.
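A minimal sketch of such a per-node log writer follows. The directory and the logged metric are assumptions for illustration; in production, the directory would live under the shared /home/lumberjack:

```shell
#!/bin/bash
# Sketch: append one time-stamped entry to a node-specific logfile.
# LOGDIR is a demo default here; a real deployment would point it at
# a directory on the shared cluster filesystem (e.g., the lumberjack
# user's home directory).
LOGDIR="${LOGDIR:-/tmp/lumberjack-logs}"
mkdir -p "$LOGDIR"

# One file per node (hostname in the name), one time stamp per entry.
echo "$(date '+%Y-%m-%d %H:%M:%S') load: $(cat /proc/loadavg)" \
    >> "$LOGDIR/$(hostname -s).log"
```

Run from cron on each node, a script like this builds up a per-node time history that any node can read from the shared filesystem.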
A good alternative to writing logs to a user directory is the really cool Linux tool logger [2], which allows a user to write a message to the system logs. For example, you could easily run the command
$ logger "Just a notification"
to write a message to the standard system log /var/log/syslog on each node. The syslog daemon adds a time stamp to each entry by default. You can direct messages to a different log as well, but note that logger's -f <file> option, despite appearances, reads the message from <file> rather than writing to it; to route entries to a log of your choosing, assign them a facility and priority with the -p option and add a matching rule to the syslog configuration.
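If you want logger messages in a dedicated file, one approach (assuming rsyslog is the local syslog daemon) is to log to a spare facility such as local0 and route that facility in a drop-in configuration file. The file name, facility, and tag below are illustrative assumptions:

```
# /etc/rsyslog.d/10-cluster.conf (hypothetical file name):
# write everything logged to the local0 facility to its own file
local0.*    /var/log/cluster.log
```

A monitoring script could then write to that file with a command like logger -p local0.info -t lumberjack "health check passed".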
If you haven't noticed yet, logger writes the messages to the local logs, so each node has its own log. However, you really want to collect the logs in a single location to parse them together; therefore, you need a way to gather the logs from all of the nodes to a central location.
A key to good logging habits is to copy or write logs from remote servers (compute nodes) to a central logging server, so you have everything in one place, making it easier to make sense of what is happening on the server(s). You have several ways to accomplish this, ranging from easy to a bit more difficult.
Remote Logging the Easy Way
The simple way to perform remote logging comes with some risk: Configure a cron job on every node that periodically copies the node's system logs to the centralized log server. The risk is that logs are copied only at the interval specified in the cron job, so if something happens to the node between copies, the most recent messages for that node never reach the log server.
A simple script for copying the logs would likely use scp to copy the logs securely from the local node to the log server. You can copy whatever logs or files you like. A key consideration is what you name the files on the log server. Most likely, you will want to put the node name in the names of the logfiles. For example, the name might be node001-syslog, which allows you to store the logs without worrying about overwriting files from other nodes.
Another key consideration is to include the time stamp when the log copy occurs, which, again, lets you keep files separate without fear of overwriting and makes the creation of a time history much easier.
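A minimal sketch of such a copy script follows; the log host name and destination directory are assumptions for illustration:

```shell
#!/bin/bash
# Sketch: copy the local syslog to a central log server, embedding
# the node name and a time stamp in the destination file name so
# copies from different nodes (and different times) never collide.
copy_logs() {
    local loghost=$1    # central logging server (hypothetical name)
    local dest=$2       # destination directory on $loghost
    local stamp node
    stamp=$(date '+%Y%m%d-%H%M%S')
    node=$(hostname -s)
    scp /var/log/syslog "${loghost}:${dest}/${node}-syslog-${stamp}"
}

# Example call (hypothetical host and path):
#   copy_logs loghost /var/log/cluster
```

A matching cron entry, such as */15 * * * * /usr/local/sbin/copy_logs.sh, would then run the copy every 15 minutes.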
Remote Logging with rsyslog
Another popular option is rsyslog (remote syslog) [3], an open source tool for forwarding log messages to a central server over an IP network. It is very configurable, using the /etc/rsyslog.conf file and the files in the /etc/rsyslog.d/ directory to define the various configuration options. Because the tool is so configurable and flexible, be sure to read the man pages very carefully.
You can get started fairly easily with rsyslog by using the defaults. On the remote host that collects the logs, you begin by editing the /etc/rsyslog.conf file, uncommenting the following lines:
$ModLoad imtcp
$InputTCPServerRun 514
These lines tell rsyslog to load its TCP input module and listen on TCP port 514, the conventional port for this purpose. After the change, you should restart the rsyslog service (e.g., with systemctl restart rsyslog).
On every node that is to send its logs to the logging node, you need to make some changes. First, in the file /etc/rsyslog.d/loghost.conf, make sure you have a line such as
*.* @@<loghost>:514
where <loghost> is the name of the logging host (use either the IP address or a resolvable hostname), *.* refers to all logging facilities and priorities, the @@ portion tells rsyslog to use TCP for log transfer (a single @ would tell it to use UDP), and 514 is the TCP port. After making this change, restart the rsyslog service on every node that sends its logs to the logging server.
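Pushing that restart out to all of the compute nodes is easy to script. The following sketch assumes passwordless SSH from the head node, systemd on the nodes, and a hypothetical file listing one node name per line:

```shell
#!/bin/bash
# Restart rsyslog on every node named in a node list file.
# Assumes passwordless SSH and systemd on the nodes.
restart_rsyslog_all() {
    local nodefile=$1 node
    while read -r node; do
        [ -z "$node" ] && continue    # skip blank lines
        ssh "$node" 'systemctl restart rsyslog' \
            || echo "restart failed on $node" >&2
    done < "$nodefile"
}

# Example call (hypothetical node list file):
#   restart_rsyslog_all /etc/cluster/nodes
```

On larger clusters, a parallel shell such as pdsh accomplishes the same fan-out in a single command.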
In the logfiles on the logging server, the hostname of the node will appear, so you can differentiate logs on the basis of hostnames.
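If you would rather keep each node's messages in a separate file on the server, rsyslog can also split the incoming stream by hostname. A minimal example in rsyslog's legacy template syntax (the template name and path are assumptions for illustration):

```
$template PerHost,"/var/log/remote/%HOSTNAME%.log"
*.* ?PerHost
```

These lines would go in /etc/rsyslog.conf (or a file under /etc/rsyslog.d/) on the logging server, before any catch-all rules.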
You can use either of these approaches, or one that you create, to store all of the system logs in a central location (a logging server). Linux comes with standard logs that can be very useful [4]; alternatively, you might want to think about creating your own logs. In either case, you can log whatever information you feel is needed. The next few sections present some options you might want to consider.