Lead Image © Orlando Rosu, 123RF.com

Visualizing data captured by nmon

In Good Time

Article from ADMIN 34/2016
By
When speed, ease of use, and time to answer are paramount in performance monitoring and assessment of nmon logfiles, onTune nmon Analyzer Plus for Windows can help.

An excellent article by Jeff Layton [1] on nmon monitoring showed nmon to be a very useful tool for performance assessment and evaluation. My own use of nmon centers on Layton's statement that "Nmon can also capture a great deal of information from the system and produce CSV files for postprocessing. However, the results are typically not easy to postprocess"; hence, you need a tool to visualize the data.

This is particularly true for big data firms that deal with thousands of Linux servers and very large amounts of captured information. The visualized information needs to get into reports as quickly as possible, because a short time to answer is what lets you be proactive. Decisions that affect the bottom line can rest on this vast amount of nmon data, so you need a comprehensive picture of all server systems quickly, and you have to be able to drill down through the data to analyze an individual server's performance behavior on an ad hoc basis. To accomplish that objective, I've been using a tool at my company called onTune nmon Analyzer Plus (ONA Plus) from TeemStone [2].
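To illustrate why raw nmon output is awkward to postprocess by hand, the following minimal Python sketch pulls total CPU utilization out of a logfile. It assumes the usual nmon layout, in which `ZZZZ` lines carry the timestamp for a snapshot tag (e.g., `T0001`) and `CPU_ALL` rows are keyed to that tag. The sample text and the `parse_cpu_all` helper are hypothetical and have nothing to do with how ONA Plus works internally.

```python
import csv
from io import StringIO

# Tiny hand-written sample in the usual nmon layout (not real captured data).
SAMPLE = """\
AAA,progname,nmon
AAA,host,web01
CPU_ALL,CPU Total web01,User%,Sys%,Wait%,Idle%,Busy,CPUs
ZZZZ,T0001,00:00:05,01-JAN-2016
CPU_ALL,T0001,10.0,5.0,1.0,84.0,,4
ZZZZ,T0002,00:05:05,01-JAN-2016
CPU_ALL,T0002,40.0,10.0,2.0,48.0,,4
"""


def parse_cpu_all(text):
    """Return (host, [(time, date, busy_pct), ...]) from nmon log text."""
    host = None
    stamps = {}   # snapshot tag such as "T0001" -> (time, date)
    samples = []
    for row in csv.reader(StringIO(text)):
        if len(row) < 3:
            continue
        if row[0] == "AAA" and row[1] == "host":
            host = row[2]
        elif row[0] == "ZZZZ" and len(row) >= 4:
            stamps[row[1]] = (row[2], row[3])
        elif row[0] == "CPU_ALL" and row[1] in stamps:
            # The header row is skipped because its second field is not a tag.
            user, system = float(row[2]), float(row[3])
            time, date = stamps[row[1]]
            samples.append((time, date, user + system))
    return host, samples


host, samples = parse_cpu_all(SAMPLE)
```

Even this small case needs a join between two record types before a single chart can be drawn, and a real log adds many more sections per snapshot.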

ONA Plus: It's Fast

To use the tool, you first copy the nmon logs to the tool's directory. After the copy process, the application starts to load the files automatically. It takes about 20 minutes to process 10GB of nmon data (on a contemporary Windows laptop). In most instances, when dealing with a lot of nmon logs from thousands of servers, you can get a cup of java and come back to find the processing done.
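As a rough planning aid, the figure quoted above (about 20 minutes for 10GB) can be turned into an estimate for other data volumes. The sketch below assumes that load time scales linearly with log size, which is an assumption and not something measured here; both function names are hypothetical.

```python
def estimated_load_minutes(log_gb, ref_gb=10.0, ref_minutes=20.0):
    """Scale the reference figure linearly to another data volume."""
    return log_gb * ref_minutes / ref_gb


def throughput_mb_per_s(ref_gb=10.0, ref_minutes=20.0):
    """Average processing rate implied by the reference figure."""
    return ref_gb * 1024 / (ref_minutes * 60)


minutes_for_25gb = estimated_load_minutes(25)   # 50.0 under the linear assumption
rate = throughput_mb_per_s()                    # roughly 8.5 MB/s
```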

The tool pops up the window in Figure 1 right after the last logfile disappears from the tool's directory. To begin, you choose the start and end dates for the period of interest and select OK. The application processes the logs, executes a viewer program, and displays graphs and views.

Figure 1: Setting up the log processing parameters.

It's Easy, Visual, Interactive

ONA Plus starts with the display shown in Figure 2, a Summary View of all server systems. The servers are grouped according to the average and maximum values of CPU, memory, and paging space utilization, and each group shows its server count. You also can group the servers to reflect physical or logical separation of data centers and regions.

Figure 2: Starting view of server nmon logfile data.
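The idea behind this kind of summary can be sketched in a few lines of Python: compute the average and maximum per server, then sort the servers into utilization buckets and count them. The bucket edges and the `summarize` helper are hypothetical; the thresholds ONA Plus actually uses are not described here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical bucket edges in percent (lower bound inclusive, upper exclusive).
BUCKETS = [(0, 30, "low"), (30, 70, "medium"), (70, 101, "high")]


def bucket_for(value):
    """Return the label of the bucket that contains value."""
    for low, high, label in BUCKETS:
        if low <= value < high:
            return label
    raise ValueError(f"utilization out of range: {value}")


def summarize(cpu_by_server):
    """Group servers by average CPU and report count, average, and maximum."""
    groups = defaultdict(list)
    for server, samples in cpu_by_server.items():
        avg = round(mean(samples), 1)
        groups[bucket_for(avg)].append((server, avg, max(samples)))
    return {
        label: {"count": len(rows), "servers": sorted(rows)}
        for label, rows in groups.items()
    }


summary = summarize({
    "web01": [10.0, 20.0, 30.0],
    "app01": [40.0, 50.0, 60.0],
    "db01": [80.0, 90.0, 100.0],
})
```

The same grouping could be keyed on memory or paging space utilization, or on a data center label, by swapping the input dictionary.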

The server list on the left, shown expanded in Figure 3, lists the main performance criteria for each server, as well as system information (e.g., CPU count, clock speed, memory, IP address). These features make it easier to get a quick overview of your systems while you focus on the real performance analysis task at hand. I consider the list view the most helpful and informative, because all the main performance information for each server is viewable at a glance.

Figure 3: Main performance data for each server.

Band ratio data, shown in the top left pane of Figure 2 and in the list window below, is a supplemental indicator used in conjunction with the average and maximum values to determine the load on a server. In my studies, I had no real need for the band ratio, so I usually just turned it off.

Drilling Down

To drill down into an individual server and analyze its performance behavior, you can use the Direct View and the Chartlist Detailed view options. The Direct View option (Figure 4) provides detailed trending charts of each server's basic performance parameters for a selected time range. To aid the analysis, a base chart at the bottom of the application screen depicts the entire time period and lets you choose a smaller time window, so it is easy to zoom in further and select a smaller sample period for analysis (Figure 5). Furthermore, you can choose the Predicting Trendline option to generate a simple forecasting graph.

Figure 4: Direct View trending charts.
Figure 5: Zooming in to a smaller time period.
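A simple forecasting line of this kind can be approximated with an ordinary least-squares fit over evenly spaced samples. The sketch below is a generic linear trend and is not claimed to be the method ONA Plus uses; the function names are hypothetical.

```python
def linear_trend(values):
    """Least-squares slope and intercept for evenly spaced samples."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def forecast(values, steps_ahead):
    """Extend the fitted line steps_ahead samples past the last one."""
    slope, intercept = linear_trend(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)


cpu = [10.0, 12.0, 14.0, 16.0]     # made-up CPU% samples
slope, intercept = linear_trend(cpu)
two_ahead = forecast(cpu, 2)
```

A straight line ignores daily or weekly cycles, which is one reason a more comprehensive forecasting technique is usually needed.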

In most scenarios, a more comprehensive forecasting technique is required; nevertheless, an interesting feature provided with the tool can be used to assess, for example, CPU fluctuation via a sine wave. The view I have used extensively in projects is the Chartlist Detailed option.

Here, you can display the process ID or command in a split screen directly below the basic performance items (Figure 6). With the synchronized timeline, you can visually ferret out which process is the culprit behind, or has the potential to cause, a performance bottleneck or anomaly. By providing all of these drill-down features in one analysis ecosystem, the tool can be used to conduct some serious server performance and tuning studies.

Figure 6: Stacking charts to analyze performance problems.
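What the synchronized timeline lets you do by eye can be sketched programmatically: for every snapshot in which total CPU exceeds a threshold, look up the busiest process in the same snapshot. The input dictionaries, the threshold, and the `top_process_at_peaks` helper are hypothetical and assume the per-snapshot data has already been parsed from the logs.

```python
def top_process_at_peaks(cpu_by_snapshot, procs_by_snapshot, threshold=80.0):
    """For each snapshot at or above threshold, name the busiest process."""
    culprits = []
    for snap, busy in sorted(cpu_by_snapshot.items()):
        procs = procs_by_snapshot.get(snap)
        if busy >= threshold and procs:
            command, pct = max(procs.items(), key=lambda item: item[1])
            culprits.append((snap, busy, command, pct))
    return culprits


# Made-up data: total CPU% and per-command CPU% for three snapshots.
cpu = {"T0001": 20.0, "T0002": 95.0, "T0003": 88.0}
procs = {
    "T0001": {"sshd": 1.0},
    "T0002": {"java": 70.0, "gzip": 20.0},
    "T0003": {"gzip": 60.0, "java": 25.0},
}
culprits = top_process_at_peaks(cpu, procs)
```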

The Filesystem view (Figure 7) lists all the filesystems of all the servers and their utilization (%) in one place and provides detailed filesystem utilization charts on a per-server basis.

Figure 7: The Filesystem view.
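A flat, fleet-wide filesystem list like this one is easy to picture as a sorted table. The following sketch flattens per-server utilization into one list, fullest first, and flags entries above a warning level; the data, the 80 percent level, and the `filesystem_report` helper are hypothetical.

```python
def filesystem_report(usage_by_server, warn_at=80.0):
    """Flatten {server: {mount: pct}} into one list, fullest first."""
    rows = [
        (pct, server, mount)
        for server, mounts in usage_by_server.items()
        for mount, pct in mounts.items()
    ]
    rows.sort(reverse=True)
    return [(server, mount, pct, pct >= warn_at) for pct, server, mount in rows]


report = filesystem_report({
    "web01": {"/": 45.0, "/var": 91.5},
    "db01": {"/": 30.0, "/data": 82.0},
})
```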
