Monitor your network infrastructure with SNMP

Clear View

SNMP in Action

After completing the preparations, you can query all the SNMP information available for a device using the command:

$ sudo snmpwalk -v1 -c <RO-Community-String> <Host> <OID>

The -v1 parameter enforces the use of SNMPv1, and you can set the community string with -c. On top of this, you need the hostname or the IP address of the computer to query and its OID. If you simply enter a dot for the latter, snmpwalk queries all available OIDs (Figure 3). The more precise the OID, the less you are flooded with information.
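To get a feel for how much a more precise OID trims the output, you can count the lines a walk returns. The following is only a minimal sketch: The helper name count_oids, the community string public, and the host address are placeholders that do not come from the original setup, and .1.3.6.1.2.1.1 is the standardized system subtree.

```shell
#!/bin/bash
# Hypothetical helper (not part of the article's scripts): count how many
# OIDs a walk returns below a given starting point.
count_oids() {
  local community=$1 host=$2 oid=${3:-.}   # "." walks the whole tree
  snmpwalk -v1 -c "$community" "$host" "$oid" | wc -l | tr -d ' '
}

# Example calls (placeholder community string and address):
#   count_oids public 192.168.2.1                  # everything
#   count_oids public 192.168.2.1 .1.3.6.1.2.1.1   # system group only
```

Comparing the two numbers for one of your devices shows quickly whether a narrower starting point is worth the effort.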

Figure 3: Taking an snmpwalk from the root of the MIB tree (OID: .) returns many results.

Information about the meaning of individual OIDs is often found on the device manufacturer's website or on relevant Internet forums. Additionally, there are many standardized OIDs (e.g., for names and uptimes of devices and for the number of packets sent and received). Depending on the product, you can also query information about the number of clients connected with the access point or the number of available and assigned DHCP leases.

Other SNMP commands are available in addition to snmpwalk. Whereas snmpwalk returns the complete OID branch below the requested OID, snmpget returns only the specified OID. By default, the commands return OIDs in numeric format on Debian/Raspbian. If you prefer to see the descriptive names instead, so that you can assess the significance of an entry more easily, comment out the mibs : line in the /etc/snmp/snmp.conf file.
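As a small illustration of a single-value query, snmpget can fetch just the uptime or the name of a device. The sketch below assumes a helper named get_value (hypothetical, as are the community string and address in the comments) and uses the standardized OIDs for sysUpTime.0 and sysName.0.

```shell
#!/bin/bash
# Hypothetical helper: fetch exactly one OID and print only its value.
get_value() {
  local community=$1 host=$2 oid=$3
  # -Oq: quick print without type information, -Ov: print the value only
  snmpget -v1 -Oqv -c "$community" "$host" "$oid"
}

# Example calls (placeholder community string and address):
#   get_value public 192.168.2.1 .1.3.6.1.2.1.1.3.0   # sysUpTime.0
#   get_value public 192.168.2.1 .1.3.6.1.2.1.1.5.0   # sysName.0
```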

At Hochlland, I was mainly interested in a few things: Can all the devices still be reached? How many WLAN clients are connected to the individual APs? What do CPU and memory usage look like, and how many packets are passing through each device? You can add any other metrics that interest you.

To test the availability of the devices, ping makes more sense than SNMP. An unreachable system also causes SNMP to report a timeout, which you could use for further evaluation, but waiting for many SNMP queries to time out takes far longer than sending a couple of pings.

Keep It Short …

A MIB tree typically contains far more information than you actually need for your evaluation. You can do yourself a favor by restricting the output to less information.

The devices connected to individual routers or access points are provided by the ipNetToMediaEntry (OID 1.3.6.1.2.1.4.22.1) branch, for example. You can query available and used memory using OIDs 1.3.6.1.2.1.25.2.3.1.5.101 and 1.3.6.1.2.1.25.2.3.1.6.101; you can see the average CPU load of the last 15 minutes with the OID 1.3.6.1.4.1.2021.10.1.5.3; and OID 1.3.6.1.2.1.2.2.1 collects all the available information for the network interfaces. Again, it might be worthwhile to reduce the volume of data here: You will typically not want to log MTUs, interface designations, and so on every time.
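The two memory OIDs become easier to read when combined into a percentage. The following sketch assumes that both values are reported in the same allocation units, so that the ratio needs no conversion; the function names, community string, and address are hypothetical, and the storage index 101 is simply taken from the OIDs above and may differ on other devices.

```shell
#!/bin/bash
# Hypothetical helpers, not part of monitor.sh.

# Integer percentage of used storage.
mem_used_percent() {
  local used=$1 size=$2
  [ "$size" -gt 0 ] 2>/dev/null || return 1   # reject zero or non-numeric sizes
  echo $(( used * 100 / size ))
}

# Query the size (...5.<index>) and the used amount (...6.<index>),
# then print the percentage.
host_mem_percent() {
  local community=$1 host=$2 index=${3:-101} size used
  size=$(snmpget -v1 -Oqv -c "$community" "$host" ".1.3.6.1.2.1.25.2.3.1.5.$index") || return 1
  used=$(snmpget -v1 -Oqv -c "$community" "$host" ".1.3.6.1.2.1.25.2.3.1.6.$index") || return 1
  mem_used_percent "$used" "$size"
}

# Example call (placeholder community string and address):
#   host_mem_percent public 192.168.2.1
```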

Once you have established that snmpwalk or snmpget returns the results you need, you can bundle the commands into the script that is run later on as a cronjob. Snmpwalk offers another couple of options for truncating the output. In the sample script monitor.sh (Listing 4), the query uses -Oqs; that is, only the last element in the OID and the matching value are output.

Listing 4

monitor.sh

#! /bin/bash
#: Title: monitor.sh
#: Date: 28.01.2015
#: Author: Falko Benthin
#: Version: 1.0
#: Description: Sends SNMP requests to individual APs/routers and logs
#:              the output with timestamps for evaluation later
#: Options: none
# sends snmp requests to individual hosts
function checkMachines() {
  # ipNetToMediaPhysAddress
  snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST .1.3.6.1.2.1.4.22.1.3
  # memory_used
  snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST .1.3.6.1.2.1.25.2.3.1.6.101
  # CPU-load-1 snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST 1.3.6.1.4.1.2021.10.1.5.1
  # CPU-load-5 snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST 1.3.6.1.4.1.2021.10.1.5.2
  # CPU-load-15
  snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST .1.3.6.1.4.1.2021.10.1.5.3
  # wlan_clients
  snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST .1.3.6.1.4.1.2021.255.3.54.1.3.32.1.4
  # ifInOctets
  snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST .1.3.6.1.2.1.2.2.1.10
  # ifInUcastPkts
  snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST .1.3.6.1.2.1.2.2.1.11
  # ifInDiscards
  snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST .1.3.6.1.2.1.2.2.1.13
  # ifInErrors
  snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST .1.3.6.1.2.1.2.2.1.14
  # ifOutOctets
  snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST .1.3.6.1.2.1.2.2.1.16
  # ifOutUcastPkts
  snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST .1.3.6.1.2.1.2.2.1.17
  # ifOutDiscards
  snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST .1.3.6.1.2.1.2.2.1.19
  # ifOutErrors
  snmpwalk -v1 -Oqs -c $ROCOMMUNITY $HOST .1.3.6.1.2.1.2.2.1.20
  }
# Directory for logfiles
LOGDIR="/home/falko/monitorlog"
# community string
ROCOMMUNITY="community"
# date
YEAR=$( date +%Y )
MONTH=$( date +%m )
DAY=$( date +%d )
while read HOST DESC
do
  DATEDIR=$LOGDIR/$YEAR/$MONTH/$DAY
  # Directory for date
  if [ ! -d $DATEDIR ]; then
    mkdir -p $DATEDIR
  fi
  # check if host is reachable
  if ! ping -c3 $HOST > /dev/null; then
    if [ ! -e $LOGDIR/$HOST.lastmail.log ] || [ ! $( date -d @$( cat \
       $LOGDIR/$HOST.lastmail.log ) +%d ) = $DAY ]
    then
      printf "The AP/Router %s, %s is not reachable. Please check." \
             $HOST "$DESC" | mail -s "Check AP/Router" recp1@samplemail.org \
             recp2@samplemail.org recp3@samplemail.org
      echo $( date +%s ) > $LOGDIR/$HOST.lastmail.log
    fi
  else
    # SNMP checks and logging
    checkMachines | \
    while read OUTPUT
    do
      printf "%s %s\n" $( date +%T ) "$OUTPUT" >> $DATEDIR/$HOST.log
    done
  fi
done < machines.txt
exit 0

The script bundles the individual queries into a function so that you only need to modify one part if the requirements change. You can save the hosts you want to monitor with their IP addresses and a description in a text file, which the script reads as machines.txt (Listing 5). The description contains the device type and location so that a member of staff who is not familiar with the setup still knows where to look. Finally, monitor.sh is called regularly as a cronjob.
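Because monitor.sh opens machines.txt with a relative path, a cron entry should change to the directory that holds both files before starting the script. The following sketch appends such an entry to the current user's crontab; the function name, the directory /home/falko/monitor, and the five-minute interval are assumptions for illustration only and are not taken from the original setup.

```shell
#!/bin/bash
# Hypothetical helper: add a cron entry for monitor.sh to the current
# user's crontab. Directory and interval are assumptions.
add_monitor_cronjob() {
  local script_dir=${1:-/home/falko/monitor}   # assumed home of monitor.sh and machines.txt
  local schedule=${2:-"*/5 * * * *"}           # assumed: every five minutes
  local entry="$schedule cd $script_dir && ./monitor.sh"
  # Keep existing entries, drop an older copy of the same line, then append it.
  { crontab -l 2>/dev/null | grep -vF -- "$entry"; echo "$entry"; } | crontab -
}

# Example call:
#   add_monitor_cronjob /home/falko/monitor "*/10 * * * *"
```

Filtering out an identical line first means that calling the function twice does not create duplicate jobs.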

Listing 5

IP Addresses of Hosts

192.168.2.1   Modem in the office
192.168.2.2   Picostation roof
192.168.10.1  AP seminar building
192.168.10.4  AP Hochlland canteen
192.168.13.1  Picostation new building
192.168.13.2  AP new building first floor
192.168.13.3  AP new building second floor
192.168.13.4  AP new building top

Logfiles

If guests complain that the Internet is slow, a quick check of the logfiles for the day in question helps to identify potential issues or discover whether you need additional information.
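Interface counters such as ifInOctets are cumulative, so the interesting figure in such a case is usually the difference between two samples. The following sketch assumes the log format written by monitor.sh (time, counter name with interface index, value), that the counter label is passed exactly as it appears in the log, and that a 32-bit counter wrapped at most once between the first and last sample. The function name octet_delta and the path in the example call are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print how much a counter grew between its first
# and last entry in one logfile.
octet_delta() {
  local logfile=$1 counter=$2
  awk -v c="$counter" '
    $2 == c {
      if (!seen) { first = $3; seen = 1 }   # remember the first sample
      last = $3                             # keep overwriting with the latest
    }
    END {
      if (!seen) exit 1                     # counter not found in the log
      d = last - first
      if (d < 0) d += 4294967296            # assume one 32-bit wraparound
      printf "%.0f\n", d
    }' "$logfile"
}

# Example call (hypothetical path and interface index):
#   octet_delta /home/falko/monitorlog/2015/01/28/192.168.2.1.log ifInOctets.2
```

Dividing the result by the number of seconds between the two timestamps gives an average byte rate for the period.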

Moreover, you can check whether there are WLAN clients that connect suspiciously frequently to your APs and potentially need special treatment. You might want to query more information from the WLAN AP in question and define a firewall rule for the client on that basis.

In the course of time, you will collect a large volume of log data. A script that runs once a day, compress_and_delete.sh (Listing 6), helps save storage by compressing the previous day's logs with gzip and deleting the logs after 30 days.

Listing 6

compress_and_delete.sh

#! /bin/bash
#: Title: compress_and_delete.sh
#: Date: 28.01.2015
#: Author: Falko Benthin
#: Version: 1.0
#: Description: Compresses old logs and deletes very old logs
#: Options: none
# Directory for logfiles
LOGDIR="/home/falko/monitorlog"
# yesterday's date
YEAR=$( date -d "yesterday" +%Y )
MONTH=$( date -d "yesterday" +%m )
DAY=$( date -d "yesterday" +%d )
# Compress yesterday's logs
gzip $LOGDIR/$YEAR/$MONTH/$DAY/*log
# Delete logfiles older than 30 days (files only; directories are left alone)
find "$LOGDIR" -type f -mtime +30 -exec rm -f {} +
exit 0
