Tracking latency, loss, and jitter with SmokePing

Most network monitoring tools retry failed requests. snmpwalk retries SNMP queries that time out, giving the agent several chances to respond. Nagios lets you configure how often it retries a failed check, and deliberately delays alarms to ride out transient issues. You do not want your pager going off at 3AM because something dropped a single packet! Losing a packet or two on occasion is fine, but losing one or two every time you run a check is a problem, and most monitoring tools can’t tell the difference. Don’t just crank up your monitoring software’s loss tolerance. You must know how often your network drops requests. That’s where SmokePing comes in. SmokePing measures loss, latency, and jitter for ICMP and application-level requests.
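To make those three numbers concrete, here’s a rough shell sketch of the arithmetic, not SmokePing’s actual code. It takes one line in the format that fping -C prints (a host, a colon, then one round-trip time in milliseconds per ping, with "-" for a lost packet) and reports loss, median latency, and the gap between the fastest and slowest reply as a crude stand-in for jitter. The function name summarize and the output format are my own inventions.

```shell
#!/bin/sh
# summarize: hypothetical helper, not part of SmokePing or fping.
# Input: one line of `fping -C` style output, for example
#   router6 : 1.20 1.35 - 1.28 1.90 1.22
summarize() {
    printf '%s\n' "$1" | awk '
    {
        sent = 0; got = 0
        # Field 1 is the host, field 2 is the colon, the rest are samples.
        for (i = 3; i <= NF; i++) {
            sent++
            if ($i != "-") rtt[++got] = $i + 0   # "-" means no reply
        }
        if (got == 0) { printf "%s loss=100%%\n", $1; next }
        # Insertion sort of the replies so the median can be picked out.
        for (i = 2; i <= got; i++) {
            v = rtt[i]
            for (j = i - 1; j >= 1 && rtt[j] > v; j--) rtt[j + 1] = rtt[j]
            rtt[j + 1] = v
        }
        if (got % 2) median = rtt[(got + 1) / 2]
        else median = (rtt[got / 2] + rtt[got / 2 + 1]) / 2
        # Loss is truncated to a whole percent; spread is slowest minus fastest.
        printf "%s loss=%d%% median=%.2fms spread=%.2fms\n",
            $1, (sent - got) * 100 / sent, median, rtt[got] - rtt[1]
    }'
}

# Example with live data, assuming fping is installed (it reports on stderr):
#   summarize "$(fping -C 20 -q router6.blackhelicopters.org 2>&1)"
```

A single run like this tells you very little. The point of SmokePing is that it repeats the measurement every few minutes and graphs the history.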

SmokePing is in the FreeBSD ports as /usr/ports/net-mgmt/smokeping, OpenBSD ports as /usr/ports/net/smokeping, and NetBSD as /usr/pkgsrc/net/smokeping. My example server is FreeBSD 9, with SmokePing 2.4.2.

The SmokePing port offers several different probes, or utilities for performing checks. In this example we’ll use the default probe, fping. While other probes, such as measuring DNS response time, are useful, they don’t address today’s day job problem.

SmokePing is configured in /usr/local/etc/smokeping/config. The config file is a little different from most; it’s neither XML-ish nor C-esque. A hash mark is still a comment. Three asterisks on either side of a name mark off a configuration section. SmokePing uses a hierarchical configuration for monitoring hosts, and an item’s depth in the hierarchy is dictated by the number of plus signs before it. Variables are set with equals signs. It’s easy enough once you work through it a bit.

Here are the basic settings:


*** General ***
owner = mwlucas
contact = mwlucas@blackhelicopters.org
mailhost = mail.blackhelicopters.org
sendmail = /usr/sbin/sendmail

The Web interface needs some paths. I put my Web sites under /var/www/site/application. On this server, I want any local SmokePing stuff under /var/www/monitor/smoke. I’ll also use Apache aliases to direct part of the site to the directory where the port installed the files.

imgcache = /var/www/monitor/smoke/images
imgurl = https://monitor.blackhelicopters.org/smoke-images/
datadir = /var/db/smoke
piddir = /usr/local/var/smokeping/
cgiurl = https://monitor.blackhelicopters.org/smoke/smokeping.cgi
smokemail = /usr/local/etc/smokeping/smokemail
tmail = /usr/local/etc/smokeping/tmail
# specify this to get syslog logging
syslogfacility = local0

Create the directories assigned to datadir and imgcache. The user smokeping must own the datadir directory. The Web server user (www) must own the imgcache directory.

As a general rule, I don’t permit applications to write to files in the same directory that they’re installed in. Doing so interferes with package management and adds to security problems. Perhaps that’s not such a big concern these days, but I’m kind of old-school.

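Configure /etc/syslog.conf to log local0 to /var/log/smokeping. FreeBSD’s syslogd won’t create a missing log file by default, so touch /var/log/smokeping and restart syslogd after adding the following line.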

local0.* /var/log/smokeping

I’m not configuring alarms right now, so you can comment out the line *** Alerts *** and everything beneath it until the next section. Similarly, comment out the entire *** Slaves *** section.

Leave “Presentation” and “Database” alone, unless you a) understand RRD and want to muck with the innards of how SmokePing stores its data, and b) understand SmokePing. If you’re reading this article to learn about SmokePing, you automatically fail b).

Under the Probes header, ensure the path to FPing is correct.
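If you’d rather check than eyeball it, here’s a small shell sketch that pulls the binary setting out of the FPing probe block so you can test whether the file exists. The function name fping_binary is mine, not anything SmokePing ships, and it assumes the probe block looks like the one in the sample config: a "+ FPing" line followed by a "binary = ..." line.

```shell
#!/bin/sh
# fping_binary: hypothetical helper that prints the "binary" setting
# found under the "+ FPing" probe block of a SmokePing config file.
fping_binary() {
    awk '
        /^[+][ \t]*FPing[ \t]*$/ { in_probe = 1; next }  # start of the FPing block
        /^[+*]/                  { in_probe = 0 }        # another probe or section begins
        in_probe && /^[ \t]*binary[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")                     # strip "binary = "
            print
            exit
        }
    ' "$1"
}

# Example, assuming the config lives where the FreeBSD port puts it:
#   [ -x "$(fping_binary /usr/local/etc/smokeping/config)" ] || echo "fping path is wrong"
```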

The interesting bit is the Targets section. Here’s where you define which hosts you want to ping. SmokePing uses a hierarchical configuration that sets both which hosts you monitor and how the results are displayed.

*** Targets ***
probe = FPing

menu = Top
title = Network Latency Grapher
remark = Welcome to BH.org SmokePing.

This header tells SmokePing that we’re configuring objects to be checked with FPing. We set a menu section and title, then proceed to the first target.


+ Southfield
menu = Southfield
title = Southfield

++ router6
host = router6.blackhelicopters.org
++ router8
host = router8.blackhelicopters.org

+ chi
menu = Chicago
title = Chicago
++ chi-1
host = chi-1.blackhelicopters.org

Here I’ve set up two first-level menus, Southfield (a suburb of Detroit) and Chicago. The Southfield menu has two entries beneath it, and Chicago has one. Each sub-entry has a name (indicated with ++) and a host. SmokePing will check these routers with FPing, and will create an interactive menu on the Web site arranging them as you have here.
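To see how the plus signs turn into a hierarchy, here’s a shell sketch that walks the Targets section and prints each host next to its position in the tree. It’s an illustration of the nesting rule only, not how SmokePing parses its config, and the function name list_targets is my own.

```shell
#!/bin/sh
# list_targets: hypothetical helper that prints "section/target host"
# for every host line in the Targets section of a SmokePing config file.
list_targets() {
    awk '
        # A "*** Name ***" line switches sections; only Targets matters here.
        /^[*][*][*]/ { in_targets = ($0 ~ /Targets/); next }
        !in_targets  { next }
        # The number of leading plus signs is the depth of the entry.
        /^[+]+/ {
            match($0, /^[+]+/)
            depth = RLENGTH
            name = substr($0, RLENGTH + 1)
            gsub(/[ \t]/, "", name)
            path[depth] = name
            next
        }
        # A host line belongs to the most recent entry at the current depth.
        /^[ \t]*host[ \t]*=/ {
            host = $0
            sub(/^[^=]*=[ \t]*/, "", host)
            p = path[1]
            for (i = 2; i <= depth; i++) p = p "/" path[i]
            print p, host
        }
    ' "$1"
}

# Example, assuming the config path used in this article:
#   list_targets /usr/local/etc/smokeping/config
```

Run against the targets above, this lists router6 and router8 under Southfield and chi-1 under chi.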

Set smokeping_enable=YES in /etc/rc.conf, and run /usr/local/etc/rc.d/smokeping start. Check /var/log/smokeping (you did set up syslog, didn’t you?) for any errors.

Now the Web interface. FreeBSD’s package installed SmokePing’s CGI and related files in /usr/local/smokeping/htdocs. I want to use /var/www/monitor/smoke/images/ as the image cache. My httpd.conf for this is:

Alias /smoke/ "/usr/local/smokeping/htdocs/"
<Directory "/usr/local/smokeping/htdocs/">
Options ExecCGI
AllowOverride None
Allow from All
AddHandler cgi-script cgi
</Directory>
Alias /smoke-images/ "/var/www/monitor/smoke/images/"

I control access to my network management Web sites with LDAP. If you want to restrict with Apache’s IP address ACLs instead, change the Allow from All to something more suitable. Don’t open SmokePing to the world. Your customers and/or users will find it and ask a lot of inconvenient questions.

SmokePing creates graphs showing the median ping latency as a green line, with smoky grey/black bars showing how widely the individual round-trip times vary, which is your jitter. When SmokePing loses packets, the line color changes.

I’ll probably write more about SmokePing, as this barely scratches the surface. Tracking things like DNS query latency can help narrow down server-side problems.