Fail Quickly

I’ve started the next book for No Starch Press. There’s an outline, and I’ve written both the introduction and the afterword. All that’s left is the hard stuff in between, twenty-some chapters of it.

Where to start writing? That’s easy: First, I write the stuff that’s most likely to make the book fail.

Every project has easy parts that are fun and go quickly. Those are the tasks you’re most familiar with, that leverage your existing skills. Then there are the parts that require you to learn new things, or demand that you actually spend time and energy breaking them down so others can understand you. These are the parts most likely to make the project fail. I want to get those parts over with as quickly as possible.

If the entire book is going to collapse because four chapters are impossible to write, it’s better to know that up front than after I’ve written the eighteen easy chapters leading up to them. I’m writing about 500 words an hour on this part, where I normally write 1000 words an hour. It’s drudgery, but they’ll get done.

I’ve seen a lot of IT projects fail by spending their initial burst of energy on the easy stuff. If you do the easy part first, the hard part gets time to grow in your mind. You’ll spend energy dreading it. Worse, the time you spend doing the easy stuff might be completely wasted — after all, if you can’t do the hard part, then you have to throw everything else away. You can always do the easy part after you succeed at the hard bit, and it’ll make the rest of the project go more quickly.

Now if you’ll excuse me, I have to finish this section of this chapter tonight…

Public Service Announcement on Painting Old Brick

A modern hand scraper and wire brush can strip peeling, mildewy paint from a concrete basement wall almost easily — at least, much easier than when I was a kid and had to do the same job with a pointed stick and piece of chalk. The equipment comes with warnings in big black letters. “Wear Goggles!” “Wear Gloves!” “May Sever Fingers!” And so on. You don’t want to get a flying paint chip in your eye.

Unfortunately, it doesn’t come with a warning that says “Keep Mouth Shut.”

Describing the taste of a hundred-year-old mildewed paint chip as “Lovecraftian” would leave me without adequate vocabulary to describe the texture.

The moral is: when you need to shut up and do the job, don’t forget the “shut up” part.

Adding IPv6 to a FreeBSD Mail/Web Server

We’ve run out of IPv4 addresses. If you’re not already on IPv6, start hoarding gasoline and canned potted meat food product. Doomsday is here, film at eleven. Or, failing that, start running IPv6 on something so you can have a little familiarity with the new Internet protocol before you absolutely must. My personal FreeBSD 9 server (which hosts my email, this blog, web sites for my books, and a whole bunch of other equally trivial cruft) is now IPv6-enabled, even though the local site doesn’t have IPv6 connectivity. Here’s how I did it.

Establishing IPv6 connectivity to and from an IPv4-only server requires these steps:

  • Get an IPv6 tunnel from a tunnel provider
  • Configure a generic IPv4 tunnel to the tunnel provider
  • Assign IPv6 addresses to your IPv4 generic tunnel
  • Assign your IPv6 default route over the tunnel
  • Establish IPv6 DNS resolution
  • Configure services to run on IPv6
  • Offer IPv6 DNS records

    If you’re reading this, you probably don’t have IPv6 at your facility. You’ll need an IPv6 tunnel, offered for free by many providers. I used Hurricane Electric, but use any broker you like. Sign up for an account, respond to the verification mail, and request a tunnel. The Web interface will give you a bunch of details about your tunnel.

    The gif interface provides a generic IPv4 tunnel that can be used for many protocols. Configuring an IPv4 tunnel requires only the IP addresses on each end. ifconfig(8) creates a tunnel with just:

    # ifconfig gif0 tunnel 198.22.63.8 209.51.181.2

    You must be able to ping the tunnel’s remote address.

    Now assign IPv6 addresses to your gif0 tunnel.

    # ifconfig gif0 inet6 your-IPv6-address remote-IPv6-address prefixlen 128

    For example, my HE-assigned IPv6 tunnel endpoint is 2001:470:1f10:b9c::2. The he.net IPv6 address is 2001:470:1f10:b9c::1. I assign my IPv6 addresses as:

    # ifconfig gif0 inet6 2001:470:1f10:b9c::2 2001:470:1f10:b9c::1 prefixlen 128

    Verify that your IPv6 addresses are correctly configured by using ping6 to hit the far end. Remember, standard ping will not work — ping is specific to IPv4.

    # ping6 2001:470:1f10:b9c::1
    PING6(56=40+8+8 bytes) 2001:470:1f10:b9c::2 --> 2001:470:1f10:b9c::1
    16 bytes from 2001:470:1f10:b9c::1, icmp_seq=0 hlim=64 time=19.209 ms
    16 bytes from 2001:470:1f10:b9c::1, icmp_seq=1 hlim=64 time=21.661 ms

    At this point, you have IPv6. Now assign the IPv6 default route to the remote end of the tunnel.

    # route -n add -inet6 default 2001:470:1f10:b9c::1

    Your server will now send all IPv6 traffic across your IPv4 tunnel, while still routing IPv4 traffic as usual. Remember, IPv4 and IPv6 are different protocols.

    Some Internet sites, such as Google, have special requirements for accessing their IPv6 DNS. Your tunnel broker provides an IPv6-aware DNS server. Now that you have a default route, see if you can ping6 it. If you can ping the DNS server, edit /etc/resolv.conf. Remove your IPv4 nameservers. Add the IPv6 nameserver. Check DNS for IPv4 (A records) and IPv6 (AAAA records) with dig(1).

    # dig www.google.com A

    ;; ANSWER SECTION:
    www.google.com. 20478 IN CNAME www.l.google.com.
    www.l.google.com. 222 IN A 209.85.225.99
    www.l.google.com. 222 IN A 209.85.225.147
    www.l.google.com. 222 IN A 209.85.225.104
    www.l.google.com. 222 IN A 209.85.225.105
    www.l.google.com. 222 IN A 209.85.225.103
    www.l.google.com. 222 IN A 209.85.225.106

    This looks correct. Let’s try AAAA records.

    # dig www.google.com AAAA

    www.google.com. 20368 IN CNAME www.l.google.com.
    www.l.google.com. 180 IN AAAA 2001:4860:b007::63

    This is an IPv6 answer. Google has fewer IPv6 servers than IPv4 servers, but that’s to be expected these days.

    Now configure services on your server to listen on IPv6 addresses. Daemons included in FreeBSD listen to IPv6 by default. Run sockstat -6 to see what programs are listening to your new IPv6 address. In my case, Apache only listened to IPv4. At some point in the foggy past, I had turned off IPv6 when configuring the port. I rebuilt devel/apr1 and www/apache22 with IPv6 support, restarted Apache, and it listened to my IPv6 address without issue.

    Last, you must publish AAAA records for the hosts you want to offer over IPv6. By adding AAAA records gradually, you can slowly increase the amount of traffic you deliver over IPv6.

    www IN A 198.22.63.8
    www IN AAAA 2001:470:1f10:b9c::2

    Properly-configured hosts will attempt to connect to services on IPv6 first. If those connection attempts fail, they will try IPv4 instead.
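
The IPv6-first, IPv4-fallback behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular operating system or resolver picks addresses; the helper names `order_candidates` and `connect_prefer_ipv6` are made up for this example.

```python
import socket


def order_candidates(addrinfos):
    """Return getaddrinfo() results with IPv6 entries ahead of IPv4 ones.

    sorted() is stable, so the resolver's order is kept within each family.
    """
    return sorted(addrinfos, key=lambda info: info[0] != socket.AF_INET6)


def connect_prefer_ipv6(host, port, timeout=5.0):
    """Try every IPv6 address first, then fall back to IPv4 (hypothetical helper)."""
    candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    last_error = None
    for family, socktype, proto, _name, sockaddr in order_candidates(candidates):
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as err:
            # This address family is not usable on this host; try the next one.
            last_error = err
            continue
        try:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as err:
            last_error = err
            sock.close()
    raise last_error or OSError("no addresses found for %s" % host)
```

Calling `connect_prefer_ipv6("www.example.org", 80)` on a host with a working tunnel should return a socket connected over IPv6 whenever the name has a reachable AAAA record, and an IPv4 socket otherwise.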

    To make your FreeBSD changes permanent, use your addresses in the /etc/rc.conf entries below.

    gif_interfaces="gif0"
    gifconfig_gif0="198.22.63.8 209.51.181.2"
    ipv6_network_interfaces="gif0 lo0"
    ifconfig_gif0_ipv6="inet6 2001:470:1f10:b9c::2 2001:470:1f10:b9c::1 prefixlen 128"
    ipv6_defaultrouter="2001:470:1f10:b9c::1"

    Lastly, tell your users that you have IPv6. Otherwise, nobody will notice. It’s that transparent.

  • tracking latency, loss, and jitter with SmokePing

    Most network monitoring tools retry failed connections. snmpwalk sends multiple SNMP queries, giving the agent multiple chances to respond. Nagios lets you configure how often you retry queries, and specifically delays alarms to avoid transient issues. You do not want your pager going off at 3AM because something dropped a single packet! Losing a packet or two on occasion is fine, but losing one or two every time you run a check is a problem — and most monitoring tools can’t tell the difference. Don’t just crank up your monitoring software’s loss tolerance. You must know how often your network drops requests. That’s where SmokePing comes in. SmokePing measures loss, latency, and jitter for ICMP and application-level requests.
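
To make the three measurements concrete, here is a rough Python sketch that summarizes one round of pings. It is not SmokePing's code. The function name `summarize_probe` is invented, and the definitions are assumptions chosen for illustration: latency is the median reply time, and jitter is the spread between the fastest and slowest reply.

```python
from statistics import median


def summarize_probe(rtts_ms):
    """Summarize one round of pings.

    rtts_ms holds one entry per packet sent: the round-trip time in
    milliseconds, or None if no reply came back.
    """
    sent = len(rtts_ms)
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    loss_pct = 100.0 * (sent - len(replies)) / sent if sent else 0.0
    latency = median(replies) if replies else None
    # Assumption: jitter is the gap between the fastest and slowest reply.
    jitter = max(replies) - min(replies) if replies else None
    return {"loss_pct": loss_pct, "latency_ms": latency, "jitter_ms": jitter}


# Five packets sent, the third one lost.
print(summarize_probe([19.2, 21.7, None, 20.1, 19.8]))
```

Running a summary like this on every check, and keeping the history, is what lets you tell an occasional dropped packet from a link that drops one every time.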

    SmokePing is in the FreeBSD ports as /usr/ports/net-mgmt/smokeping, OpenBSD ports as /usr/ports/net/smokeping, and NetBSD as /usr/pkgsrc/net/smokeping. My example server is FreeBSD 9, with SmokePing 2.4.2.

    The SmokePing port offers several different probes, or utilities for performing checks. In this example we’ll use the default probe, fping. While other probes, such as measuring DNS response time, are useful, they don’t address today’s day job problem.

    SmokePing is configured in /usr/local/etc/smokeping/config. The config file is a little different than most; it’s neither XML-ish nor C-esque. A hash mark is still a comment. Three asterisks mark off a configuration section. SmokePing uses a hierarchical configuration for monitoring hosts, and an item’s depth in the hierarchy is dictated by the number of plus signs before it. Variables are set with equals signs. It’s easy enough once you work through it a bit.

    Here are the basic settings:


    *** General ***
    owner = mwlucas
    contact = mwlucas@blackhelicopters.org
    mailhost = mail.blackhelicopters.org
    sendmail = /usr/sbin/sendmail

    The Web interface needs some paths. I put my Web sites under /var/www/site/application. On this server, I want any local SmokePing stuff under /var/www/monitor/smoke. I’ll also use Apache aliases to direct part of the site to the directory where the port installed the files.

    imgcache = /var/www/monitor/smoke/images
    imgurl = https://monitor.blackhelicopters.org/smoke-images/
    datadir = /var/db/smoke
    piddir = /usr/local/var/smokeping/
    cgiurl = https://monitor.blackhelicopters.org/smoke/smokeping.cgi
    smokemail = /usr/local/etc/smokeping/smokemail
    tmail = /usr/local/etc/smokeping/tmail
    # specify this to get syslog logging
    syslogfacility = local0

    Create the directories assigned to datadir and imgcache. The user smokeping must own the directory assigned to datadir. The Web server user (www) must own the imgcache directory.

    As a general rule, I don’t permit applications to write to files in the same directory that they’re installed in. That interfered with package management and added to security problems. Perhaps that’s not such a big concern these days, but I’m kind of old-school.

    Configure /etc/syslog.conf to log local0 to /var/log/smokeping.

    local0.* /var/log/smokeping

    I’m not configuring alarms right now, so you can comment out the line *** Alerts *** and everything beneath it until the next section. Similarly, comment out the entire *** Slaves *** section.

    Leave “Presentation” and “Database” alone, unless you a) understand RRD and want to muck with the innards of how SmokePing stores its data, and b) understand SmokePing. If you’re reading this article to learn about SmokePing, you automatically fail b).

    Under the Probes header, ensure the path to FPing is correct.

    The interesting bit is the Targets section. Here’s where you define which hosts you want to ping. SmokePing uses a hierarchical configuration that both lists the hosts you want to monitor and how you want the results displayed.

    *** Targets ***
    probe = FPing

    menu = Top
    title = Network Latency Grapher
    remark = Welcome to BH.org SmokePing.

    This header tells SmokePing that we’re configuring objects to be checked with FPing. We set a menu section and title, then proceed to the first target.


    + Southfield
    menu = Southfield
    title = Southfield

    ++ router6
    host = router6.blackhelicopters.org
    ++ router8
    host = router8.blackhelicopters.org

    + chi
    menu = Chicago
    title = Chicago
    ++ chi-1
    host=chi-1.blackhelicopters.org

    Here I’ve set up two first-level menus, Southfield (a suburb of Detroit) and Chicago. The Southfield menu has two entries beneath it. Each sub-entry has a title (indicated with ++) and a host. SmokePing will check these routers with FPing, and will create an interactive menu on the Web site arranging them as you have here.
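
If the plus-sign nesting is hard to picture, this Python sketch turns a Targets fragment into a nested dictionary. It is a simplified reader written for this article, not SmokePing's own parser: `parse_targets` is a made-up name, and it only understands plus-sign headers and key = value lines.

```python
def parse_targets(lines):
    """Build a nested dict from SmokePing-style '+' section markers.

    Simplified sketch: handles '+name' headers and 'key = value' lines only.
    """
    root = {"settings": {}, "children": {}}
    stack = [root]  # stack[d] is the node currently open at depth d
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("***"):
            continue
        if line.startswith("+"):
            depth = len(line) - len(line.lstrip("+"))
            name = line.lstrip("+").strip()
            node = {"settings": {}, "children": {}}
            del stack[depth:]  # close sections at this depth or deeper
            stack[-1]["children"][name] = node
            stack.append(node)
        elif "=" in line:
            key, value = line.split("=", 1)
            stack[-1]["settings"][key.strip()] = value.strip()
    return root


sample = """
+ Southfield
menu = Southfield
title = Southfield

++ router6
host = router6.blackhelicopters.org
++ router8
host = router8.blackhelicopters.org

+ chi
menu = Chicago
title = Chicago
++ chi-1
host=chi-1.blackhelicopters.org
""".splitlines()

tree = parse_targets(sample)
for name, node in tree["children"].items():
    print(name, "->", sorted(node["children"]))
```

Each first-level entry becomes a menu, and each second-level entry hangs beneath the most recent first-level one, which is the same arrangement the Web interface shows.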

    Set smokeping_enable=YES in /etc/rc.conf, and run /usr/local/etc/rc.d/smokeping start. Check /var/log/smokeping (you did set up syslog, didn’t you?) for any errors.

    Now the Web interface. FreeBSD’s package installed SmokePing’s CGI and related files in /usr/local/smokeping/htdocs. I want to use /var/www/monitor/smoke/images/ as the image cache. My httpd.conf for this is:

    Alias /smoke/ "/usr/local/smokeping/htdocs/"
    <Directory "/usr/local/smokeping/htdocs/">
    Options ExecCGI
    AllowOverride None
    Allow from All
    AddHandler cgi-script cgi
    </Directory>
    Alias /smoke-images/ "/var/www/monitor/smoke/images/"

    I control access to my network management Web sites with LDAP. If you want to restrict with Apache’s IP address ACLs instead, change the Allow from All to something more suitable. Don’t open SmokePing to the world. Your customers and/or users will find it and ask a lot of inconvenient questions.

    SmokePing creates graphs indicating the average ping request latency in a green line, with smoky grey/black bars indicating jitter. When SmokePing loses packets, the line color changes.

    I’ll probably write more about SmokePing, as this barely scratches the surface. Tracking things like DNS query latency can help narrow down server-side problems.

    Microsoft’s BSD support

    On the NetBSD blog you’ll find an announcement that Microsoft has donated working code to support an experimental hardware platform to NetBSD.

    Microsoft has a mixed relationship with open source software. There are the perennial discussions about Windows using BSD’s TCP/IP stack, .NET for FreeBSD, Microsoft buying and killing a NetBSD-based phone, and any amount of blather ranging from the absurd to the paranoid. What makes this different?

    First, it’s a gift. No strings attached — the BSD license doesn’t support strings. Copyright has been assigned to the NetBSD Foundation. It’s ours now, and there’s nothing Microsoft — or anyone — can do to take it back.

    Second, the extensible MIPS hardware can be reconfigured in software to support application-specific tasks. This is cool. I’m sure that someone will tell me that this was done twenty years ago and that the prior work has been unfairly ignored since, and someone else will tell me that this is really no big deal, but it sure sounds interesting to my uneducated ears.

    Third, NetBSD support will help get extensible MIPS running on other BSD platforms, and to a lesser extent on other operating systems. If the hardware ever becomes widespread, that is.

    I doubt that this means any sea change in Microsoft’s relationship with open source. This code is of limited use today, given the scarcity of hardware. Microsoft Research offering eMIPS patches would not surprise me, but there’s a difference between cooperation in research and cooperation anywhere else.

    upgrading to OpenBSD-current, the stupid way

    My desktop runs an OpenBSD snapshot from April 2010. It’s well past time I upgraded. OpenBSD’s usual upgrade path works quite well, but I’m simultaneously lazy and willing to reinstall this system from scratch if something ghastly happens. (This might also invalidate any bug report you send.)

    Don’t do this if you have any need or respect for your computer. I treat my desktop with a mix of indifference and contempt, so I’ll proceed.

    Back up your data. I attached my external 1TB USB drive. /var/log/messages shows:

    Jan 21 10:08:17 avarice /bsd: sd0 at scsibus2 targ 1 lun 0: SCSI2 0/direct fixed
    Jan 21 10:08:17 avarice /bsd: sd0: 953869MB, 512 bytes/sec, 1953525168 sec total

    It’s device sd0. What partitions are on it?

    $ sudo disklabel sd0
    ...16 partitions:
    # size offset fstype [fsize bsize cpg]
    c: 1953525168 0 unused
    i: 1953520002 63 MSDOS

    I want to mount sd0i.

    $ sudo mount_msdos /dev/sd0i /mnt/
    $ cd /home
    $ sudo gtar -cvMf /mnt/laptop.tar mwlucas

    One annoyance with using an MSDOS-formatted disk for backup is that you can’t have a file larger than 4GB. My home directory is multiple times that. I must use gtar to back up my home directory, and use the multiple-volumes option. When gtar completes a 4GB file, it asks me to prepare a new volume. Move the existing backup file to a different file, then hit return to have gtar continue.

    While that’s running, let’s get the download files. Go to the OpenBSD mirror list and choose one near you. Use a web browser to verify that the snapshot on the site is current. Open an FTP session to that site, and grab all the bsd* and *.tgz files.

    ftp> cd pub/OpenBSD/snapshots/amd64
    250 Directory successfully changed.
    ftp> prompt
    Interactive mode off.
    ftp> mget bsd*
    wait
    ftp> mget *.tgz
    wait…

    Verify the checksums of the downloaded files against the checksums in the SHA256 file on the FTP site.

    $ cksum -a sha256 *
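
Comparing a screenful of hashes by eye is error-prone, so here is a small Python sketch that does the comparison. It assumes you also downloaded the SHA256 file into the same directory as the filesets, and that its lines use the BSD digest format `SHA256 (filename) = hexdigest`. The function names are invented for this example.

```python
import hashlib
import re
from pathlib import Path

# BSD-style digest line, e.g. "SHA256 (bsd.rd) = 0123abcd..."
DIGEST_LINE = re.compile(r"^SHA256 \((?P<name>.+)\) = (?P<digest>[0-9a-f]{64})$")


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large filesets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_snapshot(directory, checksum_file="SHA256"):
    """Return {filename: "ok" or "MISMATCH"} for each listed file that is present."""
    directory = Path(directory)
    results = {}
    for line in (directory / checksum_file).read_text().splitlines():
        match = DIGEST_LINE.match(line.strip())
        if not match:
            continue
        target = directory / match.group("name")
        if not target.exists():
            continue  # listed in SHA256 but not downloaded
        ok = sha256_of(target) == match.group("digest")
        results[match.group("name")] = "ok" if ok else "MISMATCH"
    return results
```

Run `verify_snapshot(".")` from the download directory and refuse to go further if any value is "MISMATCH".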

    I have backups. I have the files, and they aren’t corrupt. We are now at the point of no return. You can still follow the recommended upgrade procedure. I encourage you to do so.

    Shut down all unnecessary processes. If you’re forwarding packets, stop. If you’re in X, exit to a text console. Kill all daemons that aren’t necessary for a minimally-running system.

    Copy your desired kernel to the root directory. I’m using the multiprocessor kernel. Also save a copy of your current reboot command.

    $ rm /obsd ; ln /bsd /obsd && cp bsd.mp /nbsd && mv /nbsd /bsd
    $ cp bsd.rd /
    $ cp bsd /bsd.sp
    $ cp /sbin/reboot /sbin/oreboot

    Now overwrite the nonessential parts of your userland.

    $ tar -C / -xzvphf xserv49.tgz
    $ tar -C / -xzphf xfont49.tgz
    $ tar -C / -xzphf xshare49.tgz
    $ tar -C / -xzphf xbase49.tgz
    $ tar -C / -xzphf game49.tgz
    $ tar -C / -xzphf comp49.tgz
    $ tar -C / -xzphf man49.tgz

    Do not extract the etc49.tgz distribution, as that will overwrite your core system configuration! You must update /etc separately.

    Update the core programs last. The core system includes programs like tar and reboot. Once you update the core, your system is running a new userland on an old kernel.

    $ tar -C / -xzphf base49.tgz

    Your system is now basically unusable; you have new binaries running on an old kernel. You must reboot now. Afterwards, I’m running:

    OpenBSD 4.9-beta (GENERIC.MP) #777: Tue Jan 18 13:56:34 MST 2011

    Generate the new device nodes.

    $ cd /dev/
    $ sudo ./MAKEDEV all

    I prefer to reboot after recreating device nodes. The new reboot command is now usable. After the next reboot everything looks fine, except for this message:

    Could not load host key: /etc/ssh/ssh_host_ecdsa_key

    So, there’s a new key type. I’ll get that as I upgrade /etc, by running sysmerge(8). Go to the snapshot directory and run:

    $ sudo sysmerge -s etc49.tgz -x xetc49.tgz

    Sysmerge will compare your installed /etc with the snapshot fileset and show you the diffs. You can install the new file, delete the new file, or merge the two together. If you’ve used mergemaster(8), sysmerge(8) will be no surprise.

    Then reboot again. With the new /etc, OpenBSD automatically generates the missing SSH key for the new crypto algorithm.

    My system is now upgraded.

    In the interest of sanity, I need to remove and reinstall all the packages on this system. This isn’t a big deal, except for those few that must be built as ports because I require something unusual. Set PKG_PATH to the packages directory of your closest FTP mirror and run pkg_add -ui.

    $ sudo pkg_add -iu
    quirks-1.32: ok
    ORBit2-2.14.19:libiconv-1.13p0->libiconv-1.13p2: ok
    ORBit2-2.14.19:pcre-7.9->pcre-8.02p1: ok
    ORBit2-2.14.19:libgamin-0.1.10->libgamin-0.1.10p3: ok
    ORBit2-2.14.19:gettext-0.17p0->gettext-0.18.1p0: ok
    ...

    Walk away.

    In this particular case, pkg_add crashed when my chosen FTP mirror limited the number of successive connections from my IP address. I raised this on misc@, and got an answer and a fix almost immediately.

    So, even fools like me can get help. But don’t count on it.

    mod_security2 case sensitive?

    I’ve written previously about using mod_security to block referral spam and hosts on a DNS-based RBL.  I thought it was working pretty well, until I looked at my referrers today and saw lots of hits from “FreePornVideos.bogus” (domain name & suffix altered).  I shouldn’t see this, as my mod_security rules include:

    SecRule REQUEST_HEADERS:REFERER "porn" deny,status:500

    Lots of mod_security documentation claims that matches are case-insensitive.  I should not be seeing this.  What’s going on?  I believe that the problem is that the referral matches are case-sensitive, but let’s verify that.  First, let’s try a simple referral in lower case.

    $ wget http://www.michaelwlucas.com/ --referer=porn
    --2011-01-19 10:17:32--  http://www.michaelwlucas.com/
    Resolving www.michaelwlucas.com (www.michaelwlucas.com)... 198.22.63.8
    Connecting to www.michaelwlucas.com (www.michaelwlucas.com)|198.22.63.8|:80... connected.
    HTTP request sent, awaiting response... 500 Internal Server Error
    2011-01-19 10:17:32 ERROR 500: Internal Server Error.

    That works as expected.  Now try with a capital letter:

    $ wget http://www.michaelwlucas.com/ --referer=Porn
    --2011-01-19 10:17:34--  http://www.michaelwlucas.com/
    Resolving www.michaelwlucas.com (www.michaelwlucas.com)... 198.22.63.8
    Connecting to www.michaelwlucas.com (www.michaelwlucas.com)|198.22.63.8|:80... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 10376 (10K) [text/html]
    Saving to: `index.html'

    Matches are case sensitive, despite what I read in the documentation.  Listing both Porn and porn won’t solve the problem, because that won’t protect me from pORN.

    Lesson of the day: verify you’re reading the correct documentation, and that you read what the author actually wrote.  mod_security2 uses PCRE for regular expressions. Version 1 used POSIX.  If I want case-insensitive matching, I have to declare that in my regex.  I modified the rule to read:

    SecRule REQUEST_HEADERS:REFERER "(?i:(porn))" deny,status:500

    Reload Apache. Test again with wget.  Both porn and Porn are now blocked, as well as pORN.  Petulance of the day remediated. Now back to BGP.
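
The difference between the two patterns is easy to see outside Apache. The sketch below uses Python's `re` module, which is not PCRE but accepts the same scoped `(?i:...)` syntax (Python 3.6 or later), so treat it as an illustration of the regex behavior rather than a test of mod_security itself.

```python
import re

case_sensitive = re.compile(r"porn")
case_insensitive = re.compile(r"(?i:(porn))")

for referer in ("porn", "Porn", "pORN", "http://FreePornVideos.bogus/"):
    print(referer,
          bool(case_sensitive.search(referer)),
          bool(case_insensitive.search(referer)))
```

Only the all-lowercase referer matches the first pattern, while all four match the second.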

    identifying probable intrusion vectors with flow data

    Shortly after Absolute FreeBSD came out, I worked with gpart(8) and thought “I should have put this in the book.”  Just after Cisco Routers for the Desperate went to the printer, I worked with tracking gateway availability and said “Drat!  This should have gone into the book!”  This is a recurring motif in my life.

    Now that Network Flow Analysis is out, I should have marked calendar space for “interesting flow analysis opportunity.”  If you want to know the details behind all of this, look in the book or in the flow-tools documentation.

    Someone recently penetrated a dev server I help support. I want to learn how they got access, using flow data.  I have no idea if this is realistic, but let’s go for it.  I previously made a reasonable guess about the date the host was compromised, so I know the time window to examine. I’ll attack the problem by identifying “known good” traffic, removing it from the data, and examining what remains. (This might not be the best method, but I know that a couple security and intrusion response folks read this blog, and one in particular won’t hesitate to tell me I’m fubar, so check for comments.)

    First, let’s see the traffic this host sends and receives.

    # flow-cat 2010-11-09/ft* | flow-nfilter -F ip-addr -v ADDR=189.22.36.165 | flow-print | less

    srcIP            dstIP            prot  srcPort  dstPort  octets      packets
    189.22.36.165    194.28.157.50    6     7781     80       40          1
    194.28.157.50    189.22.36.165    6     80       7781     40          1
    189.22.36.165    194.28.157.50    6     9008     80       40          1
    189.22.36.165    194.28.157.50    6     9008     80       40          1
    194.28.157.50    189.22.36.165    6     80       9008     80          2
    189.22.36.165    194.28.157.50    6     6625     80       80          2
    194.28.157.50    189.22.36.165    6     80       6625     80          2
    189.22.36.165    82.135.96.18     6     445      59423    80          2
    82.135.96.18     189.22.36.165    6     59423    445      96          2
    189.22.36.165    72.167.161.47    6     80       51428    40          1
    72.167.161.47    189.22.36.165    6     51404    21       84          2
    ...

    This machine is an Ubuntu box.  It regularly contacts random Internet sites to check for updates.  The developer also browses the Web from it.  If I’m to have any luck, I must exclude Web browsing traffic from this host.  (To the best of my knowledge, there is not yet a Web site that will automatically root any Unix-like system.  I might be wrong.)  I normally configure most filtering on the command line, but this is complicated enough that I need to write an actual filter for it.


    filter-primitive port80
    type ip-port
    permit 80

    filter-primitive victim
    type ip-address
    permit 189.22.36.165

    filter-definition victim-browsing
    invert
    match ip-source-address victim
    match ip-destination-port port80
    or
    match ip-destination-address victim
    match ip-source-port port80

    We match all traffic from the victim machine to port 80, and from port 80 to the victim machine, then invert the filter to exclude everything that matches. Add this filter to the command line and we get:

    srcIP            dstIP            prot  srcPort  dstPort  octets      packets
    189.22.36.165    82.135.96.18     6     445      59423    80          2
    82.135.96.18     189.22.36.165    6     59423    445      96          2
    189.22.36.165    72.167.161.47    6     80       51428    40          1
    72.167.161.47    189.22.36.165    6     51404    21       84          2
    72.167.161.47    189.22.36.165    6     49768    21       296         6
    189.22.36.165    72.167.161.47    6     21       49768    262         3
    72.167.161.47    189.22.36.165    6     51428    80       40          1
    ...

    Some interesting things here. This machine shouldn’t be running an SMB server, but the first two flows show that someone connected to us on port 445, we answered, and we sent a bunch of data. The developer who owns the machine probably installed Samba as a dependency of something else she installed, and never even noticed. Nobody on the outside world should be talking to this machine’s Web site, but it’s not that surprising that someone did. There’s a small FTP query next; I suspect it’s one of the innumerable FTP scanners.

    There are still 1,690 lines of this stuff; far too much to assess by eye.  Let’s trim it down by assuming this is the most common sort of intrusion.

    Generally, an intruder attacks a service on a machine. He would then send the code for the exploit or IRC bouncer to the machine through that service.  Let’s make the (uncertain and unreliable) assumption that one or the other of these is larger than 1 packet.  Most DNS transactions, pings, etc, are 1 packet, so by looking for flows larger than 1 packet we exclude this innocuous traffic.  The following primitive and filter only passes flows larger than 1 packet.

    filter-primitive gt1packet
    type counter
    permit gt 1

    filter-definition gt1packet
    match packets gt1packet

    Now add |flow-nfilter -F gt1packet to the command line and see what remains. The following immediately stands out:

    ...
    189.22.36.165    79.115.103.225   6     22       4382     3703        19
    189.22.36.165    79.115.103.225   6     22       4383     3095        11
    189.22.36.165    79.115.103.225   6     6667     4384     120         3
    189.22.36.165    79.115.103.225   6     6667     4385     120         3
    ...
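
For readers who don't have flow-tools handy, the logic of the two filters (drop the victim's Web browsing, then keep only flows larger than one packet) can be sketched in Python. This is an illustration of the filtering logic only, not a replacement for flow-nfilter; the `Flow` tuple and function names are invented, and the sample rows are copied from the output shown earlier.

```python
from collections import namedtuple

Flow = namedtuple("Flow", "src dst proto sport dport octets packets")

VICTIM = "189.22.36.165"


def is_victim_browsing(flow):
    """True for victim -> port 80 traffic or port 80 -> victim replies."""
    outbound = flow.src == VICTIM and flow.dport == 80
    inbound = flow.dst == VICTIM and flow.sport == 80
    return outbound or inbound


def interesting(flows):
    """Drop browsing traffic (the inverted filter), then keep flows > 1 packet."""
    return [f for f in flows if not is_victim_browsing(f) and f.packets > 1]


flows = [
    Flow("189.22.36.165", "194.28.157.50", 6, 7781, 80, 40, 1),
    Flow("194.28.157.50", "189.22.36.165", 6, 80, 9008, 80, 2),
    Flow("82.135.96.18", "189.22.36.165", 6, 59423, 445, 96, 2),
    Flow("72.167.161.47", "189.22.36.165", 6, 51428, 80, 40, 1),
    Flow("189.22.36.165", "79.115.103.225", 6, 22, 4382, 3703, 19),
]

for flow in interesting(flows):
    print(flow.src, flow.dst, flow.sport, flow.dport, flow.packets)
```

Note that an outside host connecting to the victim's own port 80 is not treated as browsing, which matches the filter definition: only traffic to a remote port 80, or back from one, is excluded.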

    The first port 6667 connections are to a host 79.115.103.225, a Romanian system. Let’s strip out all of the previous filters and see what traffic these two hosts have exchanged. There’s a lot of SSH traffic, more than we see from the usual brute-force guesser.

    # flow-cat 2010-11-09/ft* | flow-nfilter -F ip-addr -v ADDR=189.22.36.165 | \
       flow-nfilter -F ip-addr -v ADDR=79.115.103.225  | flow-print | less
    srcIP            dstIP            prot  srcPort  dstPort  octets      packets
    79.115.103.225   189.22.36.165    6     4381     22       371         6
    189.22.36.165    79.115.103.225   6     22       4381     394         7
    79.115.103.225   189.22.36.165    6     4383     22       1984        14
    189.22.36.165    79.115.103.225   6     22       4382     3703        19
    189.22.36.165    79.115.103.225   6     22       4383     3095        11
    189.22.36.165    79.115.103.225   6     6667     4384     120         3
    189.22.36.165    79.115.103.225   6     6667     4385     120         3
    79.115.103.225   189.22.36.165    6     4384     6667     192         3
    79.115.103.225   189.22.36.165    6     4382     22       11804       118
    79.115.103.225   189.22.36.165    6     4385     6667     192         3
    189.22.36.165    79.115.103.225   6     22       4382     12688       103
    79.115.103.225   189.22.36.165    6     4382     22       1664        19
    79.115.103.225   189.22.36.165    6     4382     22       5564        64
    189.22.36.165    79.115.103.225   6     22       4382     9708        50
    79.115.103.225   189.22.36.165    6     4382     22       14956       169
    189.22.36.165    79.115.103.225   6     22       4382     16060       129
    79.115.103.225   189.22.36.165    6     4382     22       1040        12
    189.22.36.165    79.115.103.225   6     22       4382     928         8
    189.22.36.165    79.115.103.225   6     8888     4470     120         3
    79.115.103.225   189.22.36.165    6     4470     8888     192         3
    79.115.103.225   189.22.36.165    6     4382     22       4316        49
    189.22.36.165    79.115.103.225   6     22       4382     11344       42
    79.115.103.225   189.22.36.165    6     4382     22       1924        23
    189.22.36.165    79.115.103.225   6     22       4382     8800        20
    ...
    

    Using flow-print -f 5, I can view the timestamps and verify that the IRC activity started shortly after the SSH activity started using larger amounts of bandwidth.

    Can I be certain that 79.115.103.225 is my attacker? No. Is this activity suspicious? Absolutely. I can examine the hacked machine, or a disk image thereof, and identify the account used to penetrate the machine.

    This is not proof, but it’s a place to start. In assessing the rest of the data, I can now exclude this host. This will further reduce the pool of data I am assessing.

    While I can’t use this as grounds for flying to Romania with body armor, a machine gun, and a machete, I can realistically act on this information. I can report the activity to the IP address owner. I can check my network for other connections from this host, and verify the integrity of any machines it’s connected to. I can use this as a part of my business case to firewall off this part of the network. It will support my argument to forbid passwords for SSH connections on dev machines.

    In retrospect, I could have made other assumptions that might have let me find this more quickly, e.g., I could have investigated the first hosts contacted on the questionable ports. But every puzzle is easy once you’ve solved it. After this, I’d have to say that backtracking intrusion vectors through flow data is very practical, even when you don’t have much experience.