dating an intrusion with flow data

One of my Ubuntu dev boxes was broken into. While the box isn’t vital, I’ll still need to reinstall an operating system and set it back up for the developer. I want to know where the attack came from and what the intruder did.  I cannot trust the logs on the system, but I can trust the flow data from our upstream router.

I’ve changed my IP addresses, but remote addresses are left unchanged. Here I examine my flow data from 1 January 2011, and remove IP addresses I expect to contact this machine.


# flow-cat ft* | flow-nfilter -F ip-addr -v ADDR=189.22.36.165 | flow-print | grep -v mgmt.ip.ad.dr | grep -v dev.ip.add.dr | less
srcIP            dstIP            prot  srcPort  dstPort  octets      packets
189.22.36.165    208.83.20.130    6     60702    6667     198         3
189.22.36.165    208.83.20.130    6     60703    6667     196         3
208.83.20.130    189.22.36.165    6     6667     60702    152         3
208.83.20.130    189.22.36.165    6     6667     60703    152         3
...

So, what do I learn here? This system was compromised on or before New Year’s Day. 208.83.20.130 is an IRC server, and 6667 is an IRC port.  Someone is using my system to play IRC games. Bastards. Other checks show that the intruders are also using port 7000.

I don’t know exactly when the system was compromised.  Fortunately, I have my old flow records.  I go back and check the first of each previous month, narrowing down the time window.  1 December looks like 1 January, but 1 November looks different:

srcIP            dstIP            prot  srcPort  dstPort  octets      packets
189.22.36.165    189.22.37.222    17    123      123      76          1
189.22.36.222    189.22.36.165    17    123      123      76          1
206.80.36.88     189.22.36.165    17    65015    5060     368         1
189.22.36.165    206.80.36.88     1     0        771      396         1
129.82.138.38    189.22.36.165    1     0        2048     28          1
...

The port 123 UDP traffic is NTP.  And someone poked at me with a SIP client, but we didn’t answer. This is about what I’d expect to see on a machine sitting naked on the Internet.
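The two protocol 1 (ICMP) rows are easier to read once the dstPort column is decoded. As far as I know, flow-print follows the usual NetFlow convention of packing the ICMP type and code into that field as type * 256 + code. A minimal sketch under that assumption; icmp_decode is a made-up helper name, not part of flow-tools:

```shell
#!/bin/sh
# icmp_decode VALUE: split a flow-print dstPort value from an ICMP flow
# into its type and code, assuming the type * 256 + code packing.
icmp_decode() {
    echo "type=$(( $1 / 256 )) code=$(( $1 % 256 ))"
}

icmp_decode 771     # the reply sent to the SIP prober
icmp_decode 2048    # the packet from 129.82.138.38
```

Under that reading, 771 is type 3, code 3 (destination unreachable, port unreachable), which matches "we didn't answer" the SIP probe, and 2048 is type 8, code 0, an ordinary echo request.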

Next, search to narrow down the time window. When I find the first day the IRC server traffic appears, I know when to start looking for the actual intrusion activity. When was the first day that port 6667 and 7000 traffic appeared?  It was present on 1 December, but not 1 November.  Check November 15: present, November 7: present, etc, etc.  Eventually, I see the traffic is present on 9 November, but not on 8 November.


# cd /var/db/flows/rtr8/2010/2010-11
# flow-cat 2010-11-09/ft* | flow-nfilter -F ip-addr -v ADDR=189.22.36.165 | flow-nfilter -F not-ip-port -v PORT=80 | flow-nfilter -F not-ip-port -v PORT=53 | flow-nfilter -F not-ip-port -v PORT=123 | flow-print | grep -c 7000
947

# flow-cat 2010-11-08/ft* | flow-nfilter -F ip-addr -v ADDR=189.22.36.165 | flow-nfilter -F not-ip-port -v PORT=80 | flow-nfilter -F not-ip-port -v PORT=53 | flow-nfilter -F not-ip-port -v PORT=123 | flow-print | grep -c 7000
0

The intrusion happened on or before 9 November 2010.
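The narrowing-down above is a bisection, and it can be scripted. This is a minimal sketch, assuming the traffic is absent on every day before some date and present on every day from then on. The names first_day_with and has_irc are hypothetical, not part of flow-tools; in real use the predicate would wrap the flow-cat pipeline shown above.

```shell
#!/bin/sh
# has_irc DAY (hypothetical predicate): succeed if DAY's flows show port 7000.
# A real version might look like:
#   has_irc() {
#       [ "$(flow-cat "$1"/ft* | flow-nfilter -F ip-addr -v ADDR=189.22.36.165 \
#            | flow-print | grep -c 7000)" -gt 0 ]
#   }

# first_day_with PREDICATE DAY...
# DAYs must be in chronological order, and PREDICATE must be true on the
# last one. Prints the earliest DAY for which PREDICATE succeeds.
first_day_with() {
    pred=$1; shift
    lo=1; hi=$#
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "day=\${$mid}"          # fetch the mid-th positional argument
        if "$pred" "$day"; then
            hi=$mid                  # present: first appearance is here or earlier
        else
            lo=$(( mid + 1 ))        # absent: first appearance is later
        fi
    done
    eval "echo \${$lo}"
}
```

Run from the month's directory, something like first_day_with has_irc 2010-11-* would need about five pipeline runs for a 30-day month instead of thirty.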

Next I will examine the traffic for 8 and 9 November and see if I can determine where the intruders came from and their attack vector. I haven’t done that analysis yet, so who knows what I’ll find, if anything, but I’ll post on my efforts one way or another.

UPDATE: Oh, right, I’m an author. While I shouldn’t blatantly pimp myself out here, when I do an on-book-topic post, I should at least say “Hey, if you want to do this too, you can learn how by reading my newest book.” Sheesh. Being non-commercial is one thing, being actively daft is another.

mod_security rule upgrades and logging

I recently installed mod_security2 on my personal Web server to block out the most annoying referral spam. It blocked the worst offenders.  Then I found that mod_security also included the ability to block all access from sites in a DNS-based RBL. This would further reduce my comment and referral spam problems, at the cost of making the site slightly slower.  There are also rules to block SQL injection attacks and other known attack vectors.  If you’re trying to read this from a machine on DNS blacklists, stop reading and go get yourself off the blacklist.

The RBL rules aren’t in the base FreeBSD package. They’re in the newer mod_security2 ruleset, available from mod_security’s Sourceforge download page. Get the newest file.

Move your existing rules to a safe place, and put the new rules where the old rules were.  Do not delete your old rules; you’ll want them for reference.  In my case, the active rules directory is /usr/local/etc/apache22/Includes/mod_security2.  I moved the existing directory to /usr/local/etc/apache22/Includes/old-mod_security2, created a new mod_security2 directory, and unzipped the rules there.

The rules directory contains an example rule file, modsecurity_crs_10_config.conf.example.  Apache will read any file that ends in .conf as a config file, so copy (not move) that example to modsecurity_crs_10_config.conf.  Edit that file to include changes from the original setup, e.g.:

SecRuleEngine On
SecDataDir /var/run/modsecurity

Copy your referer.conf referrer blacklist into the new rules directory.  Then reload Apache.  If Apache won’t restart, read the error messages and correct them.

Now that you have the base rules upgraded, you can add rules from the optional_rules directory.  I specifically want the comment spam blocking, so I copied modsecurity_crs_42_comment_spam.conf to the main directory and reloaded Apache.

Then I used wget to test my work, using one of the less offensive referral spam sites as a referrer. (I’ve changed the name of the site to avoid giving them any more links.)

avarice/tmp$ wget http://www.michaelwlucas.com/ --referer=http://www.fishingscum.com

--2011-01-04 17:04:06--  http://www.michaelwlucas.com/
Resolving www.michaelwlucas.com (www.michaelwlucas.com)... 198.22.63.8
Connecting to www.michaelwlucas.com (www.michaelwlucas.com)|198.22.63.8|:80... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2011-01-04 17:04:06 ERROR 500: Internal Server Error.

That’s what I want.

As I had to take the time to upgrade, I wanted to also get a log of what hits I was blocking.  This only took adding two lines to the configuration:

SecDebugLogLevel 1
SecDebugLog /var/log/modsecurity.log

My wget request generated this log entry:

[04/Jan/2011:17:04:06 --0500] [www.michaelwlucas.com/sid#801948060][rid#801aa20a0][/][1] Access denied with code 500 (phase 2). Pattern match "fishingscum" at REQUEST_HEADERS:Referer. [file "/usr/local/etc/apache22/Includes/mod_security2/referer.conf"] [line "35"]
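Once the log fills up, a one-line summary of which patterns are doing the blocking is handy. This is a rough sketch that assumes every entry of interest has the same 'Pattern match "..." at REQUEST_HEADERS:Referer' shape as the line above; blocked_patterns is a made-up name.

```shell
#!/bin/sh
# blocked_patterns [FILE...]: count how often each referrer pattern matched.
# Reads the named log files, or standard input if none are given.
blocked_patterns() {
    sed -n 's/.*Pattern match "\([^"]*\)" at REQUEST_HEADERS:Referer.*/\1/p' "$@" |
        sort | uniq -c | sort -rn
}

# Example: blocked_patterns /var/log/modsecurity.log
```

The output is one line per pattern, most frequent first, with the hit count in the first column.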

Setting SecDebugLogLevel to 2 gave me details on how mod_security2 processed its rules.  That will be useful if I ever have to write my own mod_security2 rules.  I suspect that if I have to do that, though, I’m solving the wrong problem.

One interesting thing I saw here was how the log keyword in mod_security2 rules is applied.  If you use the log keyword in a rule, a log message appears in the standard Apache access and error logs as well as the mod_security2 log.  If you do not use the log keyword, a message appears in the mod_security2 log but not in the Apache logs.  An anti-referral-spam rule should look like this:

SecRule REQUEST_HEADERS:REFERER "ezinearticles" deny,status:500

24 hours later, WordPress shows only 5 comments in my anti-spam queue.  Another annoyance quashed.

UPDATE: More here.

TechChannel interview published

The video interview I did last month is now available on-line.  It’s about NetFlow, and is based on the Network Flow Analysis book.

I can’t bring myself to watch it.

(Two posts in one day.  This can’t be good.)

UPDATE: No, it’s not good. Apparently, WordPress doesn’t show the links on the front page, even though it shows the complete article. You must click to the individual article to see the link to the interview. I’m sure there’s a perfectly good reason WP behaves this way, but it still feels bogus.

RANCID, Mikrotik, and SSH

I’m a big fan of RANCID.  While RANCID is best known as a management tool for automatically backing up Cisco configs, it also supports much other hardware, and is fairly easily extensible.  I’m responsible for several Mikrotik routers, and need to back up their configurations.  People have written scripts for Mikrotik support in RANCID… but they don’t work with SSH, only telnet.  And they don’t work if you run SSH on an unusual port.

After trials, errors, advice from Chris Falz, and more errors and trials, I found that the following RANCID configuration works.

add password YourRouter YourPasswordHere
add user YourRouter YourUsername+ct
add method YourRouter ssh
add sshcmd YourRouter {/usr/local/scripts/microtiklogin.sh}
add noenable YourRouter {1}

Adding +ct to your username turns off color.  Setting an SSH port in RANCID’s usual way didn’t work with the third-party mtlogin script, and the sshcmd variable doesn’t cope with spaces well, so I used an external SSH command script.  This script is just:

#!/bin/sh
exec ssh -p PortNumber "$@"

My Mikrotik configs are now automatically backed up over SSH.

If you’re looking for a good Perl project, fixing the actual underlying mtlogin and mtrancid SSH functions would be appreciated.

Ubuntu server 10.04 LTS diskless filesystem

A diskless server needs a copy of the operating system files, served from an NFS server.  The Ubuntu docs have a general-purpose tutorial on diskless systems, which suggests copying the files from your NFS server.  My NFS servers are not Ubuntu boxes.  Also, I don’t want to copy from a live system; too many things can happen.  I want a set of Ubuntu server files that I can use to deploy a functional server in a known good state, that complies with the requirements of my environment.  And I need to script it, so I can boot and update my “golden image” server and easily reproduce the same file set. And I want all the routine changes taken care of automatically.

This problem isn’t hard, but I’ve spent a fair amount of time building and rebuilding diskless systems lately, so you get to hear about it.

Install an actual Ubuntu system.  I prefer to install on a virtual machine.  This will become your “golden image.”  When the Ubuntu installer asks for a machine profile, choose OpenSSH server.

  • apt-get update && apt-get upgrade
  • Install and configure required software, such as emacs and tcsh.
  • Install portmap and nfs-common.
  • Install and configure LDAP auth and sudo against LDAP
  • Install and configure ufw.  I’ve seen many attacks against Ubuntu boxes lately, and highly recommend very restrictive firewall rules.  Do not let the world talk to your Ubuntu servers!
  • Make a VM snapshot of your base image, so you can revert to this core functionality
  • Install anything else required to make this a nice clean template for the purpose of this server.

Now NFS-mount a directory from another server on the clean server’s /mnt and tar up the server.

# cd /
# tar -cvpf /mnt/ubuntu1004.tar --one-file-system .

Wait.

The resulting tarball has a few problems.  I don’t want the diskless hosts to all have the same SSH keys, so those files need to be removed. Ubuntu caches the MAC address of attached NICs to maintain consistent interface names across reboots, and this cached MAC address will be wrong for the diskless machine. The existing interface configuration will not work on a diskless machine (see below).  The machine will get its hostname from DHCP rather than from a file, so /etc/hostname has to go.  Finally, the fstab is wrong for any diskless machine.  I therefore remove the troublesome files from the tarball.

# tar --delete -f /mnt/ubuntu1004.tar ./etc/ssh/ssh_host_rsa_key ./etc/ssh/ssh_host_rsa_key.pub ./etc/ssh/ssh_host_dsa_key ./etc/ssh/ssh_host_dsa_key.pub ./etc/udev/rules.d/70-persistent-net.rules ./etc/fstab ./etc/network/interfaces ./etc/hostname
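It's worth confirming that the deletions actually took before deploying the tarball. This is a small sketch that lists the archive and reports any of the named members still in it; leftover_members is a hypothetical helper, and it assumes member names are spelled exactly as they were archived (with the leading ./ here).

```shell
#!/bin/sh
# leftover_members ARCHIVE MEMBER...
# Print each MEMBER that is still present in ARCHIVE; print nothing if all
# of them are gone.
leftover_members() {
    archive=$1; shift
    for member in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string
        if tar -tf "$archive" | grep -qxF -- "$member"; then
            echo "$member"
        fi
    done
}

# Example:
# leftover_members /mnt/ubuntu1004.tar ./etc/fstab ./etc/hostname
```

Empty output means none of the checked files remain in the archive.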


The difficult file is /etc/network/interfaces.  I don’t want to use the server’s network configuration.  My test server boots from either DHCP or with a static IP, and neither will work for a diskless server.  A diskless server needs an /etc/network/interfaces like this:

auto lo
iface lo inet loopback
auto eth0
iface eth0 inet manual

I want to replace the existing ./etc/network/interfaces with one of my own choosing.  Tar won’t let you replace a file in an existing archive, but it will let you add another file of the same name.  I change to a config directory and add this file to my tarball.  Similarly, I need a blank etc/fstab.  I create a fake etc directory in another location, touch etc/fstab, and create a suitable etc/network/interfaces.

# tar --append -f /mnt/ubuntu1004.tar etc/network/interfaces etc/fstab
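The fake root that the append command runs from can be built by a few lines of shell, which keeps it reproducible along with everything else. This is a minimal sketch: make_fakeroot is a made-up name, and the interfaces content is the four-line file shown above.

```shell
#!/bin/sh
# make_fakeroot DIR: create DIR/etc/fstab (empty) and a diskless-friendly
# DIR/etc/network/interfaces, ready for "tar --append" from inside DIR.
make_fakeroot() {
    root=$1
    mkdir -p "$root/etc/network" || return 1
    : > "$root/etc/fstab"            # blank fstab; the root is mounted over NFS
    cat > "$root/etc/network/interfaces" <<'EOF'
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet manual
EOF
}

# Example: make_fakeroot /home/mwlucas/fakeroot
```

The append is then run from inside that directory so the member names come out as etc/network/interfaces and etc/fstab.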

To use this file, log into the NFS server, go to the mount point for the diskless system, and run:

# tar -xpf /path/ubuntu1004.tar

The machine will then boot.  It is easily cloned and built to my standards, and the only customization needed is to run dpkg-reconfigure openssh-server, which generates new SSH host keys.

As I installed on a virtual server I can snapshot the golden image and build custom filesystems for different purposes.

Lots of long commands?  Yep.  This basically screams “8-line shell script, please.”  It’s a pretty trivial script, but if you’ve made it this far, you’re either interested in what I’m doing or astonished at my inanity.  In either case, you should get the script too.

#!/bin/sh

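# Mount the NFS export that will hold the tarball
mount nfs1:/tmpmount /mnt
cd /
# Archive the root filesystem only, without crossing into other mounts
tar -cvpf /mnt/ubuntu1004.tar --one-file-system .

# Strip the files that must differ on each diskless host
tar --delete -vf /mnt/ubuntu1004.tar ./etc/ssh/ssh_host_rsa_key ./etc/ssh/ssh_host_rsa_key.pub ./etc/ssh/ssh_host_dsa_key ./etc/ssh/ssh_host_dsa_key.pub ./etc/udev/rules.d/70-persistent-net.rules ./etc/fstab ./etc/network/interfaces ./etc/hostname

# Add the replacement interfaces file and the empty fstab
cd /home/mwlucas/fakeroot
tar --append -f /mnt/ubuntu1004.tar etc/network/interfaces etc/fstab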

Yes, this shell script is a good example of fault-oblivious computing. But it suits my minimal needs, and performs the same task the same way every time.

“Page Cannot Be Displayed” and Internet Explorer

I detest this IE error message, especially when a user calls to complain that a Web site is down. Internet Explorer deliberately hides actual HTTP error messages on the grounds that the Web offers unfriendly but useful error messages.  Apparently this generic message is much less likely to cause the user to flee in terror from insanity-inducing text such as “404 – Page Not Found.”  They effectively shift the induced insanity from the end user to the sysadmin.

There’s a way to turn off this generic friendly message and replace it with the actual error.  It’s under Tools -> Internet Options -> Advanced -> Browsing -> Show friendly HTTP error messages.  Uncheck this and restart the browser to get user-hostile but troubleshooting-friendly error messages.

Every time I need this, I have to scramble to find it.  Perhaps now that I’ve documented this, I’ll remember where it is.  But I doubt it.

On an unrelated note:  tomorrow is the Thanksgiving holiday in the US.  I’d like to remind my readers that the holiday buffet is not a challenge, and that leaving food uneaten is not a threat to your masculinity (or femininity, or whatever).

Firewalling diskless Ubuntu

I have diskless Ubuntu 10.04 servers sitting naked on the Internet.  They’re for internal use only, but I don’t have a firewall in that facility, so any firewalling must be done on the host itself.  Ubuntu includes UFW, the “uncomplicated firewall,” a front end to iptables.  I don’t know how anything can claim to make iptables uncomplicated, but I suppose nobody would use the tool if they called it “less appalling firewall.”

These servers need to be able to contact the Internet, to get updates and such, but nobody except myself and my coworkers needs to access these servers. The coworkers and I only come from a limited range of IP addresses.

On a disk-based server, I would define rules in UFW and then run ufw default deny incoming, much like this:

# ufw enable
# ufw allow from 10.0.1.0/24
# ufw allow from 172.16.5.0/24
# ufw default deny

If you do this on a diskless Ubuntu server, the system loses its disk, even if you have a rule that specifically permits access to the NFS server. The obvious thing to try is to rip out the “default deny” and replace it with a rule to block unwanted traffic at the end.

# ufw deny from 0.0.0.0/0

Your resulting rules look like this:

# ufw status
Status: active

To                         Action      From
--                         ------      ----
Anywhere                   ALLOW       10.0.1.0/24
Anywhere                   ALLOW       172.16.5.0/24
Anywhere                   DENY        Anywhere

This looks like it should work.  I attempt to connect to the SSH server from an IP not in the permitted list, however, and can connect.  It’s not blocking traffic from denied hosts.  Huh?

Go to the file that contains the user rules, /lib/ufw/user.rules.  This is actually a script to feed to iptables. There are several lines like this, one for each block of management addresses:

### tuple ### allow any any 0.0.0.0/0 any 10.0.1.0/24 in
-A ufw-user-input -s 10.0.1.0/24 -j ACCEPT

My last rule, however, looks different.

### tuple ### deny any any 0.0.0.0/0 any 0.0.0.0/0 in
-A ufw-user-input -j DROP

The “all other IP addresses” is probably implied in that last rule, but… it really couldn’t be that simple, could it?  I edit the script to explicitly specify the source IP addresses:

-A ufw-user-input -s 0.0.0.0/0 -j DROP

and reboot.

And yes, it is that simple.  The firewall comes up at boot.  ufw status displays exactly the same rules as before.  But now, I can only connect from my management IP addresses.
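Since ufw status shows the same output either way, the only way to spot this is to look at the generated rules file. This is a small sketch that lists ufw-user-input rules with no explicit -s source; rules_without_source is a hypothetical name, and the check is a plain text match on the file format shown above, nothing that ufw itself provides.

```shell
#!/bin/sh
# rules_without_source [FILE...]
# Print every "-A ufw-user-input" line that lacks a "-s" source match.
# Reads the named files, or standard input if none are given.
rules_without_source() {
    grep -e '^-A ufw-user-input ' "$@" | grep -v -e ' -s '
}

# Example: rules_without_source /lib/ufw/user.rules
```

Any line it prints is a candidate for the same hand edit as the DROP rule above.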

The problem with tools that make things “uncomplicated” is that rather than removing the underlying complexity, they hide it. I probably need to break down and learn iptables, but I think I’d rather figure out how to get these hosts behind a PF box.

mod_security on FreeBSD

The constant stream of referrer spam isn’t sufficiently annoying; no, now worms constantly nibble at my WordPress install.  I could avoid worrying about this by, say, having a third party host my content and control my work, but if I did that I’d get a punch on both my geek card and my writer card.  And I still wouldn’t know who is linking to me.  Some of the referral spam I get hits 10-15 times a day, flooding actual links.

Fortunately, Apache’s mod_security can help lock down my server.  While you’ll find tutorials on using mod_security to stop referrer spam, mod_security can do much more.  Here I’m installing mod_security on my FreeBSD server running Apache 2.2.

# cd /usr/ports/www/mod_security
# make all install clean

Look in /usr/local/etc/apache22/Includes afterwards.  You’ll find the file mod_security2.conf and the directory mod_security2.  Initially, mod_security is loaded into Apache but doesn’t block anything.  Go into the mod_security2 directory and edit the main config file, modsecurity_crs_10_config.conf.  Change the SecRuleEngine to On, and create a SecDataDir, like so:

SecRuleEngine On
SecDataDir /var/run/modsecurity

You’ll need to create the security data directory and make it writable by Apache.  Then restart Apache.

# mkdir /var/run/modsecurity
# chown www:www /var/run/modsecurity
# apachectl restart

Now test your Web server, and verify that it still functions.  Bad Web applications can trip over mod_security2.  If your Web app fails, I’d suggest talking to the vendor about why your application doesn’t work securely.

If your site still works with mod_security2, you can start to block referrers that bug you.  In the mod_security2 directory, create the file referer.conf for rules to block bogus referrers.  The rule has this general syntax:

SecRule REQUEST_HEADERS:REFERER "REGEX" deny,log,status:500

mod_security evaluates the headers of each incoming request.  If the referrer matches the regular expression in quotes, the server returns a 500 error.  The sample rules below show a small slice of the things I’m blocking.


SecRule REQUEST_HEADERS:REFERER "write\-a\-resume" deny,log,status:500
SecRule REQUEST_HEADERS:REFERER "wigmall" deny,log,status:500
SecRule REQUEST_HEADERS:REFERER "windowsphone" deny,log,status:500
SecRule REQUEST_HEADERS:REFERER "windows\-phone" deny,log,status:500
SecRule REQUEST_HEADERS:REFERER "zune" deny,log,status:500

It’s possible that this would block legitimate traffic, but I have a hard time imagining being linked from a weight loss or Windows Phone site.  It’ll take a while to accumulate a list of suitable regexes for my site.  And it’s a limited technique — I’m enumerating badness. But mod_security also protects me against the various WordPress worms, and it can also block traffic from addresses on an RBL. I’ll do that at a later date.
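As the list of regexes grows, it may be easier to keep the bare patterns in a text file and generate referer.conf from it. This is a minimal sketch, assuming one regex per line with blank lines and #-comments ignored; mkrules and the patterns file are made-up names. It leaves out the log keyword, in line with the update below.

```shell
#!/bin/sh
# mkrules: read one referrer regex per line on standard input and print a
# SecRule line for each. Blank lines and lines starting with # are skipped.
mkrules() {
    while IFS= read -r pat; do
        case $pat in
            ''|'#'*) continue ;;
        esac
        printf 'SecRule REQUEST_HEADERS:REFERER "%s" deny,status:500\n' "$pat"
    done
}

# Example: mkrules < referer-patterns.txt > referer.conf
```

Because read -r and printf %s leave backslashes alone, escaped patterns such as write\-a\-resume pass through unchanged.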

UPDATE: Your SecRule should not include the “log” keyword. See the later posting here.

UPDATE2: more here.

fixing ESXi “failed with error N7Vmacore15SystemExceptionE”

An ESXi server failed this morning.  As there’s a couple critical services on this piece of hardware, the power in the new data center isn’t up to where we want it yet, and the radio said it was snowing near the office, I drove in expecting to find some unspeakable power situation.  The power was fine, but the ESXi server was sitting at a panic screen.  Power cycle the machine.  It comes up, but none of the VMs start.  The vSphere client won’t connect.  The server Web page is blank.

Fortunately, tech support mode works.  Hit alt-F1, type unsupported, and enter the root password when asked.  Whenever I tried to connect to the server with vSphere, my “tail -f /var/log/messages” said something like:

Nov  4 23:35:09 Hostd: [2010-11-04 23:35:09.117 25233B90 warning 'Proxysvc Req00011'] 
Error reading from client while waiting for header: 
N7Vmacore15SystemExceptionE(Connection reset by peer)

This is not good.  No, not good at all.  I wanted to spend the day converting a machine from OpenSolaris to FreeBSD and installing my router for my new bandwidth.  Instead Fate has decreed today Wedgie Day.

Mailing list archives and forum posts showed that many people have had this problem.  Lots of the forums end with “did anyone ever solve this?”  A few people reinstalled ESXi to solve the problem.  A couple folks claimed it was a DNS issue.

Our DNS setup hadn’t changed, but I followed the advice and made the following changes.

  • In /etc/hosts, remove the real address for the machine and replace it with 127.0.0.1
  • Remove all DNS servers from /etc/resolv.conf

I rebooted.  The machine came up, and the VMs started.  Everything seems fine, but we’ll have to see what happens later.

I have no idea why this worked.  Three cheers for “occult IT”!  Sigh.