Tech – Page 15 – Michael W Lucas

January 11, 2011

identifying probable intrusion vectors with flow data

Shortly after Absolute FreeBSD came out, I worked with gpart(8) and thought “I should have put this in the book.” Just after Cisco Routers for the Desperate went to the printer, I worked with tracking gateway availability and said “Drat! This should have gone into the book!” This is a recurring motif in my life.

Now that Network Flow Analysis is out, I should have marked calendar space for “interesting flow analysis opportunity.” If you want to know the details behind all of this, look in the book or in the flow-tools documentation.

Someone recently penetrated a dev server I help support. I want to learn how they got access, using flow data. I have no idea if this is realistic, but let’s go for it. I previously made a reasonable guess about the date the host was compromised, so I know the time window to examine. I’ll attack the problem by identifying “known good” traffic, removing it from the data, and examining what remains. (This might not be the best method, but I know that a couple security and intrusion response folks read this blog, and one in particular won’t hesitate to tell me I’m fubar, so check for comments.)

First, let’s see the traffic this host sends and receives.

# flow-cat 2010-11-09/ft* | flow-nfilter -F ip-addr -v ADDR=189.22.36.165 | flow-print | less

srcIP            dstIP            prot  srcPort  dstPort  octets      packets
189.22.36.165    194.28.157.50    6     7781     80       40          1
194.28.157.50    189.22.36.165    6     80       7781     40          1
189.22.36.165    194.28.157.50    6     9008     80       40          1
189.22.36.165    194.28.157.50    6     9008     80       40          1
194.28.157.50    189.22.36.165    6     80       9008     80          2
189.22.36.165    194.28.157.50    6     6625     80       80          2
194.28.157.50    189.22.36.165    6     80       6625     80          2
189.22.36.165    82.135.96.18     6     445      59423    80          2
82.135.96.18     189.22.36.165    6     59423    445      96          2
189.22.36.165    72.167.161.47    6     80       51428    40          1
72.167.161.47    189.22.36.165    6     51404    21       84          2
...

This machine is an Ubuntu box. It regularly contacts random Internet sites to check for updates. The developer also browses the Web from it. If I’m to have any luck, I must exclude Web browsing traffic from this host. (To the best of my knowledge, there is not yet a Web site that will automatically root any Unix-like system. I might be wrong.) I normally configure most filtering on the command line, but this is complicated enough that I need to write an actual filter for it.

filter-primitive port80
  type ip-port
  permit 80

filter-primitive victim
  type ip-address
  permit 189.22.36.165

filter-definition victim-browsing
  invert
  match ip-source-address victim
  match ip-destination-port port80
  or
  match ip-destination-address victim
  match ip-source-port port80

We match all traffic from the victim machine to port 80, and from port 80 to the victim machine, then invert the filter to exclude everything that matches. Add this filter to the command line and we get:

srcIP            dstIP            prot  srcPort  dstPort  octets      packets
189.22.36.165    82.135.96.18     6     445      59423    80          2
82.135.96.18     189.22.36.165    6     59423    445      96          2
189.22.36.165    72.167.161.47    6     80       51428    40          1
72.167.161.47    189.22.36.165    6     51404    21       84          2
72.167.161.47    189.22.36.165    6     49768    21       296         6
189.22.36.165    72.167.161.47    6     21       49768    262         3
72.167.161.47    189.22.36.165    6     51428    80       40          1
...
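
The same exclusion logic can be sketched outside flow-tools, which is handy for checking that the filter does what I think it does. This is only an illustration in Python, assuming flow-print's default seven-column output as shown above. The names Flow, parse_flows, and is_victim_browsing are made up for this example and are not part of flow-tools.

```python
from typing import NamedTuple

VICTIM = "189.22.36.165"


class Flow(NamedTuple):
    src_ip: str
    dst_ip: str
    prot: int
    src_port: int
    dst_port: int
    octets: int
    packets: int


def parse_flows(text):
    """Parse flow-print style rows, skipping the header, '...' and blank lines."""
    flows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7 or fields[0] == "srcIP":
            continue
        flows.append(Flow(fields[0], fields[1], *map(int, fields[2:])))
    return flows


def is_victim_browsing(flow, victim=VICTIM):
    """True for victim -> port 80 requests and port 80 -> victim replies."""
    outbound = flow.src_ip == victim and flow.dst_port == 80
    inbound = flow.dst_ip == victim and flow.src_port == 80
    return outbound or inbound


SAMPLE = """\
srcIP dstIP prot srcPort dstPort octets packets
189.22.36.165 194.28.157.50 6 7781 80 40 1
194.28.157.50 189.22.36.165 6 80 7781 40 1
189.22.36.165 82.135.96.18 6 445 59423 80 2
72.167.161.47 189.22.36.165 6 51428 80 40 1
"""

# Inverting the match keeps only the flows that are NOT the victim's browsing.
remaining = [f for f in parse_flows(SAMPLE) if not is_victim_browsing(f)]
for f in remaining:
    print(*f)
```

Note that the last sample flow survives: it is traffic to port 80 on the victim, not from the victim to someone else's port 80, so the browsing filter leaves it alone.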

Some interesting things here. This machine shouldn’t be running an SMB server, but the first two flows show that someone connected to us on port 445, we answered, and we sent a bunch of data. The developer who owns the machine probably installed Samba as a dependency of something else she installed, and never even noticed. Nobody in the outside world should be talking to this machine’s Web site, but it’s not that surprising that someone did. There’s a small FTP query next; I suspect it’s one of the innumerable FTP scanners.

There are still 1,690 lines of this stuff, far too much to assess by eye. Let’s trim it down by assuming this is the most common sort of intrusion.

Generally, an intruder attacks a service on a machine. He would then send the code for the exploit or IRC bouncer to the machine through that service. Let’s make the (uncertain and unreliable) assumption that one or the other of these is larger than 1 packet. Most DNS transactions, pings, etc., are 1 packet, so by looking for flows larger than 1 packet we exclude this innocuous traffic. The following primitive and filter pass only flows larger than 1 packet.

filter-primitive gt1packet
  type counter
  permit gt 1

filter-definition gt1packet
  match packets gt1packet

Now add |flow-nfilter -F gt1packet to the command line and see what remains. The following immediately stands out:

...
189.22.36.165    79.115.103.225   6     22       4382     3703        19
189.22.36.165    79.115.103.225   6     22       4383     3095        11
189.22.36.165    79.115.103.225   6     6667     4384     120         3
189.22.36.165    79.115.103.225   6     6667     4385     120         3
...

The first port 6667 connections are to a host 79.115.103.225, a Romanian system. Let’s strip out all of the previous filters and see what traffic these two hosts have exchanged. There’s a lot of SSH traffic, more than we see from the usual brute-force guesser.

# flow-cat 2010-11-09/ft* | flow-nfilter -F ip-addr -v ADDR=189.22.36.165 | \
   flow-nfilter -F ip-addr -v ADDR=79.115.103.225  | flow-print | less
srcIP            dstIP            prot  srcPort  dstPort  octets      packets
79.115.103.225   189.22.36.165    6     4381     22       371         6
189.22.36.165    79.115.103.225   6     22       4381     394         7
79.115.103.225   189.22.36.165    6     4383     22       1984        14
189.22.36.165    79.115.103.225   6     22       4382     3703        19
189.22.36.165    79.115.103.225   6     22       4383     3095        11
189.22.36.165    79.115.103.225   6     6667     4384     120         3
189.22.36.165    79.115.103.225   6     6667     4385     120         3
79.115.103.225   189.22.36.165    6     4384     6667     192         3
79.115.103.225   189.22.36.165    6     4382     22       11804       118
79.115.103.225   189.22.36.165    6     4385     6667     192         3
189.22.36.165    79.115.103.225   6     22       4382     12688       103
79.115.103.225   189.22.36.165    6     4382     22       1664        19
79.115.103.225   189.22.36.165    6     4382     22       5564        64
189.22.36.165    79.115.103.225   6     22       4382     9708        50
79.115.103.225   189.22.36.165    6     4382     22       14956       169
189.22.36.165    79.115.103.225   6     22       4382     16060       129
79.115.103.225   189.22.36.165    6     4382     22       1040        12
189.22.36.165    79.115.103.225   6     22       4382     928         8
189.22.36.165    79.115.103.225   6     8888     4470     120         3
79.115.103.225   189.22.36.165    6     4470     8888     192         3
79.115.103.225   189.22.36.165    6     4382     22       4316        49
189.22.36.165    79.115.103.225   6     22       4382     11344       42
79.115.103.225   189.22.36.165    6     4382     22       1924        23
189.22.36.165    79.115.103.225   6     22       4382     8800        20
...
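
To put a number on "a lot of SSH traffic," the octets can be totaled per port on the victim's side of each flow. The Python below is a rough sketch, not a flow-tools feature: it assumes the seven-column flow-print output shown above, uses a handful of rows copied from that listing, and octets_by_victim_port is a name invented for this example. (flow-tools has its own reporting tools; this just shows the arithmetic.)

```python
from collections import defaultdict

VICTIM = "189.22.36.165"

# A few rows copied from the flow-print output above.
ROWS = """\
79.115.103.225   189.22.36.165    6     4381     22       371         6
189.22.36.165    79.115.103.225   6     22       4381     394         7
189.22.36.165    79.115.103.225   6     6667     4384     120         3
79.115.103.225   189.22.36.165    6     4384     6667     192         3
79.115.103.225   189.22.36.165    6     4382     22       11804       118
189.22.36.165    79.115.103.225   6     22       4382     12688       103
189.22.36.165    79.115.103.225   6     8888     4470     120         3
79.115.103.225   189.22.36.165    6     4470     8888     192         3
"""


def octets_by_victim_port(text, victim=VICTIM):
    """Sum octets and packets per port on the victim's side of each flow."""
    totals = defaultdict(lambda: [0, 0])
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7 or fields[0] == "srcIP":
            continue
        src_ip, dst_ip = fields[0], fields[1]
        src_port, dst_port, octets, packets = map(int, fields[3:])
        if src_ip == victim:
            port = src_port
        elif dst_ip == victim:
            port = dst_port
        else:
            continue  # flow does not involve the victim at all
        totals[port][0] += octets
        totals[port][1] += packets
    return {port: tuple(counts) for port, counts in totals.items()}


totals = octets_by_victim_port(ROWS)
for port, (octets, packets) in sorted(totals.items()):
    print(f"port {port}: {octets} octets in {packets} packets")
```

Even on this small sample, port 22 dwarfs the 6667 and 8888 flows, which is the pattern that made the SSH sessions look like more than password guessing.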

Using flow-print -f 5, I can view the timestamps and verify that the IRC activity started shortly after the SSH activity started using larger amounts of bandwidth.

Can I be certain that 79.115.103.225 is my attacker? No. Is this activity suspicious? Absolutely. I can examine the hacked machine, or a disk image thereof, and identify the account used to penetrate the machine.

This is not proof, but it’s a place to start. In assessing the rest of the data, I can now exclude this host. This will further reduce the pool of data I am assessing.

While I can’t use this as grounds for flying to Romania with body armor, a machine gun, and a machete, I can realistically act on this information. I can report the activity to the IP address owner. I can check my network for other connections from this host, and verify the integrity of any machines it’s connected to. I can use this as part of my business case to firewall off this part of the network. It will support my argument to forbid passwords for SSH connections on dev machines.

In retrospect, I could have made other assumptions that might have let me find this more quickly, e.g., I could have investigated the first hosts contacted on the questionable ports. But every puzzle is easy once you’ve solved it. After this, I’d have to say that backtracking intrusion vectors through flow data is very practical, even when you don’t have much experience.

January 10, 2011

dating an intrusion with flow data

One of my Ubuntu dev boxes was broken into. While the box isn’t vital, I’ll still need to reinstall an operating system and set it back up for the developer. I want to know where the attack came from and what the intruder did. I cannot trust the logs on the system, but I can trust the flow data from our upstream router.

I’ve changed my IP addresses, but remote addresses are left unchanged. Here I examine my flow data from 1 January 2011, and remove IP addresses I expect to contact this machine.

# flow-cat ft* | flow-nfilter -F ip-addr -v ADDR=189.22.36.165 | flow-print | grep -v mgmt.ip.ad.dr | grep -v dev.ip.add.dr | less
srcIP            dstIP            prot  srcPort  dstPort  octets      packets
189.22.36.165    208.83.20.130    6     60702    6667     198         3
189.22.36.165    208.83.20.130    6     60703    6667     196         3
208.83.20.130    189.22.36.165    6     6667     60702    152         3
208.83.20.130    189.22.36.165    6     6667     60703    152         3
...

So, what do I learn here? This system was compromised on or before New Year’s Day. 208.83.20.130 is an IRC server, and 6667 is an IRC port. Someone is using my system to play IRC games. Bastards. Other checks show that the intruders are also using port 7000.

I don’t know exactly when the system was compromised. Fortunately, I have my old flow records. I go back and check the first of each previous month, narrowing down the time window. 1 December looks like 1 January, but 1 November looks different:

srcIP            dstIP            prot  srcPort  dstPort  octets      packets
189.22.36.165    189.22.37.222    17    123      123      76          1
189.22.36.222    189.22.36.165    17    123      123      76          1
206.80.36.88     189.22.36.165    17    65015    5060     368         1
189.22.36.165    206.80.36.88     1     0        771      396         1
129.82.138.38    189.22.36.165    1     0        2048     28          1
...

The port 123 UDP traffic is NTP. And someone poked at me with a SIP client, but we didn’t answer. This is about what I’d expect to see on a machine sitting naked on the Internet.
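
The two protocol 1 (ICMP) rows deserve a note, because their "ports" aren't ports. As far as I know, NetFlow v5 exporters put the ICMP type and code into the destination port field as type × 256 + code, and flow-print shows that raw number. A small Python sketch of the decoding follows; decode_icmp and the short name table are mine, for illustration only, and the table covers just a few common values.

```python
# A few well-known ICMP (type, code) pairs; far from a complete list.
ICMP_NAMES = {
    (0, 0): "echo reply",
    (3, 3): "destination unreachable: port unreachable",
    (8, 0): "echo request",
    (11, 0): "time exceeded in transit",
}


def decode_icmp(dst_port):
    """Split a flow record's dstPort value into ICMP (type, code)."""
    return divmod(dst_port, 256)


for value in (771, 2048):
    icmp_type, icmp_code = decode_icmp(value)
    name = ICMP_NAMES.get((icmp_type, icmp_code), "unknown")
    print(f"dstPort {value}: type {icmp_type}, code {icmp_code} ({name})")
```

Read that way, 771 is a port unreachable message going back to the host that sent the SIP probe, which fits "we didn’t answer," and 2048 is somebody pinging the machine.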

Next, search to narrow down the time window. When I find the first day the IRC server traffic appears, I know when to start looking for the actual intrusion activity. When was the first day that port 6667 and 7000 traffic appeared? It was present on 1 December, but not 1 November. Check November 15: present, November 7: present, etc, etc. Eventually, I see the traffic is present on 9 November, but not on 8 November.

The intrusion happened on or before 9 November 2010.
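
The date hunt is a bisection, and it can be written down as one. The Python below is a minimal sketch under a big assumption: once the IRC traffic shows up, it shows up every day afterwards. The has_irc_traffic callback is hypothetical; in real life it would run the flow-cat pipeline against that day's flow files and report whether any port 6667 or 7000 flows turned up. Here a stand-in function fakes the answer so the sketch runs on its own.

```python
from datetime import date, timedelta


def first_bad_day(clean, dirty, has_irc_traffic):
    """Return the first date on which has_irc_traffic(day) is True.

    clean: a date known to show no IRC traffic
    dirty: a later date known to show it
    Assumes the traffic, once present, is present on every later day.
    """
    while (dirty - clean).days > 1:
        middle = clean + timedelta(days=(dirty - clean).days // 2)
        if has_irc_traffic(middle):
            dirty = middle   # traffic already present; look earlier
        else:
            clean = middle   # still clean; look later
    return dirty


def fake_check(day):
    """Stand-in for examining one day's flow files (hypothetical)."""
    return day >= date(2010, 11, 9)


print(first_bad_day(date(2010, 11, 1), date(2010, 12, 1), fake_check))
```

Starting from 1 November (clean) and 1 December (dirty), this needs five checks instead of up to thirty.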

Next I will examine the traffic for 8 and 9 November and see if I can determine where the intruders came from and their attack vector. I haven’t done that analysis yet, so who knows what I’ll find, if anything, but I’ll post on my efforts one way or another.

UPDATE: Oh, right, I’m an author. While I shouldn’t blatantly pimp myself out here, when I do an on-book-topic post, I should at least say “Hey, if you want to do this too, you can learn how by reading my newest book.” Sheesh. Being non-commercial is one thing, being actively daft is another.

January 5, 2011 (updated January 19, 2011)

mod_security rule upgrades and logging

I recently installed mod_security2 on my personal Web server to block out the most annoying referral spam. It blocked the worst offenders. Then I found that mod_security also included the ability to block all access from sites in a DNS-based RBL. This would further reduce my comment and referral spam problems, at the cost of making the site slightly slower. There are also rules to block SQL injection attacks and other known attack vectors. If you’re trying to read this from a machine on DNS blacklists, stop reading and go get yourself off the blacklist.

The RBL rules aren’t in the base FreeBSD package. They’re in the newer mod_security2 ruleset, available from mod_security’s SourceForge download page. Get the newest file.

Move your existing rules to a safe place, and put the new rules where the old rules were. Do not delete your old rules; you’ll want them for reference. In my case, the active rules directory is /usr/local/etc/apache22/Includes/mod_security2. I moved the existing directory to /usr/local/etc/apache22/Includes/old-mod_security2, created a new mod_security2 directory, and unzipped the rules there.

The rules directory contains an example rule file, modsecurity_crs_10_config.conf.example. Apache will read any file that ends in .conf as a config file, so copy (not move) that example to modsecurity_crs_10_config.conf. Edit that file to include changes from the original setup, e.g.:

SecRuleEngine On
SecDataDir /var/run/modsecurity

Copy your referer.conf referrer blacklist into the new rules directory. Then reload Apache. If Apache won’t restart, read the error messages and correct them.

Now that you have the base rules upgraded, you can add rules from the optional_rules directory. I specifically want the comment spam blocking, so I copied modsecurity_crs_42_comment_spam.conf to the main directory and reloaded Apache.

Then use wget to test my work, using one of the less offensive referral spam sites as a referrer. (I’ve changed the name of the site to avoid giving them any more links.)

avarice/tmp$ wget http://www.michaelwlucas.com/ --referer=http://www.fishingscum.com

--2011-01-04 17:04:06--  http://www.michaelwlucas.com/
Resolving www.michaelwlucas.com (www.michaelwlucas.com)... 198.22.63.8
Connecting to www.michaelwlucas.com (www.michaelwlucas.com)|198.22.63.8|:80... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2011-01-04 17:04:06 ERROR 500: Internal Server Error.

That’s what I want.

As I had to take the time to upgrade, I wanted to also get a log of what hits I was blocking. This only took adding two lines to the configuration:

SecDebugLogLevel 1
SecDebugLog /var/log/modsecurity.log

My wget request generated this log entry:

[04/Jan/2011:17:04:06 --0500] [www.michaelwlucas.com/sid#801948060][rid#801aa20a0][/][1] Access denied with code 500 (phase 2). Pattern match "fishingscum" at REQUEST_HEADERS:Referer. [file "/usr/local/etc/apache22/Includes/mod_security2/referer.conf"] [line "35"]

Setting SecDebugLogLevel to 2 gave me details on how mod_security2 processed its logs. That will be useful if I ever have to write my own mod_security2 rules. I suspect that if I have to do that, though, I’m solving the wrong problem.

One interesting thing I saw here was how the log statement in mod_security2 rules is applied. If you use the log keyword in a rule, a log message appears in the standard Apache access and error logs as well as the mod_security2 log. If you do not use the log statement, a message appears in the modsecurity log but not in the Apache logs. An anti-referral-spam rule should look like this:

SecRule REQUEST_HEADERS:REFERER "ezinearticles" deny,status:500

24 hours later, WordPress shows only 5 comments in my anti-spam queue. Another annoyance quashed.

UPDATE: More here.

December 6, 2010

TechChannel interview published

The video interview I did last month is now available on-line. It’s about NetFlow, and is based on the Network Flow Analysis book.

I can’t bring myself to watch it.

(Two posts in one day. This can’t be good.)

UPDATE: No, it’s not good. Apparently, WordPress doesn’t show the links on the front page, even though it shows the complete article. You must click to the individual article to see the link to the interview. I’m sure there’s a perfectly good reason WP behaves this way, but it still feels bogus.

December 6, 2010

RANCID, Mikrotik, and SSH

I’m a big fan of RANCID. While RANCID is best known as a management tool for automatically backing up Cisco configs, it also supports much other hardware, and is fairly easily extensible. I’m responsible for several Mikrotik routers, and need to back up their configurations. People have written scripts for Mikrotik support in RANCID… but they don’t work with SSH, only telnet. And they don’t work if you run SSH on an unusual port.

After trials, errors, advice from Chris Falz, and more errors and trials, I found that the following RANCID configuration works.

add password YourRouter YourPasswordHere
add user YourRouter YourUsername+ct
add method YourRouter ssh
add sshcmd YourRouter {/usr/local/scripts/microtiklogin.sh}
add noenable YourRouter {1}

Adding +ct to your username turns off color. Setting an SSH port in RANCID’s usual way didn’t work with the third-party mtlogin script, and the sshcmd variable doesn’t cope with spaces well, so I used an external SSH command script. This script is just:

#!/bin/sh
exec ssh -p PortNumber "$@"

My Mikrotik configs are now automatically backed up over SSH.

If you’re looking for a good Perl project, fixing the actual underlying mtlogin and mtrancid SSH functions would be appreciated.

December 2, 2010

Ubuntu server 10.04 LTS diskless filesystem

A diskless server needs a copy of the operating system files, served from an NFS server. The Ubuntu docs have a general-purpose tutorial on diskless systems, which suggests copying the files from your NFS server. My NFS servers are not Ubuntu boxes. Also, I don’t want to copy from a live system; too many things can happen. I want a set of Ubuntu server files that I can use to deploy a functional server in a known good state, that complies with the requirements of my environment. And I need to script it, so I can boot and update my “golden image” server and easily reproduce the same file set. And I want all the routine changes taken care of automatically.

This problem isn’t hard, but I’ve spent a fair amount of time building and rebuilding diskless systems lately, so you get to hear about it.

Install an actual Ubuntu system. I prefer to install on a virtual machine. This will become your “golden image.” When the Ubuntu installer asks for a machine profile, choose OpenSSH server.

apt-get update && apt-get upgrade
Install required software, such as emacs and tcsh, and configure it.
Install portmap and nfs-common.
Install and configure LDAP auth and sudo against LDAP.
Install and configure ufw. I’ve seen many attacks against Ubuntu boxes lately, and highly recommend very restrictive firewall rules. Do not let the world talk to your Ubuntu servers!
Make a VM snapshot of your base image, so you can revert to this core functionality.
Install anything else required to make this a nice clean template for the purpose of this server.

Now mount a directory on another server on the clean server’s /mnt via NFS and tar up the server.

# cd /
# tar -cvpf /mnt/ubuntu1004.tar --one-file-system .
Wait.

The resulting tarball has a few problems. I don’t want the diskless hosts to all have the same SSH keys, so those files need to be removed. Ubuntu caches the MAC address of attached NICs to maintain consistent interface names across reboots. This cached MAC address will be wrong for the diskless machine. The existing interface configuration will not work on a diskless machine (see below). Finally, the fstab is wrong for any diskless machine. The machine will get its hostname from DHCP, rather than from a file. I therefore remove the troublesome files from the tarball.

# tar --delete -f /mnt/ubuntu1004.tar ./etc/ssh/ssh_host_rsa_key ./etc/ssh/ssh_host_rsa_key.pub ./etc/ssh/ssh_host_dsa_key ./etc/ssh/ssh_host_dsa_key.pub ./etc/udev/rules.d/70-persistent-net.rules ./etc/fstab ./etc/network/interfaces ./etc/hostname
The difficult file is /etc/network/interfaces. I don’t want to use the server’s network configuration. My test server boots from either DHCP or with a static IP, and neither will work for a diskless server. A diskless server needs an /etc/network/interfaces like this:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual
I want to replace the existing ./etc/network/interfaces with one of my own choosing. Tar won’t let you replace a file in an existing archive, but it will let you add another file of the same name. I change to a config directory and add this file to my tarball. Similarly, I need a blank etc/fstab. I create a fake etc directory in another location, touch etc/fstab, and create a suitable etc/network/interfaces.

# tar --append -f /mnt/ubuntu1004.tar etc/network/interfaces etc/fstab
To use this file, log into the NFS server, go to the mount point for the diskless system, and run:

# tar -xpf /path/ubuntu1004.tar
The machine will then boot. It is easily cloned and built to my standards, and the only customization needed is to run dpkg-reconfigure openssh-server.

As I installed on a virtual server I can snapshot the golden image and build custom filesystems for different purposes.

Lots of long commands? Yep. This basically screams “8-line shell script, please.” It’s a pretty trivial script, but if you’ve made it this far, you’re either interested in what I’m doing or astonished at my inanity. In either case, you should get the script too.
#!/bin/sh

mount nfs1:/tmpmount /mnt
cd /
tar -cvpf /mnt/ubuntu1004.tar --one-file-system .

tar --delete -vf /mnt/ubuntu1004.tar ./etc/ssh/ssh_host_rsa_key ./etc/ssh/ssh_host_rsa_key.pub ./etc/ssh/ssh_host_dsa_key ./etc/ssh/ssh_host_dsa_key.pub ./etc/udev/rules.d/70-persistent-net.rules ./etc/fstab ./etc/network/interfaces ./etc/hostname

cd /home/mwlucas/fakeroot
tar --append -f /mnt/ubuntu1004.tar etc/network/interfaces etc/fstab

Yes, this shell script is a good example of fault-oblivious computing. But it suits my minimal needs, and performs the same task the same way every time.

November 24, 2010

“Page Cannot Be Displayed” and Internet Explorer

I detest this IE error message, especially when a user calls to complain that a Web site is down. Internet Explorer deliberately hides actual HTTP error messages on the grounds that the Web offers unfriendly but useful error messages. Apparently this generic message is much less likely to cause the user to flee in terror from insanity-inducing text such as “404 – Page Not Found.” They effectively shift the induced insanity from the end user to the sysadmin.

There’s a way to turn off this generic friendly message and replace it with the actual error. It’s under Tools -> Internet Options -> Advanced -> Browsing -> Show friendly HTTP error messages. Uncheck this and restart the browser to get user-hostile but troubleshooting-friendly error messages.

Every time I need this, I have to scramble to find it. Perhaps now that I’ve documented this, I’ll remember where it is. But I doubt it.

On an unrelated note: tomorrow is the Thanksgiving holiday in the US. I’d like to remind my readers that the holiday buffet is not a challenge, and that leaving food uneaten is not a threat to your masculinity (or femininity, or whatever).

November 19, 2010 (updated November 22, 2010)

Firewalling diskless Ubuntu

I have diskless Ubuntu 10.04 servers sitting naked on the Internet. They’re for internal use only, but I don’t have a firewall in that facility, so any firewalling must be done on the host itself. Ubuntu includes UFW, the “uncomplicated firewall,” a front end to iptables. I don’t know how anything can claim to make iptables uncomplicated, but I suppose nobody would use the tool if they called it “less appalling firewall.”

These servers need to be able to contact the Internet, to get updates and such, but nobody except myself and my coworkers need to access these servers. The coworkers and I only come from a limited range of IP addresses.

On a disk-based server, I would define rules in UFW and then run ufw default deny incoming, much like this:

# ufw enable
# ufw allow from 10.0.1.0/24
# ufw allow from 172.16.5.0/24
# ufw default deny

If you do this on a diskless Ubuntu server, the system loses disk — even if you have a rule that specifically permits access to the diskless server. The obvious thing to try is to rip out the “default deny” and replace it with a rule to block unwanted traffic at the end.

# ufw deny from 0.0.0.0/0

Your resulting rules look like this:

# ufw status
Status: active

To                         Action      From
--                         ------      ----
Anywhere                   ALLOW       10.0.1.0/24
Anywhere                   ALLOW       172.16.5.0/24
Anywhere                   DENY        Anywhere

This looks like it should work. I attempt to connect to the SSH server from an IP not in the permitted list, however, and can connect. It’s not blocking traffic from denied hosts. Huh?

Go to the file that contains the user rules, /lib/ufw/user.rules. This is actually a script to feed to iptables. There are several lines like this, one for each block of management addresses:

### tuple ### allow any any 0.0.0.0/0 any 10.0.1.0 in
-A ufw-user-input -s 10.0.1.0 -j ACCEPT

My last rule, however, looks different.

### tuple ### deny any any 0.0.0.0/0 any 0.0.0.0/0 in
-A ufw-user-input -j DROP

The “all other IP addresses” is probably implied in that last rule, but… it really couldn’t be that simple, could it? I edit the script to explicitly specify the source IP addresses:

-A ufw-user-input -s 0.0.0.0/0 -j DROP

and reboot.

And yes, it is that simple. The firewall comes up at boot. ufw status displays exactly the same rules as before. But now, I can only connect from my management IP addresses.

The problem with tools that make things “uncomplicated” is that rather than removing the underlying complexity, they hide it. I probably need to break down and learn iptables, but I think I’d rather figure out how to get these hosts behind a PF box.

November 15, 2010 (updated January 19, 2011)

mod_security on FreeBSD

The constant stream of referrer spam isn’t sufficiently annoying; no, now worms constantly nibble at my WordPress install. I could avoid worrying about this by, say, having a third party host my content and control my work, but if I did that I’d get a punch on both my geek card and my writer card. And I still wouldn’t know who is linking to me. Some of the referral spam I get hits 10-15 times a day, flooding actual links.

Fortunately, Apache’s mod_security can help lock down my server. While you’ll find tutorials on using mod_security to stop referrer spam, mod_security can do much more. Here I’m installing mod_security on my FreeBSD server running Apache 2.2.

# cd /usr/ports/www/mod_security
# make all install clean

Look in /usr/local/etc/apache22/Includes afterwards. You’ll find the file mod_security2.conf and the directory mod_security2. Initially, mod_security is loaded into Apache but doesn’t block anything. Go into the mod_security2 directory and edit the main config file, modsecurity_crs_10_config.conf. Change the SecRuleEngine to On, and create a SecDataDir, like so:

SecRuleEngine On
SecDataDir /var/run/modsecurity

You’ll need to create the security data directory and make it writable by Apache. Then restart Apache.

# mkdir /var/run/modsecurity
# chown www:www /var/run/modsecurity
# apachectl restart

Now test your Web server, and verify that it still functions. Bad Web applications can trip over mod_security2. If your Web app fails, I’d suggest talking to the vendor about why your application doesn’t work securely.

If your site still works with mod_security2, you can start to block referrers that bug you. In the mod_security2 directory, create the file referer.conf for rules to block bogus referrers. The rule has this general syntax:

SecRule REQUEST_HEADERS:REFERER "REGEX" deny,log,status:500

mod_security will evaluate each incoming request by its header. If the referrer matches the regular expression in quotes, the server will return a 500 error to the browser. The sample rules below show a small slice of the things I’m blocking.

…
SecRule REQUEST_HEADERS:REFERER "write\-a\-resume" deny,log,status:500
SecRule REQUEST_HEADERS:REFERER "wigmall" deny,log,status:500
SecRule REQUEST_HEADERS:REFERER "windowsphone" deny,log,status:500
SecRule REQUEST_HEADERS:REFERER "windows\-phone" deny,log,status:500
SecRule REQUEST_HEADERS:REFERER "zune" deny,log,status:500

It’s possible that this would block legitimate traffic, but I have a hard time imagining being linked from a weight loss or Windows Phone site. It’ll take a while to accumulate a list of suitable regexes for my site. And it’s a limited technique — I’m enumerating badness. But mod_security also protects me against the various WordPress worms, and it can also block traffic from addresses on an RBL. I’ll do that at a later date.
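
Because these patterns match anywhere in the Referer header, it’s worth trying a new regex against a few real referrers before it goes live. Here is a rough Python sketch of that check. Python’s re module is not the regex engine mod_security uses, so treat it as an approximation that should hold for simple patterns like these; the example.com, example.org, and example.net referrers and the first_match helper are invented for illustration.

```python
import re

# Patterns copied from the sample rules above.
PATTERNS = [
    r"write\-a\-resume",
    r"wigmall",
    r"windowsphone",
    r"windows\-phone",
    r"zune",
]


def first_match(referrer, patterns=PATTERNS):
    """Return the first pattern found anywhere in the referrer, else None."""
    for pattern in patterns:
        if re.search(pattern, referrer):
            return pattern
    return None


# Made-up referrers: two that should be blocked, one that should pass.
SAMPLES = [
    "http://www.example.com/windows-phone-deals",
    "http://www.example.org/blog/freebsd-notes",
    "http://www.example.net/zune-accessories",
]

for referrer in SAMPLES:
    print(referrer, "->", first_match(referrer) or "allowed")
```

Feeding it a day’s worth of referrers from the Apache access log would give a quick count of how many legitimate links a candidate pattern would have blocked.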

UPDATE: Your SecRule should not include the “log” keyword. See the later posting here.

UPDATE2: more here.

November 4, 2010

fixing ESXi “failed with error N7Vmacore15SystemExceptionE”

An ESXi server failed this morning. As there’s a couple critical services on this piece of hardware, the power in the new data center isn’t up to where we want it yet, and the radio said it was snowing near the office, I drove in expecting to find some unspeakable power situation. The power was fine, but the ESXi server was sitting at a panic screen. Power cycle the machine. It comes up, but none of the VMs start. The vSphere client won’t connect. The server Web page is blank.

Fortunately, tech support mode works. Hit alt-F1, type unsupported, and enter the root password when asked. Whenever I tried to connect to the server with vSphere, my “tail -f /var/log/messages” said something like:

Nov  4 23:35:09 Hostd: [2010-11-04 23:35:09.117 25233B90 warning 'Proxysvc Req00011'] 
Error reading from client while waiting for header: 
N7Vmacore15SystemExceptionE(Connection reset by peer)

This is not good. No, not good at all. I wanted to spend the day converting a machine from OpenSolaris to FreeBSD and installing my router for my new bandwidth. Instead Fate has decreed today Wedgie Day.

Mailing list archives and forum posts showed that many people have had this problem. Lots of the forums end with “did anyone ever solve this?” A few people reinstalled ESXi to solve the problem. A couple folks claimed it was a DNS issue.

Our DNS setup hadn’t changed, but I followed the advice and made the following changes.

In /etc/hosts, remove the real address for the machine and replace it with 127.0.0.1
Remove all DNS servers from /etc/resolv.conf

I rebooted. The machine came up, and the VMs started. Everything seems fine, but we’ll have to see what happens later.

I have no idea why this worked. Three cheers for “occult IT”! Sigh.