fixing ESXi “failed with error N7Vmacore15SystemExceptionE”

An ESXi server failed this morning.  As there are a couple of critical services on this piece of hardware, the power in the new data center isn’t up to where we want it yet, and the radio said it was snowing near the office, I drove in expecting to find some unspeakable power situation.  The power was fine, but the ESXi server was sitting at a panic screen.  Power cycle the machine.  It comes up, but none of the VMs start.  The vSphere client won’t connect.  The server Web page is blank.

Fortunately, tech support mode works.  Hit alt-F1, type unsupported, and enter the root password when asked.  Whenever I tried to connect to the server with vSphere, my “tail -f /var/log/messages” said something like:

Nov  4 23:35:09 Hostd: [2010-11-04 23:35:09.117 25233B90 warning 'Proxysvc Req00011'] 
Error reading from client while waiting for header: 
N7Vmacore15SystemExceptionE(Connection reset by peer)

This is not good.  No, not good at all.  I wanted to spend the day converting a machine from OpenSolaris to FreeBSD and installing my router for my new bandwidth.  Instead Fate has decreed today Wedgie Day.

Mailing list archives and forum posts showed that many people have had this problem.  Lots of the forum threads end with “did anyone ever solve this?”  A few people reinstalled ESXi to solve the problem.  A couple folks claimed it was a DNS issue.

Our DNS setup hadn’t changed, but I followed the advice and made the following changes.

  • In /etc/hosts, remove the real address for the machine and replace it with 127.0.0.1
  • Remove all DNS servers from /etc/resolv.conf
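
As a rough sketch, the two files end up looking something like the fragment below.  The host name esx1.example.com is made up for illustration, and the exact lines already present on any given ESXi install may differ.

```
# /etc/hosts
# (esx1.example.com is a hypothetical host name; the machine's real
# address used to be on this line and is replaced with 127.0.0.1)
127.0.0.1    localhost
127.0.0.1    esx1.example.com esx1

# /etc/resolv.conf
# (all "nameserver" lines removed, so nothing remains to query)
```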

I rebooted.  The machine came up, and the VMs started.  Everything seems fine, but we’ll have to see what happens later.

I have no idea why this worked.  Three cheers for “occult IT”!  Sigh.

nested pf.conf macros

Many of my FreeBSD servers are not behind a firewall.  They sit naked on the Internet, and I protect their services with PF.  I have several “trusted” networks, and want to use them in macros.  Keeping track of several networks in a macro is error-prone, however.  Previously, I used macros like this one:

#lucas_house=10.20.20.0/28
#main_office=192.168.1.0/25
#monitor=172.16.1.1
#boss_house=10.20.30.0/24
mgmt_networks ="{ 10.20.20.0/28, 192.168.1.0/25, 172.16.1.1, 10.20.30.0/24 }"

This meant entering each IP address twice.  Complicated numbers hurt my feeble brain, and the result is errors.  Entering each address multiple times is begging for an error.  I found that you can nest macros, however, with careful placement of single and double quotes.

lucas_house='"10.20.20.0/28"'
main_office='"192.168.1.0/25"'
monitor='"172.16.1.1"'
boss_house='"10.20.30.0/24"'
mgmt_networks ="{" $lucas_house $main_office $monitor $boss_house "}"

Note that each address is wrapped in double quotes ("), and the whole thing is then wrapped in single quotes (').  In the mgmt_networks macro, put double quotes around each of the enclosing braces, and leave the nested macros outside the quotes.  This is in the man page example, but you have to look very closely at it.

I can then allow SSH, SNMP, SIP, etc, from my management networks to the server, and my addresses will be consistent.
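
A minimal sketch of rules using the macro might look like the following.  The ext_if macro, the interface name em0, and the specific port numbers are illustrative assumptions rather than a real ruleset, and the fragment assumes mgmt_networks has already been defined earlier in pf.conf.

```
# Hypothetical external interface; substitute your own.
ext_if = "em0"

# Default deny on inbound traffic.
block in on $ext_if all

# SSH (tcp/22) from the management networks only.
pass in on $ext_if proto tcp from $mgmt_networks to ($ext_if) port 22

# SNMP (udp/161) and SIP (udp/5060) from the management networks only.
pass in on $ext_if proto udp from $mgmt_networks to ($ext_if) port { 161, 5060 }
```

Running "pfctl -nf /etc/pf.conf" parses the file without loading it, which is a cheap way to confirm the nested macros expand without a syntax error before the rules go live.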

opennebula with one iscsi target per VM

OpenNebula users know that NFS is just too slow for virtual machine disk images.  Fibre Channel works, but it is too expensive for me.  Rather than deal with disk image speed issues, I’m using NFS on ZFS for file storage and booting my systems diskless.  Diskless servers have a lot of advantages, but speed isn’t one of them.  This is fine for most applications, but a few things (databases come to mind) perform better on a speedy disk.  I want the ability to use diskless machines where appropriate, but use cheap networked disk when necessary.  Ideally, I want iSCSI on top of ZFS.  Short of ideal, I’ll take iSCSI any way I can get it.  I want the virtualization server to attach to the iSCSI target, and then offer that target to the VM as if it were a local disk.

There’s an alpha one-iSCSI-target-per-VM transfer manager driver.  It’s intended for a Linux iSCSI server, which I don’t have and don’t intend to run.  Instead, I have a stack of cheap NAS appliances.  Here’s how I got one target per VM running in my OpenNebula instance.

Network collisions running hosts under KVM

I use KVM and OpenNebula on Ubuntu for virtualization. Getting such a cluster up and running is easy, but making it perform well takes much more work.  Many times, the statement “my virtualization cluster works well” is equivalent to “I’m not paying attention.”  My FreeBSD hosts help point out problems, though.  All of my FreeBSD servers send me a daily email to tell me they’re still alive and to point out potential issues.  That’s how I found out I was getting network collisions on my virtualized hosts, and here’s how I investigated them.

Finding a SIP DoS attack via flow analysis

I’m leaving my getting-hit-in-the-head lesson when the boss calls.  Some unmentionable orifice is firing DoS attacks at a couple of our SIP servers.  My mission, should I choose to accept it, is to find and block the attackers.  (Should I choose to not accept it, then my mission will be to listen to Fearless Leader whine about it.  I can’t stand whining.)  Fortunately, I have flow data for one of the servers under attack.