Moving Static Sites from Apache to nginx

My more complex Web sites run atop WordPress on Apache and MySQL. Every so often, Apache devours all available memory and the server becomes very very slow. I must log in, kill Apache, and restart it. The more moving parts something has, the harder it is to debug. Apache, with all its modules, has a lot of moving parts.

After six months of intermittent debugging, I decided that with the new hardware I would switch Web server software, and settled on nginx. I’d like to switch to Postgres as well, but WordPress’s official release doesn’t yet support Postgres. WordPress seems to be the best of the available evils... er, Web site design tools. The new server runs FreeBSD 9/i386 on VMware ESXi. According to the documentation I’ve dug up, it should all Just Work.

Before making this kind of switch, check the nginx module comparison page. Look for the Apache modules you use, and see if they have an nginx equivalent. I know that nginx doesn’t use .htaccess for password protection; I must put my password protection rules directly in the nginx configuration. Also, nginx doesn’t support anything like the mod_security application firewall. I’ll have to find another way to deal with referrer spam, but at least the site will be up more consistently.

To start, I’m moving my static Web sites to the new server. (I’ll cover the WordPress parts in later posts.) I expect to get all of the functionality out of nginx that I have on Apache.

For many years, blackhelicopters.org was my main Web site. It’s now demoted to test status. Here’s the Apache 2.2 configuration for it.

<VirtualHost *:80>
    ServerAdmin webmaster@blackhelicopters.org
    DocumentRoot /usr/local/www/data/bh
    ServerName blackhelicopters.org
    ServerAlias www.blackhelicopters.org
    ErrorDocument 404 /index.html
    ErrorLog "|/usr/local/sbin/rotatelogs /var/log/bh/bh_error_log.%Y-%m-%d-%H_%M_%S 86400 -300"
    CustomLog "|/usr/local/sbin/rotatelogs /var/log/bh/bh_spam_log.%Y-%m-%d-%H_%M_%S 86400 -300" combined env=spam
    CustomLog "|/usr/local/sbin/rotatelogs /var/log/bh/bh_access_log.%Y-%m-%d-%H_%M_%S 86400 -300" combined env=!spam
Alias /awstatsclasses "/usr/local/www/awstats/classes/"
Alias /awstatscss "/usr/local/www/awstats/css/"
Alias /awstatsicons "/usr/local/www/awstats/icons/"
ScriptAlias /awstats/ "/usr/local/www/awstats/cgi-bin/"
<Directory "/usr/local/www/awstats/">
    Options None
    AllowOverride AuthConfig
    Order allow,deny
    Allow from all
</Directory>
</VirtualHost>

/usr/local/etc/nginx/nginx.conf is a sparse, C-style hierarchical configuration file. It’s laid out basically like this:

general nginx settings: pid file, user, etc.
http {
    various web-server-wide settings; log formats, include files, etc.
    server {
        virtual server 1 config here
    }
    server {
        virtual server 2 config here
    }
}

The first thing I need to change is the nginx error log. I rotate my web logs daily, and retain them indefinitely, in a file named by date. In Apache, I achieve this with rotatelogs(8), a program shipped with Apache. nginx doesn’t have this functionality; I must rotate my logs with an external script.

In the http section of the configuration file, I tell nginx where to put the main server logs.

http {
...
error_log /var/log/nginx/nginx-error.log;
access_log /var/log/nginx/nginx-access.log;

Define a virtual server and include the log statements:

http {
...
    server {
        server_name blackhelicopters.org www.blackhelicopters.org;
        access_log /var/log/bh/bh-access.log;
        error_log /var/log/bh/bh-error.log;
        root      /var/www/bh/;
    }
}

That brings up the basic site and its logs. I don’t need to worry about the referral spam log, as I cannot separate it out. nginx doesn’t need ServerAlias entries; just list multiple server names.

To test the basic site, make an /etc/hosts entry on your desktop pointing the site to the new IP address, like so:

139.171.202.40 www.blackhelicopters.org

Your desktop Web browser should use /etc/hosts over the DNS entry for that host, letting you call up the test site in your Web browser. Verify the site comes up, and that nginx is actually serving your content. Verify that the site’s access log contains your hits.

To rotate these logs regularly, create a script /usr/local/scripts/nginx-logrotate.sh.

#!/bin/sh

DATE=`date +%Y%m%d`

#main server
mv /var/log/nginx/nginx-error.log /var/log/nginx/nginx-error_$DATE.log
mv /var/log/nginx/nginx-access.log /var/log/nginx/nginx-access_$DATE.log

#bh.org
mv /var/log/bh/bh-error.log /var/log/bh/bh-error_$DATE.log
mv /var/log/bh/bh-access.log /var/log/bh/bh-access_$DATE.log

# FreeBSD's killall(1) treats -s as "show only"; name the signal directly.
killall -USR1 nginx

Run at 11:59 each night via cron(8).

59 23 * * * /usr/local/scripts/nginx-logrotate.sh

This won’t behave exactly like Apache’s rotatelogs. The current log file won’t have the date in its name. There will probably be some traffic between 11:59 PM and the start of the new day at 12:00 AM, so the last minute of each day’s traffic lands in the next day’s file. But it’s close enough for my purposes.

I must add entries for every site whose logs I want to rotate.
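Adding a pair of mv lines per site gets tedious. One way around that is to loop over a list of log files. What follows is only a minimal sketch under a few assumptions: the helper names rotate_log and rotate_all are made up for illustration, and every log name is assumed to end in .log.

```shell
#!/bin/sh
# Sketch: rotate a list of nginx logs by renaming name.log to name_DATE.log.
# rotate_log and rotate_all are illustrative names, not part of nginx.

# Rename one log file; quietly skip a file that does not exist yet.
rotate_log() {
    src=$1
    stamp=$2
    [ -f "$src" ] || return 0
    mv "$src" "${src%.log}_${stamp}.log"
}

# Rotate every log named after the date stamp; stop at the first failure.
rotate_all() {
    stamp=$1
    shift
    for log in "$@"; do
        rotate_log "$log" "$stamp" || return 1
    done
}

# Example use from cron; nginx reopens its log files on USR1:
#   rotate_all "$(date +%Y%m%d)" /var/log/nginx/nginx-error.log /var/log/bh/bh-access.log
#   killall -USR1 nginx
```

With this layout, a new site means one more path on the rotate_all line rather than another block of mv commands.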

Now there are the aliases. I don’t have awstats running on this new machine yet, but I want the Web server set up to support these aliases for later. Besides, you probably have aliases of your own you’d like to put in place. Define an alias within nginx.conf like so:

location ^~/awstatsclasses {
    alias /usr/local/www/awstats/classes/;
}
location ^~/awstatscss {
    alias /usr/local/www/awstats/css/;
}
location ^~/awstatsicons {
    alias /usr/local/www/awstats/icons/;
}

Finally, I need my home directory’s public_html available as http://www.blackhelicopters.org/~mwlucas/. This doesn’t update, but people link here. The following snippet uses nginx’s regex functionality to simulate Apache’s mod_userdir.

location ~ ^/~(.+?)(/.*)?$ {
    alias /home/$1/public_html$2;
    index  index.html index.htm;
    autoindex on;
}

For most sites, I would define a useful error page. The purpose of this site is to say “don’t look here any more, look at the new Web site,” so pointing 404s to the index page is reasonable. Define an error page like so:

error_page 404 /index.html;

The configuration for this entire site accumulates to:

server {
    server_name blackhelicopters.org www.blackhelicopters.org;
    access_log /var/log/bh/bh-access.log;
    error_log /var/log/bh/bh-error.log;
    root      /var/www/bh/;
    error_page 404 /index.html;
    location ^~/awstatsclasses {
        alias /usr/local/www/awstats/classes/;
    }
    location ^~/awstatscss {
        alias /usr/local/www/awstats/css/;
    }
    location ^~/awstatsicons {
        alias /usr/local/www/awstats/icons/;
    }
    location ~ ^/~(.+?)(/.*)?$ {
        alias /home/$1/public_html$2;
        index  index.html index.htm;
        autoindex on;
    }
}

While I’m happy with nginx performance so far, I’m only running a couple of static sites on it. The real test will start once I use dynamic content.

mirroring FreeBSD-9 disks with GPT

I recently tried to mirror my hard drives in a new machine. The Handbook instructions, and those in my own Absolute FreeBSD, didn’t work well. (The Handbook now warns about this in a big, friendly, hard-to-miss red box.) So how can I mirror my disk? By using per-partition mirroring rather than full-disk mirroring.

I should note up front that this article is the result of my own research and testing. I am not a filesystem developer. I’m not even a FreeBSD committer any more. You should check the FreeBSD Handbook for updated documentation before trying this approach.

First, I need to partition my disks identically. My system has two disks, da0 and da1. da0 has an installed system; da1 is blank. Use gpart(8) to copy the GPT.

# gpart backup da0 > da0.gpt

The file should look something like this:

# cat da0.gpt
GPT 128
1 freebsd-boot 34 128
2 freebsd-ufs 162 201326464
3 freebsd-swap 201326626 8388540

Now copy this to the second disk.

# gpart restore -F /dev/da1 < da0.gpt

The -F flag tells gpart to destroy any existing GPT on the target disk. Now verify the GPT on both disks.

# gpart show
=>       34  209715133  da0  GPT  (100G)
         34        128    1  freebsd-boot  (64k)
        162  201326464    2  freebsd-ufs  (96G)
  201326626    8388540    3  freebsd-swap  (4G)
  209715166          1       - free -  (512B)

=>       34  209715133  da1  GPT  (100G)
         34        128    1  freebsd-boot  (64k)
        162  201326464    2  freebsd-ufs  (96G)
  201326626    8388540    3  freebsd-swap  (4G)
  209715166          1       - free -  (512B)

These look pretty identical to me. I now have a separate device node for each partition on each disk. Mirroring these works much like mirroring an entire disk.
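Eyeballing two tables works for three partitions. For anything larger, comparing the gpart backup dumps is less error-prone. The following is a hedged sketch, not part of the procedure I tested: same_layout is a made-up helper name, and it assumes both dumps were written with gpart backup. If the source partitions carry GPT labels, the dumps may differ unless the labels were restored too.

```shell
#!/bin/sh
# Sketch: compare two partition-table dumps produced by "gpart backup".
# same_layout is an illustrative helper name.

# Succeeds (exit 0) only when both dump files are byte-for-byte identical.
same_layout() {
    cmp -s "$1" "$2"
}

# Example use on the FreeBSD host:
#   gpart backup da0 > da0.gpt
#   gpart backup da1 > da1.gpt
#   same_layout da0.gpt da1.gpt && echo "partition tables match"
```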

Unlike mirroring MBR disks, to mirror GPT partitions you must be in single-user mode. Now that GPT is the default partitioning scheme I’m sure someone will figure out a clever way around this, but for now reboot into single-user mode.

# gmirror label -vb round-robin p1 /dev/da0p1
# gmirror label -vb round-robin p2 /dev/da0p2
# gmirror label -vb round-robin p3 /dev/da0p3

/dev/da0p3 is the default swap partition. You must decide if you want to mirror your swap partitions as well. I’m choosing to do so. If one of my disks fails, the system has a fighting chance to continue running. Having two independent swap areas, one on each disk, means that a disk failure will yank a swap space out from under the otherwise-working system.

Now add the second disk’s partitions to your mirror devices.

# gmirror insert p1 /dev/da1p1
# gmirror insert p2 /dev/da1p2
# gmirror insert p3 /dev/da1p3

This will mirror your boot blocks, your root disk, and your swap space.

Now you must update /etc/fstab to boot from your mirror. The tricky bit here is that the system now has /dev/da0p2 mounted as root, read-only. You don’t want to write to /dev/da0p2 again; all writes should go to the mirror. Instead, mount /dev/mirror/p2 to a temporary location.

# mount /dev/mirror/p2 /mnt
# cd /mnt/etc
# cp fstab fstab-old
# ee fstab

vi requires a read-write /var/tmp, but ee works just fine.
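The edit itself is mechanical: each /dev/da0pN entry becomes the matching /dev/mirror/pN. As a rough sketch of that substitution (assuming the mirrors are named p1, p2, and p3 as above; to_mirror_fstab is a made-up helper name), sed can do the rewrite:

```shell
#!/bin/sh
# Sketch: rewrite fstab device names from the raw disk to the gmirror devices.
# Assumes mirrors named p1, p2, p3, so /dev/da0pN becomes /dev/mirror/pN.
# to_mirror_fstab is an illustrative helper name.

# Reads fstab text on stdin and writes the edited text to stdout.
to_mirror_fstab() {
    sed -E 's#^/dev/da0(p[0-9]+)#/dev/mirror/\1#'
}

# Example use from /mnt/etc, after saving fstab-old:
#   to_mirror_fstab < fstab-old > fstab
```

Read the result before rebooting; a wrong root entry in fstab means another trip to single-user mode.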

Let the system run until the disks are synchronized. Check your disk status with “gmirror status.”

# gmirror status
      Name    Status  Components
mirror/p1  COMPLETE  da0p1 (ACTIVE)
                     da1p1 (ACTIVE)
mirror/p2  DEGRADED  da0p2 (ACTIVE)
                     da1p2 (SYNCHRONIZING, 17%)
mirror/p3  DEGRADED  da0p3 (ACTIVE)
                     da1p3 (SYNCHRONIZING, 7%)

After your disks synchronize, reboot. Don’t just exit single-user mode, as you have the wrong root partition mounted. (I found that rebooting before mirror synchronization meant the system came up with a read-only /.)
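Rather than rerunning gmirror status by hand, a small filter can decide when it is safe to reboot. This is a sketch under assumptions, not something from the Handbook: all_mirrors_complete is a made-up name, and it relies only on the words DEGRADED and SYNCHRONIZING appearing in the status output as shown above.

```shell
#!/bin/sh
# Sketch: decide from "gmirror status" output whether every mirror is in sync.
# all_mirrors_complete is an illustrative helper name.

# Reads status text on stdin; succeeds only if at least one mirror is listed
# and no line mentions DEGRADED or SYNCHRONIZING.
all_mirrors_complete() {
    awk '
        /^[[:space:]]*mirror\// { seen = 1 }
        /DEGRADED|SYNCHRONIZING/ { bad = 1 }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}

# Example use: poll once a minute until the sync finishes.
#   until gmirror status | all_mirrors_complete; do sleep 60; done
```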

Your drives are now mirrored. Don’t forget to add mirror checks to your daily status mails. Add the following to /etc/periodic.conf:

daily_status_gmirror_enable="YES"

Hopefully, we’ll have a faster way to do this soon.

UPDATE 7/12/2011: You must have geom_mirror_load=YES in /boot/loader.conf to run gmirror commands.

Recovering from Failing to Mirror Disks on FreeBSD 9.0-RC2

I’m installing a new FreeBSD server, and want to mirror the root disks. According to the instructions in the Handbook and my own Absolute FreeBSD, it’s a simple process. The instructions are not valid for FreeBSD 9, however. It was late. I was tired. I tried anyway.

The first clue should have been that the disk devices now have different names. Rather than /dev/da0s1, they now look like /dev/da0p1. What difference does a letter make? Well, my test instance is virtualized. I took a snapshot and tried to follow the geom_mirror instructions, including updating /boot/loader.conf and /etc/fstab. My next boot failed with:

GEOM: mirror/gm0: corrupt or invalid GPT detected.
GEOM: mirror/gm0: GPT rejected -- may not be recoverable

Cue the familiar sinking feeling in my gut.

Reading the release notes tells me that the new installer writes GPT partitions. It’s about time this change was made; GPT has been used on non-x86 hardware for years now, and overcomes many of the limitations of MBR partitions.

But geom_mirror and GPT cannot coexist on whole disks. Both write to the last sector of the disk. If you follow the mirroring instructions in the Handbook, blithely ignoring the fact that the device names have changed, you overwrite the GPT.

Oops. I’d already installed a bunch of software on this machine. I’d rather not redo that. Let’s see how to recover.

Fortunately, recovery is fairly easy. Boot the installation CD, but rather than installing, choose the live system. You’ll get a command prompt.

Now look at your disk’s GPT.

# gpart show
=> 34 209715133 da0 GPT (100G) [CORRUPT]
...

The scary bit is the last word. You can’t boot a corrupt disk. Try to recover the GPT.

# gpart recover da0
da0 recovered
#

Run gpart show again. The word CORRUPT should be gone.

With the FreeBSD 9 installer, the disk’s root filesystem is /dev/da0p2.

# mount /dev/da0p2 /media
# cd /media/etc
# cp fstab-old fstab
# cd /media/boot
# vi loader.conf

Remove the geom_mirror_load=YES line from loader.conf.

Theoretically, you’ve recovered. Reboot, remove the live CD, and boot from the hard disk. If you were following the instructions, this error should be recoverable.

I must still figure out how to mirror my boot disk correctly, but this at least got the system back up.

notes from my FreeBSD and Nagios upgrade

My Nagios system ran FreeBSD-current/i386 from October 2010 and Nagios 3.0.6. Business factors drove me to make some changes, and I decided to upgrade the server before making those changes. Here are some things I observed. I don’t know if these are useful to you, but I’ll need them for other upgrades, so what the heck.

Back up before you start. (Yes, obvious, but everyone needs a reminder.)

Building 9-stable on a -current box that old is tricky. You have to do a variety of ugly things. So don’t. I NFS-mounted another machine running 9-BETA2/i386 and installed from that.

Remove the old libraries and obsolete programs from the core system. While you have a full backup, I find it useful to have a separate, convenient backup of removed libraries on the existing system.

# cd /usr/src
# make check-old-libs | grep '^/' | tar zcv -T - -f $HOME/2010Oct-old-libs.tgz
...
# yes | make delete-old-libs

In the event that I cannot recompile some program for FreeBSD 9, I can install the necessary libraries under /usr/lib/compat and get on with my life.

I ran portmaster -L > ports.txt to get a list of all installed software in hierarchical order, deleted what I didn’t need any longer, then used portmaster -d --no-confirm portname on my leaf ports.

I had trouble building a couple of ports. I elected to use packages for these ports. FreeBSD-9’s packages are built against Perl 5.12. In 2010, they were built against Perl 5.8. It was simpler to remove all Perl ports and reinstall them from scratch. The ports that were giving me trouble worked fine with the newer Perl.

Then there’s Nagios. Ah, there’s nothing like upgrading Nagios. Actually, the Nagios upgrade itself ran perfectly with portmaster. The problem with the upgrade is all of the additional NagiosExchange scripts I installed. Lots of them ran fine under Perl 5.8, but choked when run by Nagios in Perl 5.12. The problem scripts started with #!/usr/bin/perl -w. By removing the -w (warnings) flag, they ran under Nagios again.
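If many plugins need the same edit, it can be scripted. This is a hedged sketch rather than what I actually ran: strip_perl_w is a made-up helper name, it only touches the first line of a file, and it only removes a trailing -w.

```shell
#!/bin/sh
# Sketch: drop a trailing -w from a script's "#!...perl -w" first line.
# strip_perl_w is an illustrative helper name. Only line 1 is touched.

strip_perl_w() {
    file=$1
    tmp=$(mktemp) || return 1
    # Edit into a temp file, then copy the bytes back so the original
    # file keeps its owner and execute permission.
    sed '1s/^\(#!.*perl\) -w[[:space:]]*$/\1/' "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example use (the plugin directory is an assumption; adjust to your install):
#   for f in /usr/local/libexec/nagios/*.pl; do strip_perl_w "$f"; done
```

Removing -w hides the warnings rather than fixing whatever triggers them, so keep a copy of the original scripts.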

When you reactivate Nagios after this upgrade, either turn off email or redirect all email to /dev/null. Do not leave email on. Nagios might well generate spurious errors, spam your coworkers, and cause either alarm or annoyance, depending on their temperament.

Once I fixed all the scripts that were failing, Nagios generated intermittent errors. All of the scripts that failed were SNMP-based. I ran snmpwalks from the Nagios box, and they all died partway through. I ran tcpdump -vv -i em0 udp port 161 on the target machines, and saw that they all reported “bad UDP checksum.” The server was still running 9-BETA2. Rather than tracking down an error on an older version, I upgraded the system to 9-RC2. The problem disappeared. I dislike not understanding the problem’s cause, but obviously someone else fixed it between BETA2 and RC2.

The only plugins that still failed were check_snmp_proc and check_snmp_disk, from the nagios-snmp-plugins port. Every one of them failed consistently.

Running the plugins by hand showed that they were generating correct answers, but they were also picking up MIB file errors from my $MIBS and $MIBDIRS. I have a whole bunch of MIB files that I use for developing and testing Nagios plugins. I normally restart Nagios with sudo. On a hunch, I used su - to become root with a clean environment and restarted Nagios. The errors stopped, and Nagios ran perfectly.
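A lighter-weight alternative to su - is to drop only the two offending variables for the restart. The sketch below is an assumption-laden illustration: without_mibs is a made-up name, and the rc script path in the example is a guess at a typical ports install.

```shell
#!/bin/sh
# Sketch: run one command with MIBS and MIBDIRS removed from its environment,
# leaving the calling shell untouched. without_mibs is an illustrative name.

without_mibs() {
    (
        # The subshell keeps the unset from leaking into the caller.
        unset MIBS MIBDIRS
        exec "$@"
    )
}

# Example use (rc script path is an assumption for a FreeBSD ports install):
#   without_mibs sudo /usr/local/etc/rc.d/nagios restart
```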

I suspect that this is the same problem that broke the perl -w plugins. The newer Nagios apparently chokes on extra debugging output. I’ve gone through the release notes for the versions I skipped, but didn’t find that. In all fairness, I probably just missed it.

FreeBSD 9 PF macro & table changes

I secure my BSD servers with PF. In FreeBSD 9, PF has been updated to the same version as in OpenBSD 4.5.

I use lists in my PF configuration, as shown in this /etc/pf.conf snippet:

mgmt_hosts="{ 10.0.1.0/24, 172.19.8.0/24}"
...
pass in on $ext_if from $mgmt_hosts
...

When I have new management hosts, I add their IP address or subnets to the mgmt_hosts list. When PF reads this configuration file, every place that a rule references the list, an additional rule is created for each member of the list. Here, every subnet in the mgmt_hosts list gets a “pass in” rule. When I list these rules on a running FreeBSD 8 host, they’ll look something like this:

# pfctl -sr
...
pass in on em0 inet from 10.0.1.0/24 to any flags S/SA keep state
pass in on em0 inet from 172.19.8.0/24 to any flags S/SA keep state
...

Very useful for maintaining a readable rule file.

I updated a host to FreeBSD 9, and saw the following in my rules.

# pfctl -sr
...
pass in on vr0 inet from <__automatic_4c6aed29_0> to any flags S/SA keep state
...

Wait a minute. What is this __automatic crap? And where are my management hosts?

This version of PF automatically converts lists to tables. If you have a big rule set, using a table makes the rules shown by pfctl more readable. (I seem to recall that tables perform better than lists, but I can’t find a reference for that, so take that with a grain of sand.)

You can name tables, but tables created by PF have a name that starts with __automatic. To view all the tables, run:

# pfctl -sT
__automatic_4c6aed29_0
#

To see the hosts in this table, use pfctl’s -t and -T flags.

# pfctl -t __automatic_4c6aed29_0 -T show
10.0.1.0/24
172.19.8.0/24
#

Wow. This works, but it doesn’t look like fun. If I have to routinely type __automatic_4c6aed29_0, I will increase my subvocalized swearing by at least ten percent. But it does not interrupt service. Old rule sets continue to work. (I don’t mind needing to update my rules with a new OS version, but I need to know about it beforehand rather than just blindly updating.)

To make my life easier I can convert my PF rules to use tables instead of lists. Here’s the same pf.conf using a table instead of a list.

table <mgmt_hosts> const {10.0.1.0/24, 172.19.8.0/24}
...
pass in on $ext_if from <mgmt_hosts>
...

Unlike a list, a table is explicitly declared as a table. The name always appears in angle brackets.

I use the const keyword to tell PF that the contents of this table cannot be changed at the command line. PF tables can be adjusted at the command line without reloading the rules, which is a handy feature for, say, automatically blocking port scanners, feeding IDS data to your firewall, or DOSing yourself.
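For a rule file with many list macros, the declarations can be converted mechanically. The filter below is only a sketch: macro_to_table is a made-up name, it handles macros written on one line in the form name="{ ... }", and it always adds const, which may not be what you want for every table.

```shell
#!/bin/sh
# Sketch: turn a pf.conf list macro such as
#   mgmt_hosts="{ 10.0.1.0/24, 172.19.8.0/24}"
# into the equivalent const table declaration. macro_to_table is an
# illustrative helper name; lines that are not list macros pass through.

macro_to_table() {
    sed -E 's/^([A-Za-z_][A-Za-z0-9_]*)="\{[[:space:]]*(.*[^[:space:]])[[:space:]]*\}"$/table <\1> const { \2 }/'
}

# Example use:
#   macro_to_table < /etc/pf.conf > /tmp/pf.conf.new
```

Rules that reference $mgmt_hosts still need changing to <mgmt_hosts> by hand. Parse the result with pfctl -nf before loading it.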

When I look at my parsed rules now, I’ll see:

# pfctl -sr
...
pass in on vr0 from <mgmt_hosts> to any flags S/SA keep state
...

I can now read my rules more easily.

(Bootnote: OpenBSD just came out with 5.0, so FreeBSD 9 is five versions behind. OpenBSD PF develops quickly. But thirty-month-old PF is better than a lot of other firewall software.)

Installing a DragonFly BSD Jail

I’m installing a jail on a freshly upgraded DragonFly BSD 2.13-DEVELOPMENT box. There are instructions in the DragonFly manual, and on the Web site. They’re fine as far as they go, but to make the jail truly useful you need to do a little more.

Before starting, decide some important facts about your jail.

  • Root directory for the jail filesystem
  • IP address used by the jail
  • Hostname of the jail

    My jail hostname will be mwltest4, on the IP 192.0.2.9, in the directory /jail/mwltest4.

    A jail requires exclusive use of a single IP address. That IP must be bound to the server as an alias. Make an appropriate alias entry in /etc/rc.conf. Note that an alias needs an all-ones netmask. While we’re there, enable jails and tell the host server that we’re building the jail mwltest4.

    ifconfig_em0_alias0="inet 192.0.2.9 netmask 255.255.255.255"
    jail_enable="YES"
    jail_list="mwltest4"

    rc.conf also needs entries for each jail, so that the various jail management utilities can find and configure the jail.

    jail_mwltest4_rootdir="/jail/mwltest4"
    jail_mwltest4_hostname="mwltest4"
    jail_mwltest4_ip="192.0.2.9"

    Start by seeing what network ports your server listens on. I’ve removed all of the entries with remote addresses, because those are live network sessions; I’m only interested in what ports the server is listening on.

    # sockstat -4
    USER COMMAND PID FD PROTO LOCAL ADDRESS FOREIGN ADDRESS
    ...
    root sendmail 670 4 tcp4 127.0.0.1:25 *:*
    root sshd 656 5 tcp4 *:22 *:*

    Any entry where the local address is an asterisk followed by a colon and a port number will be a problem. We need to bind those daemons to the server’s main IP address. In this example, the only problem daemon is SSH. Bind SSH to a single IP address with a ListenAddress directive in /etc/ssh/sshd_config.

    ListenAddress 192.0.2.8

    Run /etc/rc.d/sshd restart, and sshd will bind only to the specified IP.
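To confirm nothing is still listening on every address, the sockstat output can be filtered instead of read by eye. This is a minimal sketch: wildcard_listeners is a made-up name, and it assumes the column layout shown above, with the local address in column 6 and the foreign address in column 7.

```shell
#!/bin/sh
# Sketch: list daemons still listening on every address, from "sockstat -4"
# output read on stdin. wildcard_listeners is an illustrative helper name.
# Column 6 is the local address and column 7 the foreign address.

wildcard_listeners() {
    awk '$6 ~ /^\*:/ && $7 == "*:*" { print $2, $6 }'
}

# Example use on the host, after restarting sshd:
#   sockstat -4 | wildcard_listeners
```

No output means every listener is bound to a specific address.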

    I want my jails on their own filesystem, so I create a new HAMMER PFS and a directory for this particular jail.

    # hammer pfs-master /jail
    Creating PFS #9 succeeded!
    /jail
    sync-beg-tid=0x0000000000000001
    sync-end-tid=0x00000001068ea510
    shared-uuid=34cc9fbe-ffc2-11e0-9527-010c29ce51d2
    unique-uuid=34cc9fdd-ffc2-11e0-9527-010c29ce51d2
    label=""
    prune-min=00:00:00
    operating as a MASTER
    snapshots directory defaults to /var/hammer/

    # mkdir /jail/mwltest4

    Now install the userland, exactly as per the jail instructions.

    # setenv D /jail/mwltest4
    # cd /usr/src/
    # make installworld DESTDIR=$D

    Go get more caffeine. By the time you return you should see:

    ===> etc
    ===> etc/sendmail
    install -o root -g wheel -m 644 /usr/src/Makefile_upgrade.inc /jail/mwltest4/etc/upgrade/
    #

    It finished successfully. Now install /etc.

    # cd etc/
    # make distribution DESTDIR=$D -DNO_MAKEDEV_RUN

    Now mount a device filesystem for the jail.

    # cd $D
    # ln -sf dev/null kernel
    # mount_devfs $D/dev

    Edit /etc/fstab to have the host mount the jail devfs whenever the system starts.

    devfs /jail/mwltest4/dev devfs rw 0 0

    Our jail should be ready. Start it in single-user mode.

    # jail /jail/mwltest4/ mwltest4 127.0.0.1,192.0.2.9 /bin/sh
    # uname -a
    DragonFly mwltest4 2.13-DEVELOPMENT DragonFly v2.13.0.49.gf6ce8-DEVELOPMENT #0: Tue Oct 18 10:51:40 EDT 2011 mwlucas@mwltest2.blackhelicopters.org:/usr/obj/usr/src/sys/GENERIC i386
    #

    Before starting your jail in multiuser mode:

  • enable SSH
  • configure /etc/resolv.conf
  • set a root password
  • add a user

    As I use LDAP for central account administration, but the jail isn’t yet LDAPilated, I manually set my new user ID to be identical to that on the host, and I add that account to the wheel group. Also modify /etc/ssh/sshd_config to listen only to the jail’s IP address. (While this isn’t strictly necessary, it will simplify managing the host server.)

    On the host, with my unprivileged account, I run:

    $ cp -rp .ssh /jail/mwltest4/usr/home/mwlucas/
    $ cp .cshrc /jail/mwltest4/usr/home/mwlucas/

    My jail account now has my authorized_keys file and my SSH configuration, with correct permissions, along with my preferred shell environment.

    Start the jail in multiuser mode:

    # /etc/rc.d/jail start mwltest4
    Configuring jails:.
    Starting jails: mwltest4.
    #

    I can now SSH to the jail, become root, and install pkgsrc.

    # cd /usr
    # make pkgsrc-create
    If problems occur you may have to rm -rf pkgsrc and try again.

    mkdir -p /usr/pkgsrc
    cd /usr/pkgsrc && git init
    git: not found
    *** Error code 127

    Stop in /usr.

    Crap. The DragonFly install installs git via package as part of the OS install. git is used for installing pkgsrc. You use pkgsrc to install git. How can we bootstrap git? pkg_radd lets you install remote packages, but it is built on pkg_add, part of pkgsrc.

    Find an FTP server (or mount an ISO) with the version of the scmgit package that runs on your host server. I wound up getting the scmgit-base-1.7.4.1 package from the 2011Q1 pkgsrc. This is the same package that was originally installed on my DragonFly machine, and it still runs on the DragonFly installed on this host, so it should be okay.

    # pkg_add -f -P /jail/mwltest4/ ftp://ftp.allbsd.org/pub/DragonFly/packages/i386/DragonFly-2.10/pkgsrc-2011Q1/devel/scmgit-base-1.7.4.1.tgz
    pkg_add: Warning: package `scmgit-base-1.7.4.1' was built for a platform:
    pkg_add: DragonFly/i386 2.10.0 (pkg) vs. DragonFly/i386 2.13 (this host)
    pkg_add: Warning: package `p5-Error-0.17016nb1' was built for a platform:
    pkg_add: DragonFly/i386 2.10.0 (pkg) vs. DragonFly/i386 2.13 (this host)
    ...

    You’ll see many more warnings. The package wants to install Tk and Python, but those packages are not available on this particular FTP server. But the -f flag means “Go ahead and install even if some dependencies are missing.” I use the -P flag to assign the package a new installation root directory in my jail’s root.

    Do I like these errors? No. But if I can install a working git, I can install pkgsrc and build a current package with all the dependencies. Log back into the jail and see if it works.

    # cd /usr
    # make pkgsrc-create
    If problems occur you may have to rm -rf pkgsrc and try again.

    mkdir -p /usr/pkgsrc
    cd /usr/pkgsrc && git init
    warning: templates not found /usr/pkg/share/git-core/templates
    Initialized empty Git repository in /usr/pkgsrc/.git/
    ...

    Wait a while, and you’ll have a working pkgsrc tree. From here, you can bootstrap pkgsrc:

    # cd /usr/pkgsrc/bootstrap
    # ./bootstrap
    # ./cleanup

    This gets you /usr/pkg/sbin/pkg_add.

    At this point, I consider my jail complete. While it doesn’t have all the third-party programs I need, I can now easily install them from within the jail, either from pkgsrc or with pkg_radd.

Upgrading DragonFly BSD

    I have two DragonFly BSD boxes that I want to upgrade to the latest rev. At the moment, they’re running:

    $ uname -a
    DragonFly screw.lodden.com 2.10-RELEASE DragonFly v2.10.1.1.gf7ba0-RELEASE #1: Mon Apr 25 19:48:10 UTC 2011 root@pkgbox32.dragonflybsd.org:/usr/obj/usr/src/sys/GENERIC i386

    Unlike most other BSDs, DragonFly uses git for source code management. DragonFly provides make wrappers for git updates, however. If you don’t have the source code already installed, get it with:

    $ cd /usr
    $ make src-create

    mkdir -p /usr/src
    cd /usr/src && git init
    Initialized empty Git repository in /usr/src/.git/
    cd /usr/src && git remote add origin git://git.dragonflybsd.org/dragonfly.git

    Walk away for a little while, and you’ll come back to see:

    ...
    Checking out files: 100% (31175/31175), done.
    Already on 'master'
    cd /usr/src && git pull
    Already up-to-date.
    $

    This will get you the latest DragonFly BSD source code.

    Before going any further, look at /usr/src/UPDATING. This contains warnings and instructions for avoiding bumps in the upgrade process. For example, as I write this the post-2.10 UPDATING notes list several ISA-only device drivers that have been removed from the system. If I was running on an ISA system, I’d care about that. But I’m not, so I don’t. On to building the system!

    $ cd /usr/src
    $ make buildworld

    Once your world is built, follow up with:

    $ make kernel
    $ make installworld

    Those of us from other BSDs would expect an etcmerge or mergemaster here, but DragonFly replaces that with:

    $ make upgrade

    The make upgrade process is much faster and less interactive than any merge tool.
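The whole sequence can be chained so that a failed step stops the run before anything is installed. What follows is a hedged sketch, not a DragonFly tool: run_targets is a made-up name, and the MAKE override exists only so the function can be tried without building anything.

```shell
#!/bin/sh
# Sketch: run a sequence of make targets in order and stop at the first
# failure. run_targets is an illustrative name; MAKE can be overridden for
# dry runs and defaults to "make".

run_targets() {
    for target in "$@"; do
        echo "==> make $target"
        "${MAKE:-make}" "$target" || {
            echo "make $target failed; stopping" >&2
            return 1
        }
    done
}

# Example use from /usr/src, in the order used above:
#   run_targets buildworld kernel installworld upgrade
```

Still read /usr/src/UPDATING first; a wrapper like this automates the typing, not the judgment.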

    After this is done, reboot. Log back in and you’ll find:

    $ uname -a
    DragonFly mwltest2.lodden.com 2.13-DEVELOPMENT DragonFly v2.13.0.49.gf6ce8-DEVELOPMENT #0: Tue Oct 18 10:51:40 EDT 2011 mwlucas@mwltest2.lodden.com:/usr/obj/usr/src/sys/GENERIC i386

    We’re running.

    My next task is to build a few jails and make them usable. But that’s for another post.

sudo environment purging and OpenSSH

    I recommend using sudo for privileged access to systems. I also recommend requiring keys for SSH authentication, with agent forwarding to trusted systems. The default settings in these two programs collide head-on when you become superuser via sudo and want to copy files from one server to another with scp or sftp.

    If you’re using an SSH agent, your environment contains the location of your authentication socket.

    # env | grep SSH
    SSH_CLIENT=192.0.2.2 51502 22
    SSH_CONNECTION=192.0.2.2 51502 198.0.2.10 22
    SSH_TTY=/dev/pts/1
    SSH_AUTH_SOCK=/tmp/ssh-aJpJNwwOTk/agent.35699
    #

    When you copy files with scp(1) or sftp(1), the client checks for an SSH authentication socket. If the client doesn’t find one, and the user account doesn’t have a private key on this system, and the remote server doesn’t support password auth, the client will not be able to log in.

    All as you would expect, right? But like any good firewall, sudo(8) removes all environment variables not explicitly permitted. To see what sudo(8) does to your environment, as well as all of sudo’s other settings, become root and run sudo -V.

    # sudo -V
    Sudo version 1.6.9p20

    Sudoers path: /usr/local/etc/sudoers
    Authentication methods: 'pam'
    Syslog facility if syslog is being used for logging: local2
    ...
    Environment variables to check for sanity:
    TERM
    LINGUAS
    LC_*
    LANGUAGE
    LANG
    COLORTERM
    Environment variables to remove:
    RUBYOPT
    RUBYLIB
    PYTHONINSPECT
    ...
    Environment variables to preserve:
    XAUTHORIZATION
    XAUTHORITY
    TZ
    PS2
    PS1
    PATH
    ...

    sudo sanity-checks some environment variables, deliberately strips others, and explicitly preserves a few.

    To use agent forwarding for SSH authentication while running as root, add the SSH environment variables to sudo’s configuration. While I could restrict this by groups, I’ll make this a default setting. Call up visudo and add a new default.

    Defaults env_keep += "SSH_CLIENT SSH_CONNECTION SSH_TTY SSH_AUTH_SOCK"

    Exit superuser, use sudo to become superuser again, and your environment will retain your SSH environment.
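A quick check confirms that the variable survived and still points at a real socket. This is a minimal sketch; have_agent_socket is a made-up helper name.

```shell
#!/bin/sh
# Sketch: check that SSH_AUTH_SOCK is set and points at a socket file.
# have_agent_socket is an illustrative helper name.

have_agent_socket() {
    [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]
}

# Example use after becoming superuser through sudo:
#   have_agent_socket && ssh-add -l
```

If the check passes, ssh-add -l should list the same keys it lists in your unprivileged session.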

    While sudo can preserve any environment variables you wish, sudo strips the environment for very good reasons. Don’t retain environment variables unless you’re sure what they will do. And don’t retain easily-abused environment variables, such as LD_PRELOAD. If the superuser needs dangerous environment variables, put them in a separate configuration file and source that file after becoming superuser.

DragonFly BSD Introduction

    As a long-time IT guy, I’ve grown accustomed to randomly discovering that the boss has purchased some new toy and wants me to put it into production. Usually, both the application and the underlying platform are completely incompatible with everything else we have. This demonstrates that one can grow accustomed to anything. This job is a little different, though. I came into the office to find that Fearless Leader installed a pair of new DragonFly BSD machines and left me a shopping list of stuff I was to accomplish on them.

    As surprises go, it could be a lot worse.

    Why did Fearless Leader do this? As with so much in this field, it started with annoyance: specifically, annoyance at ZFS requiring gigs and gigs of RAM for deduplication, even on OpenSolaris. HAMMER promised snapshots with more modest equipment requirements. This should help us sync multiple servers.

    For the most part, Dragonfly is configured just like any other BSD. I thought it might be worth giving a quick run-through on how to start with Dragonfly, however. Besides, this is the most interesting thing I’ve done for a while. (Debugging multicast on Ubiquiti radios is both tedious and unproductive.)

    First, let’s get these machines properly on the network. DHCP is fine for an install, but a server needs a static address. Dragonfly is based on the tail end of FreeBSD 4, also known as “what Lucas wrote his first tech book about,” so the configuration is fairly familiar. In /etc/rc.conf, add:

    hostname="red.example.com"
    ifconfig_em0="inet 192.0.2.151 netmask 255.255.255.128"
    defaultrouter="192.0.2.129"

    Reboot, and the network still works.
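
    If the network does not come back after the reboot, one thing worth checking is that the default router sits inside the subnet described by the address and netmask. Here is a rough sanity check, assuming plain IPv4 dotted-quad values without leading zeros; the function names ip_to_int and same_subnet are made up for this sketch.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if the address and router share a network under the given mask.
# Usage: same_subnet ADDRESS ROUTER NETMASK
same_subnet() {
    addr=$(ip_to_int "$1")
    router=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( addr & mask )) -eq $(( router & mask )) ]
}
```

    With the values above, same_subnet 192.0.2.151 192.0.2.129 255.255.255.128 succeeds, because both addresses fall in the upper half of 192.0.2.0/24.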

    One thing we noticed right away is that Dragonfly’s SSH server ships with passwords disabled. You must use public key auth or explicitly enable password auth. This presents a certain chicken-and-egg annoyance for us, because we distribute our public keys and our accounts via LDAP. When you install a Dragonfly machine, I suggest copying your authorized_keys file to the server before leaving the console.
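
    Copying the key over from the console might look something like the sketch below. The function name install_pubkey is hypothetical, and the 700/600 permissions are the conventional restrictive settings rather than anything specific to Dragonfly.

```shell
# install_pubkey is a hypothetical helper: append one public key file to a
# home directory's authorized_keys, using restrictive permissions.
# Usage: install_pubkey KEYFILE HOMEDIR
install_pubkey() {
    keyfile="$1"
    homedir="$2"
    mkdir -p "$homedir/.ssh" || return 1
    chmod 700 "$homedir/.ssh"
    cat "$keyfile" >> "$homedir/.ssh/authorized_keys" || return 1
    chmod 600 "$homedir/.ssh/authorized_keys"
}
```

    Run it as the account's owner, or fix ownership afterward, so that the SSH server will accept the file.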

    Now I need to install a whole bunch of software, such as text editors, an SNMP agent, and so on. Some of these programs will work as needed when installed from packages, but some will require special builds. I’ll start with the special builds. Dragonfly uses pkgsrc, NetBSD’s cross-platform ports project. I like a lot of things about pkgsrc, most obviously that it installs software in /usr/pkg. Install pkgsrc on your machine like so.

    # cd /usr/
    # make pkgsrc-create

    This downloads and installs the current pkgsrc tree. When complete, you can go to the package’s build directory and do the usual BSD-style bmake all install clean to install the package. (Note that you need bmake, not make.)

    If you’re happy with precompiled binary packages, just use pkg_radd.

    # pkg_radd net-snmp

    Wait a moment, and the package is installed from the remote FTP server.

    Both precompiled packages and packages you compile put their configuration information in /usr/pkg/etc.

    Other things I noticed:

  • Dragonfly has its own NTP daemon, dntpd. Enable it with dntpd_enable=YES in /etc/rc.conf. When you start dntpd, it forcibly syncs the clock if necessary.
  • Like FreeBSD, Dragonfly supports three firewall programs: ipfilter, IPFW, and PF. As of this writing, PF is based on OpenBSD 4.4.
  • Dragonfly still mounts /proc by default.
  • Both Fearless Leader and I noticed that Dragonfly feels fast. This is a purely subjective statement, but both of the new machines feel very responsive. I look forward to seeing how much our typical load slows them down.

    UPDATE: Hello, Reddit’s Teeming Hordes! I’m not sure why this blurb on my intro to Dragonfly was worthy of sharing, but never let it be said that I’m a churlish host.

    Book updates, August 2011

    I completed a first draft of the OpenSSH book last night around 10:30PM EDT. It’s out for tech edit now. At this point, I’m going systematically through the tech edits and making sure I’ve corrected the earlier chapters. After that, the manuscript goes to copyediting. Once copyedit is complete, I’ll release the ebook and start contracting out the POD version.

    I normally write both nonfiction and fiction simultaneously. When I get frustrated with one project, I switch to the other. The context switch clears my brain. When I return to the vexing project, I can approach the problem fresh and work through it quickly.

    I decided to do two nonfiction projects simultaneously this summer. In retrospect, this was a mistake. When I got frustrated with one project, I switched to the other… and found myself still frustrated. Perhaps I can do two nonfiction projects simultaneously, but OpenSSH and OpenBSD have a lot in common. One is just a subset of the other. My frustrations would probably be reduced if I knew what I was doing, but if I knew what I was doing, I wouldn’t write the book.

    Lesson learned. If I want to write two nonfiction books simultaneously, they must be wildly diverse.

    The OpenBSD book has therefore moved slowly. It’s further complicated by my move over the next couple of weeks. I’ll be full-out cranking on the OpenBSD book this fall, however.

    I predicted that the OpenSSH book would be 30,000 words. The first draft came in at 29,977 words. I am amazed; usually my books come in at 25-50% over the predicted word count. Perhaps I’m learning. But I’m probably just lucky.