New story: Savaged by Systemd

Yesterday, I put a short story up as an ebook. This was a wild experiment that I wrote on a whim.

When I say “wild experiment,” I don’t mean I decided to play with tenses and point of view. No, I decided to spend one day writing a lunatic piece, something that I’ve never written before.

Erotica. Sort of.

Computer erotica, to be specific.

Linux sysadmin erotica, to be more specific.

OK, fine. Systemd erotica.

It’s called Savaged by Systemd. And while it certainly contains erotic content, it’s got a bunch of other things in it too.

I try to wait to announce these things until they’re available in both print and ebook. But some weird things happened.

Here’s last night’s Amazon “hot new erotica releases.”

#2? This is madness.

Hot new releases only include titles released in the last little while. It doesn’t mean much compared to the general “books that are best-selling right now” list. So here’s a screenshot of Amazon’s erotica best-sellers from last night.

I’m #15. This, also, is madness.

But this book involves technical topics, so it’s also showing up in the Unix category. Here’s a screenshot of Amazon.ca’s Unix Hot New Best-Sellers.

Hitting #1 in the Hot New Unix Books category is no big deal for me.

This morning’s SF Erotica best-seller list?

Strangest of all?

Here’s my Amazon ebook sales dashboard for the last three months, displaying unit sales.

Those little yellow marks all along the bottom? That’s my living. I’m not complaining about it; I make a reasonable middle-class wage, at least compared to non-tech workers. Don’t get me wrong: if you offer me money, I’ll take it. Reader donations in the form of book sponsorships have helped me out in desperate times.

And Savaged by Systemd is only $2.99, so I don’t make nearly as much off of one of those as I do a tech book.

But still: that’s a mighty pretty looking set of spikes, over there on the right hand side.

And it’s getting reviews. Goodreads is famous for harsher ratings than most other book sites. Not only is SbS getting all five-star reviews there, it’s getting long, detailed reviews.

One of those reviews captures what I was trying to do.

Writers are pretty good about hanging on in the face of adversity. We’re accustomed to the larger world pretty much ignoring our feeble efforts.

Success disorients us.

My social media accounts are flooded. My email is worse than ever. Everybody on LinkedIn suddenly wants a piece of me. (I only LinkedIn with people I’ve actually worked with, btw.)

Despite the heady rush of literary success, I need to make some words. The deadline for the new Absolute FreeBSD is pretty inflexible, if I want it in print for BSDCan.

“Hi, I’m jkh and I’m a d**k”

I don’t do guest posts here. This blog is my private soapbox. You want to scream into the void? Go get your own soapbox.

Yesterday, I was privy to a private email message discussing a topic I care deeply about. I contacted the author and said “You really need to make this public and give this a wider audience.” His response boiled down to “if I wanted it to get a wider audience, I was welcome to do so myself.” So here’s my first ever guest post, from Jordan K Hubbard, one of the founders of the FreeBSD Project. While this discussion focuses on FreeBSD, it’s applicable to any large open source project.

The email discussion was about the FreeBSD Project recently giving someone the boot. I’m not linking to who it was; you can dig up the controversy elsewhere.

I did my first install of FreeBSD, version 2.0.5, in late 1995, and started reading the FreeBSD mailing lists shortly afterwards. That’s enough context to say that when jkh says he was a dick: yes. He was.

Like any good member of the press, I’ll give my anchor commentary after the footage. Again, it’s my soapbox.


My, what an interesting thread this has been, as well as an interesting (and probably controversial) recent talk by Benno on much the same topic.

I’m known for my long and overly verbose PhD thesis style postings, so I’ll try to make this one short(er) with a few pithy points:

1. Some of the FreeBSD project’s most energetic, motivated, and capable people have also, when viewed through the long lens of history, been total dicks, at least in electronic form. They just can’t seem to keep themselves from coming off that way, one person’s “passionate concern for topic X” being another person’s “totally over-the-top behavior concerning topic X”, with neither side usually having the benefit of all the information while they form conclusions about which of the two it is.

2. The project needs driven individuals capable of achieving “10X productivity” (a software industry term, not mine) in driving various agendas, inspiring others by their progress and allowing important opportunities for project growth to be seized rather than squandered, just as it needs nice, cooperative, team-players who go out of their way to avoid stepping on toes or driving more junior, perhaps easily intimidated, volunteers away from the project. It would also be awesome to find both attributes in the same people, obviously, but that goal is usually more aspirational than one immediately (if ever) achieved.

So, how to reconcile these two seemingly fundamentally opposed goals in project membership management? “What is core going to do about it?”*

First, let me be very honest: I can only speak to you from the perspective of someone who has committed many of the sins in paragraph 1.

I have said many things I subsequently regretted. I have engaged in furious, pitched battles over topics that subsequently proved to be almost nonsensically trivial. I have definitely alienated people. The fact that I have Asperger’s syndrome also made it easy for me to be both highly driven and insensitive to other people’s feelings at the same time (it helps when you don’t even notice them) but that’s certainly no excuse because I have also learned, along the way, to grasp intellectually what I did not always grasp instinctively: Just don’t be a dick. Take a deep breath, swallow the irritation that is often my first response, and try to figure out another way of expressing myself that will lead to a better long-term outcome with less friction with my colleagues / bosses / end-users.

Does that work 100%? Heck no, I’m still a work in progress, but I’m definitely better than I was 22 years ago, and pretty much the only reason that I’m better is that people took the time to talk to me about being a dick. They sent me (oh so many) private emails saying, in effect, “Dude! Really??” They called me on the phone when it was clear I really needed a Healthy Dose of Perspective and email just wasn’t doing the job. All of my fellow developers and colleagues (and yes, occasionally HR departments) have collectively conspired to slap me across the face with the Trout of Truth when it was clear I was going, or had gone, off the rails where interpersonal communications and decision making skills were concerned. I have, in short, learned some hard lessons about being more responsible for my actions on a number of levels and I’m glad I managed to stick around long enough to learn them. I am, as I said, a work in progress.

* That’s where all of you come in. You can’t just say “What is Core Doing About It?” when it comes to addressing problems like this, because by the time Core gets involved, it’s already too late. The damage is done and probably irrevocably so because it’s been done over a long period of time. People complained and complained and finally core wearily stepped in and pulled the trigger. Bang. Too late for anything else.

If you want better outcomes than this, then you simply need to start mentoring one another. You need to take extra time to call your fellow developers on the phone / Skype / WhatsApp / whatever works when it’s clear one of them is having a bad day, or escalating a situation that doesn’t warrant escalation, or simply being a dick when they don’t need to be (and probably don’t even realize they’re being one). We had that kind of close and frequent communication a lot in the early days of the project, and I absolutely know that it held things together through some rather tempestuous times. It’s also no excuse to say that the project is bigger and has outgrown this now, either, because it only takes one person to call one other person at the right time for ad-hoc mentorship to work. Don’t just wait until you see someone at the next conference. When it’s clear they are struggling to interact successfully with others in the here and now, reach out, just as so many reached out to me!

Please also take my word for it when I say that a truly successful FreeBSD project will continue to need driven people, people who are often tempted to drive right over others who won’t get out of their way or otherwise tend to show “less than perfect patience”, just as it will continue to need quieter folks who are content to follow someone else’s vision, assuming that there is one to follow, and instinctively do a better job of getting along with others. Each “type” can benefit and learn from the example the other provides, assuming there is a real commitment to doing so.

I’ll leave you with an analogy: This is like a marriage. If both partners are very passive, the marriage will probably be long-lived but rather boring, with both ultimately winding up just counting the days until death comes for them. If both are fiery and impetuous, the relationship will probably be exciting but equally short-lived. The most successful marriages are usually some combination of the two extremes, the worst impulses of one being kept reasonably in check while the other gets to experience new and exciting things they just wouldn’t have thought to do (or had the will to do) on their own. Assuming that both also commit to communicating on a frequent basis and don’t just assume Everything Is Fine, it works.

What kind of marriage do you folks want?


Jordan is absolutely right here.

The open source community has some incredibly smart people in it. You folks are brilliant.

When Jordan says that he’s a “work in progress,” though, that’s applicable to every one of us. Including myself.

I won’t say that the open source community is full of people with problems like ADD, Asperger’s, and so on. I will say that of the adults I know who have these conditions, I met every single one of them through the open source community. I strongly suspect they gravitate there because computers are easy compared to people.

Other groups have their own issues. The writers I know run heavily into depression and social anxiety. (I’m a writer and a techie. Thanks to my writing career and Amazon Prime’s free two-day shipping, I almost never leave the house. That’s just best.)

Brilliance is great. I admire really really smart people.

But to belong to a community, a person must be able to work with that community. I’m using “must” in the RFC sense here. It’s an absolute, non-negotiable requirement.

BSD, and open source in general, is full of brilliant but incomplete people. Everyone is incomplete. In open source, the incompleteness is often in social skills and the understanding of how to behave.

Social correction, and the establishment of social norms, comes only from the community. It’s entirely bottom-up. One on one.

While you can go to a counselor to help develop those skills, the best advice comes from peers who have been in your exact situation, who have faced those problems, and who have developed those skills.

Are you good at communicating in your open source community? Is there another contributor you like, but who has social problems? Unofficially mentor them.

Are you an open source contributor who keeps getting messages from people saying something like “Dude, that’s really messed up,” or “You were really inappropriate here, stop it,” or similar? One message might not be a big deal. But if you keep getting them, it’s a sign that you’re missing a skill. A skill that can be learned. If someone you get along with offers to help: listen.

And it’s far better done via voice than electronic text. Text communication strips vital context, and it’s much much slower than voice. If a person has problems communicating via email, more email isn’t going to solve it.

One of the hardest things to do is listen when someone calls you a dick. Yes, it’s happened to me. When it comes from people I respect, I listen. It makes me less incomplete.

And if Jordan can learn to not be a dick, anyone can.

So why am I not naming the person who got booted from FreeBSD? Because he, like everyone else, is an incomplete person who lacks a particular skill. I hope he will develop that skill. And I don’t want a blog post from 2017 to hurt his chances of getting a job in 2037, or even 2018, when he’s had an opportunity to add those skills.

You have the power to make that brilliant but poorly socialized contributor a better community member. Even if that brilliant member is you.

See Me in 2016

I have two more public appearances in 2016.

October 7-8, I’ll be at Ohio LinuxFest. They’ve asked me to speak on Introducing ZFS.

November 8, mug.org has invited me to talk about PAM. This is election day in the United States, so the talk is on how PAM is Un-American.

Sadly, family commitments prevent me from going to MeetBSD in Berkeley. Plus, there’s the whole “get on a plane” thing, which I try really really hard to avoid. I’d probably do it to see Berkeley, though. I’m pretty sure a pilgrimage to Berkeley is required once during my lifetime.

Other than that, you can catch me at a Semibug meeting.

Cover reveal for “PAM Mastery”

For the first Tilted Windmill Press tech books, I elected to create covers from photographs. Some went over well, some less so.

For the FreeBSD Mastery books, I persuaded Eddie Sharam to create parodies of classic art. It’s far more expensive than photos, but reaction has been positive.

PAM Mastery is almost ready to go to copyedit, which means I need a cover for it. I’ve elected to continue the parody art. Without further ado, here’s the cover.

Sysadmin Gothic

I’ve gotten some great feedback from DES, author of OpenPAM, and need to incorporate that into the manuscript. Once that’s complete, I can send it to copyedit!

Installing and Using Tarsnap for Fun and Profit

Well, “profit” is a strong word. Maybe “not losing money” would be a better description. Perhaps even “not screwing over readers.”

I back up my personal stuff with a combination of snapshots, tarballs, rsync, and sneakernet. This is fine for my email and my personal web site. Chances are, if all four of my backup sites are simultaneously destroyed, I won’t care.

A couple years ago I opened my own ecommerce site, so I could sell my self-published books directly to readers. For the record, I didn’t expect Tilted Windmill Press direct sales to actually, y’know, go anywhere. I didn’t expect that people would buy books directly from me when they could just go to their favorite ebookstore, hit a button, and have the book miraculously appear on their ereader.

I was wrong. People buy books from me. Once every month or two, someone even throws a few bucks in the tip jar, or flat-out overpays for their books. I am pleasantly surprised.

So: I was wrong about self-publishing, and now I was wrong about author direct sales. Pessimism is grand, because you’re either correct or you get a pleasant surprise. Thank you all.

But now I find myself in a position where I actually have commercially valuable data, and I need to back it up. Like a real business. I need offsite backups. I need them automated. And I need to be able to recover them, so that people who have bought my books can continue to download them, in the off chance that the Detroit area is firebombed off the Earth while I’m at BSDCan.

So it’s time for Tarsnap.

Why Tarsnap?

  • It works very much like tar, so I don’t have to learn any new command-line arguments. (If you’re not familiar with tar, you need to be.)
  • The terms of service are readable by human beings and more reasonable than other backup services.
  • The code is open and auditable.
  • When Tarsnap’s author screws up, he admits it and handles it correctly.
  • It’s cheap. Any backup priced in picodollars gets my attention.

    I also see the author regularly at BSD conferences. I can slap him in person if he does anything truly daft.

    Tarsnap has a quick Getting Started page. We’ll do the easy things first. Sign up for a Tarsnap account. Once your account is active, put some money in it–$5 will suffice.

    Now let’s check your prerequisites. You need:

  • GnuPG

    BSD systems come with everything else you need.

    Linux users must install:

  • a compiler, like gcc or clang
  • make
  • OpenSSL (including header files)
  • zlib (including header files)
  • system header files
  • the ext2fs/ext2_fs.h header file (not linux/ext2_fs.h)

    The Tarsnap download page lists specific packages for Debian-based and Red Hat-based Linuxes.

    Go to the download page and get both the source code and the signed hash file. Tarsnap is only available as source code, so that you can verify the code integrity yourself. So let’s do that.
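
    If you’d rather grab them from the command line, something like this should work; the URLs here are my guess at the current download layout, so check the download page for the real ones.

    # wget https://www.tarsnap.com/download/tarsnap-autoconf-1.0.35.tgz
    # wget https://www.tarsnap.com/download/tarsnap-sigs-1.0.35.asc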

    Start by using GnuPG to verify the integrity of the Tarsnap code. If you’re not familiar with GnuPG and OpenPGP, some daftie wrote a whole book on PGP & GPG. Once you install GnuPG, run the gpg command to get the configuration files.

    # gpg
    gpg: directory `/home/mwlucas/.gnupg' created
    gpg: new configuration file `/home/mwlucas/.gnupg/gpg.conf' created
    gpg: WARNING: options in `/home/mwlucas/.gnupg/gpg.conf' are not yet active during this run
    gpg: keyring `/home/mwlucas/.gnupg/secring.gpg' created
    gpg: keyring `/home/mwlucas/.gnupg/pubring.gpg' created
    gpg: Go ahead and type your message ...
    ^C
    gpg: signal Interrupt caught ... exiting

    Hit ^C. I just wanted the configuration and key files.

    Now edit $HOME/.gnupg/gpg.conf. Set the following options.

    keyserver hkp://keys.gnupg.net
    keyserver-options auto-key-retrieve

    See if our GPG client can verify the signature file tarsnap-sigs-1.0.35.asc.

    # gpg --decrypt tarsnap-sigs-1.0.35.asc
    SHA256 (tarsnap-autoconf-1.0.35.tgz) = 6c9f6756bc43bc225b842f7e3a0ec7204e0cf606e10559d27704e1cc33098c9a
    gpg: Signature made Sun Feb 16 23:20:35 2014 EST using RSA key ID E5979DF7
    gpg: Good signature from "Tarsnap source code signing key (Colin Percival) " [unknown]
    gpg: WARNING: This key is not certified with a trusted signature!
    gpg: There is no indication that the signature belongs to the owner.
    Primary key fingerprint: 634B 377B 46EB 990B 58FF EB5A C8BF 43BA E597 9DF7

    Some interesting things happen here. The most important line is the statement ‘Good signature from “Tarsnap source code signing key”.’ Your GPG program grabbed the source code signing key from a public key server and used it to verify that the signature file has not been tampered with.

    As you’re new to OpenPGP, this is all you can do. You’re not attached to the Web of Trust, so you can’t verify the signature chain. (I do recommend that you get an OpenPGP key and collect a few signatures, so you can verify code signatures if nothing else.)
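
    If GnuPG doesn’t fetch the key automatically, you can pull it from the keyserver by hand, using the key ID from the signature output, and compare the fingerprint it prints against the one shown above.

    # gpg --recv-keys E5979DF7
    # gpg --fingerprint E5979DF7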

    Now that we know the signature file is good, we can use the cryptographic hash in the file to validate that the tarsnap code we downloaded is what the Tarsnap author intended. Near the top of the signature file you’ll see the line:

    SHA256 (tarsnap-autoconf-1.0.35.tgz) = 6c9f6756bc43bc225b842f7e3a0ec7204e0cf606e10559d27704e1cc33098c9a

    Use the sha256(1) program (or sha256sum, or shasum -a 256, or whatever your particular Unix calls the SHA-256 checksum generator) to verify the source code’s integrity.

    # sha256 tarsnap-autoconf-1.0.35.tgz
    SHA256 (tarsnap-autoconf-1.0.35.tgz) = 6c9f6756bc43bc225b842f7e3a0ec7204e0cf606e10559d27704e1cc33098c9a

    The checksum in the signature file and the checksum you compute match. You have valid source code, and can proceed.

    Extract the source code.

    # tar -xf tarsnap-autoconf-1.0.35.tgz
    # cd tarsnap-autoconf-1.0.35
    # ./configure
    ...
    configure: creating ./config.status
    config.status: creating Makefile
    config.status: creating config.h
    config.status: executing depfiles commands
    #

    If the configure script ends any way other than this, you’re on Linux and didn’t install the necessary development packages. The libraries alone won’t suffice; you must have the development versions.

    If configure completed, run

    # make all install clean

    Tarsnap is now ready to use.

    Start by creating a Tarsnap key for this machine and attaching it to your Tarsnap account. Here I create a key for my machine www.

    # tarsnap-keygen --keyfile /root/tarsnap.key --user mwlucas@michaelwlucas.com --machine www
    Enter tarsnap account password:
    #

    I now have a tarsnap key file. /root/tarsnap.key looks like this:

    # START OF TARSNAP KEY FILE
    dGFyc25hcAAAAAAAAAAzY6MEAAAAAAEAALG8Ix2yYMu+TN6Pj7td2EhjYlGCGrRRknJQ8AeY
    uJsctXIEfurQCOQN5eZFLi8HSCCLGHCMRpM40E6Jc6rJExcPLYkVQAJmd6auGKMWTb5j9gOr
    SeCCEsUj3GzcTaDCLsg/O4dYjl6vb/he9bOkX6NbPomygOpBHqcMOUIBm2eyuOvJ1d9R+oVv
    ...

    This machine is now registered and ready to go.

    This key is important. If your machine is destroyed and you need access to your remote backup, you will need this key! Before you proceed, back it up somewhere other than the machine you’re backing up. There’s lots of advice out there on how to back up private keys. Follow it.
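
    As a minimal example, copying it to a second machine at least gets it off this box; the host and filename here are placeholders.

    # scp /root/tarsnap.key backup@offsite.example.org:tarsnap-www.key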

    Now let’s store some backups in the cloud. I’m going to play with my /etc/ directory, because it’s less than 3MB. Start by backing up a single directory.

    # tarsnap -c -f wwwetctest etc/
    Directory /usr/local/tarsnap-cache created for "--cachedir /usr/local/tarsnap-cache"
                           Total size  Compressed size
    All archives              1996713           382896
      (unique data)           1946025           366495
    This archive              1996713           382896
    New data                  1946025           366495

    Nothing seems to happen on the local system. Let’s check and be sure that there’s a backup out in the cloud:

    # tarsnap --list-archives
    wwwetctest

    I then went into /etc and did some cleanup, removing files that shouldn’t have ever been there. This stuff grows in /etc on any long-lived system.

    # tarsnap -c -f wwwetctest-20140716-1508 etc/
                           Total size  Compressed size
    All archives              3986206           765446
      (unique data)           2120798           403833
    This archive              1989493           382550
    New data                   174773            37338

    # tarsnap --list-archives
    wwwetctest
    wwwetctest-20140716-1508

    Note the New data line. The new archive itself is almost the same size as the first one, but tarsnap only uploaded and stored about 37 kB: the differences between the two backups.

    If you want more detail about your listed backups, add -v to see the creation date. Add a second -v to see the command used to create the archive.

    # tarsnap --list-archives -vv
    wwwetctest 2014-07-16 15:02:41 tarsnap -c -f wwwetctest etc/
    wwwetctest-20140716-1508 2014-07-16 15:09:38 tarsnap -c -f wwwetctest-20140716-1508 etc/

    Let’s pretend that I need a copy of my backup. Here I extract the newest backup into /tmp/etc.

    # cd /tmp
    # tarsnap -x -f wwwetctest-20140716-1508

    Just for my own amusement, I’ll extract the older backup as well and compare the contents.

    # cd /tmp
    # tarsnap -x -f wwwetctest

    The files I removed during my cleanup are now present.

    What about rotating backups? I now have two backups. The second one is a differential backup against the first. If I blow away the first backup, what happens to the newer one?

    # tarsnap -d -f wwwetctest
                           Total size  Compressed size
    All archives              1989493           382550
      (unique data)           1938805           366149
    This archive              1996713           382896
    Deleted data               181993            37684

    It doesn’t look like it deleted very much data. And indeed, a check of the remaining archive shows that all my files are there.

    And now, the hard part: what do I need to back up? That’s a whole separate class of problem…
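
    Automating the runs, at least, is the easy half. Here’s a minimal sketch of a nightly cron job; the schedule, archive name, and backup set are placeholders, the install path is the usual default, and % must be escaped in crontabs.

    0 3 * * * /usr/local/bin/tarsnap -c -f www-etc-$(date +\%Y\%m\%d) /etc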

ifup-local on bridge members on CentOS

    I run a bunch of CentOS 6 physical servers as QEMU virtualization devices. These hosts have two NICs, one for management and one for virtual machine bridges.

    When you use Linux for virtualization, it’s important to increase the amount of memory for network transmit and receive buffers. You also need to disable GSO and TSO, to improve performance and to avoid gigabytes of kernel error messages every day. You can do this with ethtool(8). First, let’s check the existing ring sizes.

    # ethtool -g eth0
    Ring parameters for eth0:
    Pre-set maximums:
    RX: 16384
    RX Mini: 0
    RX Jumbo: 0
    TX: 16384
    Current hardware settings:
    RX: 512
    RX Mini: 0
    RX Jumbo: 0
    TX: 512

    Similarly, use ethtool -k eth0 to check GSO and TSO settings.
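
    The relevant lines of that output will look something like this; the exact feature list varies by driver and ethtool version.

    # ethtool -k eth0 | egrep 'segmentation'
    tcp-segmentation-offload: on
    generic-segmentation-offload: on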

    The card is using much less memory than it can. When you have a bunch of virtual machines pouring data through the card, you want the card to work as efficiently as possible. Fixing this on a running system is easy enough:

    # ethtool -G eth0 tx 16384 rx 16384
    # ethtool -K eth0 gso off tso off

    Repeat the process for eth1.

    How do you make this happen automatically at boot? Adding the commands to /etc/rc.local isn’t reliable. By the time the system gets that much stuff running, the ethtool command might fail with a “Cannot allocate memory” error. If you try again it’ll probably work, but it’s not deterministic. And I’m against running a single command four times in rc.local in the hopes that one of them will work.

    Enter /sbin/ifup-local. CentOS runs this script after bringing up an interface, with the interface name as an argument. The problem is, it doesn’t run this script on bridge member interfaces. We can adjust eth0 and br0 at boot just fine, but eth1 (the physical interface underlying br0) doesn’t get run.

    You can’t run ethtool -G br0 tx 16384 rx 16384; interface br0 doesn’t have any transmit or receive rings. It’s a logical interface. You can disable TSO and GSO on br0, but that won’t disable them on eth1. You can’t wait and reconfigure eth1 in rc.local, because increasing the memory doesn’t always work once the system is running full-out multiuser. And Red Hat says this is by design. Apparently network bridges on CentOS/Red Hat are supposed to perform poorly. That’s good to know.

    So, what to do?

    I adjust the eth1 ring size in ifup-local when bringing up br0, but before any processes send any traffic over the bridge. My /sbin/ifup-local looks like this:

    #!/bin/bash

    case "$1" in
    eth0)
        echo "Configuring eth0..."
        /sbin/ethtool -G eth0 tx 16384 rx 16384
        /sbin/ethtool -K eth0 gso off tso off
        ;;
    br0)
        echo "Configuring br0..."
        /sbin/ethtool -G eth1 tx 16384 rx 16384
        /sbin/ethtool -K eth1 gso off tso off
        /sbin/ethtool -K br0 gso off tso off
        ;;
    esac
    exit 0

    This appears to work consistently. Of course, the values for the NIC need to be set on a per-machine basis. I have Ansible do that work for me.
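
    You can sanity-check the script by invoking it by hand, exactly as the network scripts do, and then verifying that the ring sizes took.

    # /sbin/ifup-local br0
    Configuring br0...
    # ethtool -g eth1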

    Hopefully, this will save someone else the pain I’ve been through trying to make this work…

iptables and ipsets

    I’m dragging my work environment from “artisan system administration” to mass-managed servers. Part of this is rationalizing, updating, and centralizing management of packet filter rules on individual hosts. Like many environments, ours has a list of “management IP addresses” with unlimited access to every host. Managing this is trivial on a BSD machine, thanks to pf.conf’s ability to include an outside file: you upload the new file of management addresses and run pfctl to read it. A PF rules file looks something like this:

    ext_if="em0"
    include "/etc/pf.mgmt.conf"
    ...
    pass in on $ext_if proto icmp from any to any
    #mgmt networks can talk to this host on any service
    pass in on $ext_if from <mgmt> to any
    ...

    The file pf.mgmt.conf looks like this:

    table <mgmt> const { 192.0.2.0/24, 198.51.100.128/25 }

    When I add new management addresses I copy pf.mgmt.conf to each machine, run pfctl -f /etc/pf.conf, and the new addresses can connect.
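
    Pushing an update to a pile of hosts is a one-liner; the hostnames here are placeholders.

    # for h in www mail ns1; do scp /etc/pf.mgmt.conf $h:/etc/ && ssh $h pfctl -f /etc/pf.conf; done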

    But surely there’s some similar function on a Linux box?

    To complicate matters further, our environment includes both Ubuntu and CentOS machines. (Why? Because we don’t run operating systems, we run applications, and applications get picky about what they run on.) Each distribution has its own way of saving and restoring iptables rules, and I want to use the same method for both operating systems. What we’ve used is a single rules file, /etc/iptables.rules, read by iptables-restore at boot. We specifically don’t want to trust a copy of the packet filter rules saved by the local machine, as problems can persist across reboots. The current iptables.rules looks something like this:

    *filter
    #mgmt addrs
    -A INPUT -s 192.0.2.0/24 -i eth0 -j ACCEPT
    -A INPUT -s 198.51.100.128/25 -i eth0 -j ACCEPT
    #keep state
    -A INPUT -p tcp -m state --state ESTABLISHED -j ACCEPT
    -A OUTPUT -p tcp -m state --state NEW,ESTABLISHED -j ACCEPT
    -A INPUT -p udp -m state --state ESTABLISHED -j ACCEPT
    -A OUTPUT -p udp -m state --state NEW,ESTABLISHED -j ACCEPT
    #local stuff here
    ...
    #permit ICMP
    -A INPUT -p icmp -j ACCEPT
    -A OUTPUT -p icmp -j ACCEPT
    -A INPUT -i eth0 -j DROP
    COMMIT

    I don’t want to change /etc/iptables.rules for each machine at this point. They all vary slightly. (One day the machines will be classified by roles, but we’re in an intermediate stage right now.) Instead, I want to have the list of management addresses in a separate file. I want to copy the new file to the server, run a command, and have the new list of management addresses be live.

    ipsets seems to be the way to do this. Let’s find out.

    On my crashbox, I’ll create an ipset. I’m using an ipset of type nethash, because it takes CIDR blocks rather than individual IP addresses. The ipset is called mgmt, just like the management addresses on my BSD machines.

    # ipset create mgmt nethash

    It returns silently. Did it create the ipset?

    # ipset list
    Name: mgmt
    Type: hash:net
    Header: family inet hashsize 1024 maxelem 65536
    Size in memory: 16760
    References: 0
    Members:

    OK, it’s in memory. Now add some addresses.

    # ipset add mgmt 192.0.2.0/24
    # ipset add mgmt 198.51.100.128/25

    Are those addresses really in the set? Let’s ask again.

    # ipset list mgmt
    Name: mgmt
    ...
    Members:
    192.0.2.0/24
    198.51.100.128/25

    Now, export this to a file.

    # ipset save mgmt > iptables.mgmt.conf

    I use the file iptables.mgmt.conf to mirror pf.mgmt.conf. That file should contain something like this:

    create mgmt hash:net family inet hashsize 1024 maxelem 65536
    add mgmt 192.0.2.0/24
    add mgmt 198.51.100.128/25

    Can I restore the ipset from the file? Destroy the set.

    # ipset destroy mgmt
    # ipset list

    It’s gone. Now to restore it from the file.

    # ipset restore < iptables.mgmt.conf
    # ipset list
    ...

    All my rules are there.

    Now, let’s teach iptables how to use an ipset. Rather than defining addresses, we use the -m set option.

    # iptables -A INPUT -i eth0 -m set --match-set mgmt src -j ACCEPT

    In the iptables.rules file, it would look like this.

    *filter
    #allow mgmt IPs
    -A INPUT -i eth0 -m set --match-set mgmt src -j ACCEPT
    ...

    When you have several management networks, this is certainly much shorter and easier to read.

    When you update the iptables.mgmt.conf file, read it in with ipset restore. You must use the -! flag. This tells ipset to ignore that the ipset already exists, and restore the contents of the ipset from the file.

    # ipset restore -! < iptables.mgmt.conf

    I can now copy this file to my hosts, run a command, and the packet filter rules are updated, without touching my main rules file.
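
    The update loop mirrors the PF workflow above; again, the hostnames are placeholders.

    # for h in web1 web2; do scp iptables.mgmt.conf $h:/etc/ && ssh $h 'ipset restore -! < /etc/iptables.mgmt.conf'; done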

    I don’t recall anyone using a symbol as a command-line flag like this before, but I actually kind of like this one. “I said DO IT, damn you!”

Easy Security Project: standalone ssh-ldap-helper

    I’ve been waiting for quite a while for an official way to centrally manage user authentication keys in OpenSSH. If you have a dozen servers, copying authorized_keys files around is a pain. If you have more than that, it’s really really painful. The OpenSSH guys have had good reasons for not wanting to link LDAP libraries straight into OpenSSH. They also gave some general guidance of what they’d want to see in a patch that supported LDAP authentication.

    Jan Chadima from Red Hat took OpenSSH up on this, wrote a patch to that spec, and submitted it to OpenSSH. And Damien Miller committed it. LDAP support for OpenSSH will be in 6.2…

    …sort of.

    The patch adds support for getting a user’s authorized_keys file from a helper program. Red Hat includes a helper program, ssh-ldap-helper. That program is not in the OpenSSH patch. And, truthfully, there’s no reason it should be in the main OpenSSH distribution. We’ll see helpers for LDAP, for database lookups, for FUSE and HTTP and whatever weird data storage people come up with. I don’t want the OpenSSH guys spending their time writing these helpers.
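
    For reference, the committed interface is a pair of sshd_config options; sshd runs the command as the named user and reads authorized keys from its standard output. The helper path below is an assumption; use whatever path your packaging provides.

    AuthorizedKeysCommand /usr/libexec/openssh/ssh-ldap-wrapper
    AuthorizedKeysCommandUser nobody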

    But the source code for ssh-ldap-helper is in the Red Hat source RPM. As far as I can tell, it’s under a BSD license.

    If you’re looking for a way to contribute to the OpenSSH user community, however, digging into the RPM (it’s just a tarfile), extracting the included OpenSSH code, and adding the patch for ssh-ldap-helper, ssh-ldap-wrapper, and the man page is pretty easy. I got that far, after all! I imagine that someone with a little bit of knowledge could make it compile on xBSD. Or at least, it’s a place to start.

    You’d make my life a lot easier. And give me more time to finish the new edition of Absolute OpenBSD. That’s what you lot want me to do with my time, isn’t it? (I’ll have a post on that status in a few days.)

    I also have to give props to Red Hat on this. They had a need in OpenSSH. They were given the requirements for that need to be met in mainline OpenSSH. And they met those needs and submitted the patch. Everyone cooperated, everyone gets what they need. That is how open source should work. Given how some other open source companies and projects are behaving lately, this makes me feel pretty good about the BSD community.

SolusVM KVM offline migration with shared storage

    I’m building a new virtualization cloud with SolusVM, KVM, and a bit of Xen (to make use of older hardware). Each machine has its own hard disk, but it only holds the local operating system. All virtual machines reside on cheap iSCSI storage, so I can easily migrate VMs from one compute node to another. The goal being, of course, to separate service failures from hardware failures. (I still have to deal with possible storage failures, of course, but hot-swap hard drive arrays reduce my risk somewhat.)

    SolusVM provides a nice front end to the whole Linux virtualization tangle. It does exactly what it claims, and at a reasonable price. I’m happy to pay someone a couple bucks a year per physical server to give me a non-sucky cloud front end that Just Works. One feature that it lacks is live migration for KVM and Xen hosts. Live VM failover is nice, but not essential for my purposes. As part of our Redundant Array of Inexpensive Crap strategy, I cluster VMs as well as physical servers: multiple mail servers, multiple DNS servers, and so on.

    While there’s documentation on how to cold-migrate Xen VMs, there’s no documentation on how to migrate a KVM VM from one node to another, let alone how to do it with shared storage. But the forum says that the Xen method should work with KVM. Let’s try it and see what happens!

    The Xen page talks about replicating the LVM container on the new node. With shared storage, you can skip this step; I defined my SolusVM groups based on the iSCSI device they’re attached to. I imagine the same migration process would work with unshared storage, if you duplicated the disk data first.
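
    For unshared storage, that duplication might be as crude as piping the logical volume across the network while the VM is shut down; the volume path here is hypothetical.

    # dd if=/dev/vg0/kvm2_img bs=1M | ssh node4 'dd of=/dev/vg0/kvm2_img bs=1M'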

    Go into the SolusVM GUI and note the VM number and the node number. For my test, I want to move VM 2 onto node 4. Log onto the master server, become root, and run:

    # cd /scripts
    # ./vm-migrate 2 4
    Virtual server information updated!
    #

    I then tried to start the VM via the GUI, and it wouldn’t boot. I logged onto the compute node to find out why. Any time I have a virtualization problem involving multiple pieces of hardware, I check /var/log/libvirt/libvirtd.log. Starting the virtual machine generated this log message:

    14:36:13.417: 1443: error : qemuMonitorOpenUnix:290 : failed to connect to monitor socket: No such process
    14:36:13.417: 1443: error : qemuProcessWaitForMonitor:1289 : internal error process exited while connecting to monitor: inet_listen_opts: bind(ipv4,0.0.0.0,5901): Address already in use
    inet_listen_opts: FAILED

    The KVM instance could not use port 5901, because something else was using it. KVM uses VNC to offer console access, and attaches to a port above 5900. Machine number one’s console is on VNC on port 5901, machine number two on port 5902, and so on.
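
    You can confirm what’s squatting on the port from the compute node with something like:

    # netstat -tlnp | grep 5901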

    The vm-migrate script didn’t change the console port. I went into the VM entry, changed the port by hand, and brought up the machine without trouble. Annoying, but not insurmountable.

    Hopefully this helps the next sysadmin searching for this topic.