(Part of an intermittent continuing series on tech writing. People have urged me to write a book on how to write tech books for years. If I collect enough of these, I just might.)
I’m working on FreeBSD Mastery: Advanced ZFS, and am desperately hoping to have a first draft finished and out for tech review before my September writing workshop. And I’ve hit a situation I’ve hit many times before, but not in a way that will be so obvious to so many readers.
OpenZFS runs on many operating systems. FreeBSD is considered a Tier 1 OpenZFS platform, which is cool. But even so, not everything works quite the way you’d expect.
The current master document on tuning OpenZFS for better performance is Adam Leventhal’s article on the Delphix blog. The performance chapter needs to cover everything in that article and more, in a slower and more easily accessible manner.
But not everything in that blog post works on FreeBSD.
Some problems are straightforward to fix. The DTrace scripts are written for illumos, so some of them need minor tweaks to work on FreeBSD. I can ask around and get a knowledgeable kernel person to fix them (thank you, Ryan Stone!). Many parts need more explanation and context than Leventhal provides–which is great, because otherwise I’m out of a job.
But there’s the hard category: things that just don’t work on FreeBSD.
You can tune almost everything in ZFS, including how async writes perform. But the async tuning knobs are not useful without the speedometer that measures the impact of those changes on your hardware.
Performance tuning without measuring is not tuning: it’s voodoo. And not the voodoo religion that people sincerely practice, but the cheesy comic book thing that appears in any number of B movies when the filmmaker couldn’t be bothered to do any actual research.
The DTrace script that measures the effects of tuning async writes won’t work on FreeBSD. There’s a PR on this issue, but it’s sat there for three months now without anyone claiming it.
The Advanced ZFS book must address this. My problem isn’t so much the FreeBSD bug itself as the question “how should the book address this?”
I could say any of the following.
1. “FreeBSD is not as tunable as illumos.”
2. “If you want to tune async performance, you’ll need to apply the patch in bug 200316 and rebuild your kernel. Unless someone happens to merge this patch into the kernel after this book comes out.”
3. “Async performance? Who says you can tune async performance? I know nothing of this.”
Which is the correct path?
The first rule of writing a tech book is: serve your reader.
Or, as Mickey Spillane famously said, “I have no fans. You know what I got? Customers. And customers are your friends.” I must tell them the truth.
Number 3 is the easiest to write–just pretend it doesn’t exist. How many people actually need to tune async performance anyway? But the feature is in OpenZFS, and it’s fair for my readers to ask about it. Many readers will read my warnings about the feature, dismiss said warnings, play with async write tuning, and reluctantly concede that the warnings were correct.
Ignoring the matter is a disservice to my customers, so it’s out. As in so many things, the easiest thing to do is the wrong thing to do.
For most writers, in most situations, the correct answer is both #1 and #2. Admit the weakness up front. Don’t try to cover it up. State it flat-out for your readers:
“FreeBSD’s ZFS async writes are not as tunable as illumos’.”
Or, more accurately:
“Yes, you can tune FreeBSD’s ZFS async writes, but at the time I write this you cannot measure the effect of that tuning. So don’t do it. It won’t help most of you anyway. If you really must, look at this bug and study this blog post, if it’s still there.”
Yes, I’m a BSD guy. I’ve been a FreeBSD committer. I’m friends with a bunch of OpenBSD committers. I believe in the BSD philosophy–it comes straight out of my Boy Scout days. (Yes, I’m an Eagle Scout with the first palm, believe it or not!) I don’t want to diss any BSD, even in such a tiny matter.
But as a writer, it’s my job to speak the truth. (I’d argue this applies to all kinds of writing, not just tech writing, but it’s unquestionably 100% true for tech books.)
I could also take action before the book comes out, by using my influence as “the biggest BSD author” to whine at people until someone fixed it. Most authors don’t have that option, but I know these folks, and I could tag them all on Facebook until someone changed it. I could go to my fans and say “Fly, my flock! Fly to every FreeBSD developer you know, and throw this bug in their faces until they pay attention!”
But this method breaks my “don’t be a jerk” rule. (Yes, I have that rule. Shut up, Bob. And Warner! And–oh, fine, never mind.)
Another problem with this approach is that there might well be a very good reason why this patch isn’t merged. FreeBSD is not illumos. This apparently simple patch might do strange and terrible things to hosts in certain circumstances. Maybe it boosts latency or launches ICBMs at the nearest penguin sanctuary. I’m a writer, and totally unqualified to make this judgment.
Nevertheless, I’m going to try a little bit of the influence approach. Maybe I’ll write a blog post about the issue, hoping that someone of influence will see it. A few influential FreeBSD folks follow my blog. Perhaps someone with the necessary skills will take interest and either close the bug with an explanation or commit it.
(Me, passive-aggressive? Moi? Never!)
It can take a very long time before patches to ZFS get accepted.
Even if they work.
Just take a look at this one:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594
I’ve used this patch on a system that showed this problem (10.0) and it works fabulously.
It’s even mentioned (candidly) in one of the Quarterly Status Reports as having been discussed by Core.
Just from the discussion on this bug I learned that ZFS touches so many different places in the kernel and has so many different “knobs” (each of which again affects a lot of places in the kernel) that the shepherds of that part of the source are very, very careful when accepting patches.
You may also come across the reason why similar functionality has gone unimplemented for several years in the bug “[zfs] [patch] ZFS ‘sharenfs’ doesn’t allow different ‘exports’ options for different hosts.” illumos and ZoLinux have this, though implemented a bit differently.