Building Mastodon Bots is Stupid Easy

I just updated the footnote fortune file for Patronizers. Yes, my Patronizers get a Unix fortune file containing all the footnotes from my nonfiction books. I thought it was daft, but apparently a few readers actually use the dang thing. My exhausted brain wondered, “How hard would it be to build a Mastodon bot that posted one of these every few hours?” Turns out: not hard at all.

First, install toot (https://toot.bezdomni.net/). FreeBSD packages it as py311-toot.

Then register an account for your bot, using the regular Mastodon web interface. I registered @quotebot@io.mwl.io. (Yes, I have my own fedi instance. My main account is @mwl@io.mwl.io. No, you can’t have an account on it.)

$ toot login
Enter instance URL [https://mastodon.social]: https://io.mwl.io
This authentication method requires you to log into your Mastodon instance in
your browser, where you will be asked to authorize toot to access your
account. When you do, you will be given an authorization code which you need
to paste here.

Login URL:
https://io.mwl.io/oauth/authorize/?response_type=code&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=read+write+follow&client_id=FPzkCcqnGBLNO5Vo4V95CvfilcyRlMIrOSN1ncgxZmI
Open link in default browser? [Y/n]: n

This server’s default browser is Lynx. For whatever reason it can’t display the entire authorization code. Lynx is used for low-vision accessibility testing, so I suspect that the masto interface has an accessibility problem. I copied the link, opened it in my desktop’s Firefox, and copied the authorization code.


Authorization code: i6OsrQq77knbO4Gq.....

✓ Successfully logged in.

I can now toot from the command line.

$ toot post "test from toot cli"
Toot posted: https://io.mwl.io/@quotebot/113249574738108310

Go look in the web interface, and you’ll see the post. Easy enough.

Posting from a program is easy enough.

$ quote-source | toot post

Now I need a quote source. I could use something database-driven but I happen to have the mwlfortune file handy, so I’ll stick it in a mwlquotes directory. I’d like more than the footnotes so here’s a sample of another quotes file. Each quote is plain text, separated by a percent sign. I won’t be methodically adding to this, but if I’m digging through something old and see a suitable line I’ll add it.


Someone had brought cake. Someone was a bastard.
%
The only universal configuration language is despair.

Now build the fortune data files.

$ strfile -c '%' mwlfortune
"mwlfortune.dat" created
There were 582 strings
Longest string: 421 bytes
Shortest string: 6 bytes
$ strfile -c '%' bodyquotes
"bodyquotes.dat" created
There were 2 strings
Longest string: 54 bytes
Shortest string: 49 bytes

If you give the directory as an argument to fortune(1), it will pick a fortune at random from the combined files.

$ fortune /home/mwl/mwlquotes
Yes, that's megabytes--you know, the unit below gigabytes. Yes,
megabytes can apply to disks.

Try it a couple more times and you’ll see we get random quotes.

Dumping this into our bot is pretty simple.

$ fortune mwlquotes/ | toot post

Initial tests show a problem, though. Fortunes respect terminal standards, and include mid-sentence newlines. Fediverse posts do not. We need to get rid of the newlines. I wound up with this bot script.

#!/bin/sh

fortune /usr/local/share/mwlquotes/ | tr '\n' ' ' | toot post

Why put this in a script? So I can edit it easily later.

Now put this in my personal cron. Most folks said posting every six hours would be reasonable, so that’s where I’m starting.

13 */6 * * * /usr/local/scripts/quotebot.sh

That’s it. Every six hours, at thirteen minutes past the hour, the bot followers get a random quote from one of my books. Took about two hours to fully implement, including writing this post.

Moving Virtual Machines to Jails

I recently learned that I could rent a dedicated machine from bloom.host for less than I’ve been paying for my virtual machines. Time to move some VMs to jails! Here’s the notes I’ve left for myself. All of my VMs run ZFS.

First, clean up unneeded boot environments, remove any unnecessary crap that lingered on the VM, apply all security updates, and in general tidy up the source VM.

Then decide how you want to flip services over. The cleanest way is to shut down all services and start the migration, but you might need to guarantee uptime. It’s up to you. I chose to leave services running during an initial replication, shut down services, do an final snapshot with an incremental replication, start the new jail, and change DNS to the new addresses. Figure out your own uptime requirements.

Start by creating a recursive snapshot of the system.

# zfs snapshot -r zroot@bloom

At a convenient time, I’d go to destination host and pull the snapshots over. The snapshots need to go into a directory on the zroot/jails dataset, named after the VM the jail will replace.

$ ssh mwlucas@www.mwl.io zfs send -Rc zroot@bloom | zfs recv -v -o mountpoint=/www zroot/jails/www

This might take a while, so follow up with an incremental right before you want the actual the migration.

$ ssh mwlucas@www.mwl.io zfs send -Rci zroot@bloom2 zroot@bloom3 | zfs recv -v -o mountpoint=/jails/mail zroot/jails/www

if you’ve tampered with new datasets between copies, you’ll get an error.

receiving incremental stream of www/ROOT@bloom3 into zroot/jails/www/ROOT@bloom3
cannot receive incremental stream: destination zroot/jails/www/ROOT has been modified
since most recent snapshot
warning: cannot send 'www/ROOT/default@bloom3': signal received
Broken pipe

Roll back the problem dataset.

# zfs rollback zroot/jails/mail/ROOT@bloom2

Data’s moved over, but there’s trouble.

$ zfs list
...
zroot/www 39.6G 776G 132K /www
zroot/www/ROOT 22.5G 776G 132K /www/ROOT
zroot/www/ROOT/default 22.5G 776G 21.8G /www/ROOT/default
zroot/www/usr 10.9G 776G 132K /www/usr
zroot/www/usr/home 9.37G 776G 384K /www/usr/home
zroot/www/usr/home/acme 7.10M 776G 7.10M /www/usr/home/acme ...

The jail boots from the boot environment /www/ROOT/default, but the jail’s root dataset is /zroot/www. It’s empty. Shuffling datasets and rearranging inheritance is a pain. I just duplicated the contents

# zfs mount zroot/jails/mail/ROOT/default

$ tar cfC - /jails/www/ROOT/default/ . | tar xvpfC - /jails/www/

# zfs list zroot/www
NAME USED AVAIL REFER MOUNTPOINT
zroot/www 41.4G 774G 132K /www
zroot/www/ROOT 22.5G 774G 132K /www/ROOT
zroot/www/usr 10.9G 774G 132K /www/usr
zroot/www/var 7.96G 774G 132K /www/var

Go into the jail’s root directory. Edit /etc/sysctl.conf to remove non-jail settings. You can also edit rc.conf for the new network interface and the new IP.

I’m using VNET, because otherwise I must configure on-system daemons to avoid binding to localhost. (Remember, in a non-VNET jail localhost is aliased to the public IP!) That means I need a bridge interface. This host has one live Ethernet, igb0 so I make it a bridge.

autobridge_interfaces="bridge0"
autobridge_bridge0="igb*"
cloned_interfaces="bridge0"
ifconfig_igb0="UP"

I then add a public IP to the bridge, for the host’s use.

Now for jail.conf for a VNET install. I need to allow devfs for running named(8) on some of the VMs, and I want raw sockets.

path = "/jails/$name";
mount.devfs;
devfs_ruleset=5;
exec.clean;
allow.mount.devfs=1;
allow.raw_sockets=1;

exec.consolelog="/jails/$name/var/log/console.log";

vnet;
exec.prestart += "/sbin/ifconfig epair${jid} create up";
exec.prestart += "/sbin/ifconfig epair${jid}a descr 'vnet-${name}'";
exec.prestart += "/sbin/ifconfig bridge0 addm epair${jid}a up";
vnet.interface="epair${jid}b";

exec.start = "sh /etc/rc";

exec.created="logger jail $name has started";

exec.stop = "sh /etc/rc.shutdown";
exec.poststop += "ifconfig epair${jid}a destroy";
exec.poststop +="logger jail $name has stopped";

.include "/etc/jail.conf.d/*.conf";

This reduces individual jail.conf entries to this.


www {
jid = 80 ;
}

At this point, I could start the jail and see what broke. Some common errors included /tmp losing the sticky bit and MariaDB directories being owned by root rather than mysql.

Change the DNS, and watch traffic shift to the new host.

Am I confident in this process? No. That’s why I make sure I have a last backup in Tarsnap, and wait 30 days to delete the source VM.

Vultr Just Betrayed Us

(followup at https://mwl.io/archives/23504)

I suppose the hip kids would say this is enshittification, but it’s certainly a betrayal.

According to their new Terms of Service:

You hereby grant to Vultr a non-exclusive, perpetual, irrevocable, royalty-free, fully paid-up, worldwide license (including the right to sublicense through multiple tiers) to use, reproduce, process, adapt, publicly perform, publicly display, modify, prepare derivative works, publish, transmit and distribute each of your User Content, or any portion thereof, in any form, medium or distribution method now known or hereafter existing, known or developed, and otherwise use and commercialize the User Content in any way that Vultr deems appropriate, without any further consent, notice and/or compensation to you or to any third parties, for purposes of providing the Services to you.

This is unacceptable. No other hosting company does this.

The TLDR on the side is deceitful. They say that we own our stuff. Fine. The fine print declares that we license our stuff to them, for full exploitation, without compensation.

I have not agreed to those ToS. Fortunately, I can migrate off their systems without console access, so I do not have to agree. If you use vultr, I suggest you do the same. Also contact them through their contact page and state your refusal. I’m told you can cancel via their page without logging in, but haven’t yet tried it.

I am now investigating alternatives for hosting FreeBSD and OpenBSD systems. Preferably that take custom installs.

Mail Software Projects for You

Working through the tail of Run Your Own Mail Server has led me to a couple things I’d like to see. Maybe some reader would like to hack on one of them.

1) The best way to generate a list of hosts that should bypass Postfix’s intrusive protocol checks, or anything that resembling greylisting, is the postwhite. Postwhite has been abandoned for years, though. This isn’t exactly a problem, as it’s feature-complete and does the job. The configuration is clunky, though. It supports a long-obsolete list of Yahoo mailer addresses. The list of domains it generates lists for is hard-coded in the script, and artificially broken up into categories like “legit bulk mailers,” “social media,” and so on. You should not have to edit the script to remove a domain, because who accepts mail from LinkedIn these days? You shouldn’t have to edit the script for anything. The last edit to this was six years ago, so I suspect it’s basically abandoned.

Moving the domains to an external file and dropping the defunct Yahoo page would be good. If you have to fork it, using a meaningful name like “greyskip” or somesuch would be nice.

2) Postfix on FreeBSD supports blacklistd. That’s grand. Log parsers are inherently fragile, and libblacklist is the smart way for an application to declare that an IP address is misbehaving. The Postfix support only applies to authentication attempts on smtpd, however. I’m in favor of that, but I’d also like to see postscreen grow libblacklistd support. A host on a trusted DNSBL pokes our mail port? Block it.

I could do #1, but I lack the time and refuse to recommend my fault-oblivious code for production. I lack both skills and time for #2.

The truth is, we’ve limped along like this for years. We could limp for many more years. But hey, someone out there might want to make the world suck slighly less.

Proof You Should Not Run My Code: my SNMP agent

I’ve included bits of code in my books, sure. Always with warnings to not run it in production, as I am a firm devotee of fault-oblivious computing. You should not follow my example. But after a Fediverse (Mastodon) discussion last night, I’ve decided to share the code of a program I wrote and deployed. In production. When writing SNMP Mastery, I needed to understand how to integrate a custom agent into net-snmp. I also needed to go through the process of getting my own enterprise OID. I submitted the OID request right before Christmas 2019, and 55030 was assigned the next day.

So I promptly wrote my own SNMP agent, use top-notch state-of-the-art Perl 4. There’s some compatibility glue to make it run under Perl 5, but it’s basically Perl 4. Yes, there’s other languages–but Perl is eternal and timeless. Like Cobol and SNMP, that is not dead which can eternal sleeping lie.

This agent is the single source of truth for my published bibliography. Instructions for accessing it are in the SNMP book, but if I’m sharing the code I should provide context.

Browse to https://cdn.mwl.io/snmp/ and you’ll find the MIB file TWP.mib. Put that in your SNMP browser or MIB directory. If you’re running net-snmp you can pull the table with:

$ snmptable -v2c -c megadweeb snmp.mwl.io mwlBooksTable

The file agent-TWP.pl includes the agent proper.

This code has been called “comically evil,” which warms my bitter heart. Yes, I could use a database. But why? The data changes 3-4 times a year, if I’m productive. And yes, the data is in columns, not rows. SNMP doesn’t do rows. It doesn’t really do tables. It only has columns, which you could choose to arrange side-by-side, but that’s a feeble human thing and irrelevant to this primordial protocol.

In writing this I had to choose between complex code and simple data, or simple code and complex data. Given that updates consist of adding an entry to the end of each column, I chose simple code. Yes, there’s an occasional painful update where I realize that I missed one of my old books, but those are increasingly rare.

Anyway. If you want a truly complete checklist of what I’ve written, here it is. Other writers have spreadsheets or text documents or perhaps, if they’re truly prolific, desktop databases. But noooo, I had to do this.

TLDR: do not run my code. That’s one reason I don’t use github; this is not a sample or example, it an inspiration for you to recoil in horror and do better. (I also insist on controlling my platforms, and I don’t control github.) I’m certain this will be presented as an exhibit in my inevitable, eventual sanity hearing.

ChatGPT versus Facts

A friend asked ChatGPT about me. It provided something that looks like an answer, but is not an answer.

This illustrates how these large language models produce things that LOOK like answers, but are not actual answers. I’m a public figure with a fair amount of information in public, but it can’t get the details correct.

I thought of going through this and highlighting everything incorrect, but I have no interest in helping train ChatGPT or in making more details of my life available. Every paragraph has multiple glaring inaccuracies.

Michael W Lucas is a technical author and FreeBSD advocate who has written many books and articles on computer systems administration, networking, and information security. He was born on June 6, 1968, in Detroit, Michigan, and currently resides in Detroit, Michigan.

Lucas is best known for his contributions to the FreeBSD operating system community, including his books “Absolute FreeBSD”, “FreeBSD Mastery: Storage Essentials”, “FreeBSD Mastery: Advanced ZFS”, and “FreeBSD Mastery: Jails”. He has also written extensively on other technical topics, including network security, system administration, and SSH.

In addition to his writing, Lucas has been a speaker at technical conferences such as BSDCan, EuroBSDCon, and BSDTW. He is also a member of the FreeBSD Foundation and the FreeBSD Security Team.

Lucas has been involved with various open-source projects over the years, including OpenSSH, OpenNTPD, and OpenBGPD. He has also been a regular contributor to the BSD Now podcast, where he discusses news and developments in the BSD community.

Overall, Michael W Lucas is a respected figure in the technical writing and open-source communities, known for his expertise in FreeBSD and his clear and engaging writing style.

Yes, that last paragraph is wrong. I’m the biggest fish in a minuscule pond. Most technology writers, and a large part of the open source community, consider my corner irrelevant. ChatGPT is being kind, and facts are neither kind nor cruel.

If you based decisions on this, you would go very wrong. If you use this for your business? That’s your problem.

Also, I’m told that it has declared Michael Warren Lucas dead. ChatGPT obviously wants me to spend more time writing books that can be used to improve it.

Fediverse Servers, plus mac_portacl on FreeBSD

One of my business mantras is “control your platform.” If you build your business around a site like Facebook, they can de-prioritize you and disappear you. Twitter’s implosion served as a fierce reminder of that, so I’m blogging more here.

Before Twitter’s implosion, the Fediverse (Mastodon, PixelFed, and all the other ActivityPub-powered systems) drove just as much traffic to my site as Twitter. Other social networking sites are negligible. If I want to follow my business mantra, I must run my own Fediverse server. I tested three options: Mastodon, pleroma, and GoToSocial.

Mastodon is huge, clunky, and handles like a tank made out of chicken wire, tar, and lobsters. I spoke with a few Mastodon operators, and none of them recommended it.

Pleroma? I followed the instructions. They didn’t work. I went looking into support, but I discovered that Pleroma seems to be the server of choice for TERFs, racists, and related jerks. Their recommended servers for new users are all on my personal blocklist. I don’t care to help those folks debug their instructions.

GoToSocial was a joy. Except it’s not only in development, it’s in alpha. They are very clear about this. The features that exist are beautifully done, but certain features I find critical are incomplete.

I have decided to wait to deploy a production fediverse server until GoToSocial enters beta.

For incomplete software, though, GoToSocial is surprisingly complete. It has its own web server and Let’s Encrypt implementation. If it can bind to ports 80 and 443, you don’t need a web server or ACME agent. The catch is, gotosocial(8) runs as an unprivileged user. It can’t bind to privileged ports.

Enter mac_portacl(4).

In the BSD tradition, the man page details everything you can do with this Mandatory Access Control kernel module, but in short it lets you permit particular users or group to bind to privileged network ports. I don’t care for mac_portacl in production, as the rules are hard to read when you’re debugging. If you want me to use an access control program, the output better be no harder to read than pfctl -sr. But here’s how you do it.

Enable the module in /boot/loader.conf.

mac_portacl_load="YES"

You can now write port ACL rules. Each rule has four parts:

uid or group : numerical identifier : tcp or udp : port number

The gotosocial user has uid 209. I want uid 209 to be able to bind to TCP ports 80 and 443, so I need these rules.

uid:209:tcp:80
uid:209:tcp:443

Set the access control rules in /etc/sysctl.conf.

net.inet.ip.portrange.reservedhigh=0
security.mac.portacl.rules=uid:209:tcp:443,uid:209:tcp:80

The first sysctl disables the traditional “reserved port” behavior and allows unprivileged programs to bind to ports below 1024.

The second sysctl installs our rules in the kernel. When you write to this sysctl you must include all rules you want active, separated by commas.

Would I use this in production? If the software has a solid security track record and is designed to be directly exposed to the Internet, sure. If you’re running a web server, some program has to listen on port 80. GoToSocial is brand new, though, and I’d like to see a bit of a track record before I completely trusted it.

When GoToSocial enters beta next year and I deploy it for real, I’ll put an nginx or httpd in front of it so I can filter when needed.

Are there other options other than Mastodon, pleroma, and GoToSocial? Sure. But I’m out of time, and really need to make some words this week.

Why Mastodon/the Fediverse kind of sucks right now

I’m a big fan of the fediverse. As of right now (8 November 2022), it deeply sucks. Why?

Because nobody expected Elon Musk to be this stupid.

We expected some daftness, sure. But actions like cutting the entire human rights team, accessibility team, and AI ethics team, plus limiting moderation, have people abandoning Twitter and searching for alternatives.

Nobody wants to live in a free-for-all wasteland. “The right to free speech” is built on “the right to take the consequences.” Without moderators, Twitter is a cesspit.

The Fediverse resembles Twitter[1], except it is run by volunteers on donated equipment. Every time Twitter did something stupid, we got a few thousand folks looking for a better way. We’ve grown steadily as a result.

Almost hourly Musk demonstrates that he doesn’t understand people, doesn’t understand how Twitter is used, and picking stupid fights. I’m told that last Friday, the biggest Mastodon server got 70,000 new users. If you add in all the hundreds of other servers, we’re looking at hundreds of thousands of new accounts. Many servers doubled or tripled in usage.

Here’s a graph of the number of users interacting with our server.

Nobody expected Musk to be this stupid.

Nobody expected this flood of new users.

If you get an account and find it’s slow? The volunteers are working as hard as they can. Scotty is shouting “She canna take any more!” over the roar of the struggling servers. New servers are being installed, but physical equipment must be shipped and mounted and plugged in.

The servers that are doing well, ironically, are the alt-right ones. The worst Nazis already fled Twitter, so they set up their own Mastodon servers. The rest of the fedi automatically blocks those monsters, but they’re actively recruiting both abusers and victims. I’ve seen more than one LGBT person innocently sign up for a disguised white supremacist instance and get a torrent of abuse.

Be patient with the volunteers. They’re doing the best they can. We’ll catch up as soon as we can.

The truth is, nobody can prepare for a stupid billionaire.

[1] No, the Fediverse isn’t exactly like twitter. Each server is a community of interest, like “BSD Unix folks” or “book lovers” or “LGBT in tech.” They can all talk to each other. We have content warnings, so that people can interact with difficult content as they wish rather than having it jammed into their face. Each server does its own moderation. (My server blocks the alt-right, TERFs, racists, ableist jerks, and cryptocurrency scammers.) Where Twitter has been increasingly negative and stressful over the last few years, local control means the Fediverse is downright sweet.