Well, “profit” is a strong word. Maybe “not losing money” would be a better description. Perhaps even “not screwing over readers.”
I back up my personal stuff with a combination of snapshots, tarballs, rsync, and sneakernet. This is fine for my email and my personal web site. Chances are, if all four of my backup sites are simultaneously destroyed, I won’t care.
A couple years ago I opened my own ecommerce site, so I could sell my self-published books directly to readers. For the record, I didn’t expect Tilted Windmill Press direct sales to actually, y’know, go anywhere. I didn’t expect that people would buy books directly from me when they could just go to their favorite ebookstore, hit a button, and have the book miraculously appear on their ereader.
I was wrong. People buy books from me. Once every month or two, someone even throws a few bucks in the tip jar, or flat-out overpays for their books. I am pleasantly surprised.
So: I was wrong about self-publishing, and now I was wrong about author direct sales. Pessimism is grand, because you’re either correct or you get a pleasant surprise. Thank you all.
But now I find myself in a position where I actually have commercially valuable data, and I need to back it up. Like a real business. I need offsite backups. I need them automated. And I need to be able to recover them, so that people who have bought my books can continue to download them, in the off chance that the Detroit area is firebombed off the Earth while I’m at BSDCan.
So it’s time for Tarsnap.
Why Tarsnap?
I also see the author regularly at BSD conferences. I can slap him in person if he does anything truly daft.
Tarsnap has a quick Getting Started page. We’ll do the easy things first. Sign up for a Tarsnap account. Once your account is active, put some money in it; $5 will suffice.
Now let’s check your prerequisites. You need:
BSD systems come with everything else you need.
Linux users must install
The Tarsnap download page lists specific packages for Debian-based and Red Hat-based Linuxes.
Go to the download page and get both the source code and the signed hash file. Tarsnap is only available as source code, so that you can verify the code integrity yourself. So let’s do that.
Start by using GnuPG to verify the integrity of the Tarsnap code. If you’re not familiar with GnuPG and OpenPGP, some daftie wrote a whole book on PGP & GPG. Once you install GnuPG, run the gpg command to get the configuration files.
# gpg
gpg: directory `/home/mwlucas/.gnupg' created
gpg: new configuration file `/home/mwlucas/.gnupg/gpg.conf' created
gpg: WARNING: options in `/home/mwlucas/.gnupg/gpg.conf' are not yet active during this run
gpg: keyring `/home/mwlucas/.gnupg/secring.gpg' created
gpg: keyring `/home/mwlucas/.gnupg/pubring.gpg' created
gpg: Go ahead and type your message ...
^C
gpg: signal Interrupt caught ... exiting
Hit ^C. I just wanted the configuration and key files.
Now edit $HOME/.gnupg/gpg.conf. Set the following options.
keyserver hkp://keys.gnupg.net
keyserver-options auto-key-retrieve
See if our GPG client can verify the signature file tarsnap-sigs-1.0.35.asc.
# gpg --decrypt tarsnap-sigs-1.0.35.asc
SHA256 (tarsnap-autoconf-1.0.35.tgz) = 6c9f6756bc43bc225b842f7e3a0ec7204e0cf606e10559d27704e1cc33098c9a
gpg: Signature made Sun Feb 16 23:20:35 2014 EST using RSA key ID E5979DF7
gpg: Good signature from "Tarsnap source code signing key (Colin Percival)
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 634B 377B 46EB 990B 58FF EB5A C8BF 43BA E597 9DF7
A few interesting things here. The most important line is the statement ‘Good signature from “Tarsnap source code signing key”.’ Your GPG program grabbed the source code signing key from a public key server and used it to verify that the signature file has not been tampered with.
As you’re new to OpenPGP, this is all you can do. You’re not attached to the Web of Trust, so you can’t verify the signature chain. (I do recommend that you get an OpenPGP key and collect a few signatures, so you can verify code signatures if nothing else.)
Now that we know the signature file is good, we can use the cryptographic hash in the file to validate that the tarsnap code we downloaded is what the Tarsnap author intended. Near the top of the signature file you’ll see the line:
SHA256 (tarsnap-autoconf-1.0.35.tgz) = 6c9f6756bc43bc225b842f7e3a0ec7204e0cf606e10559d27704e1cc33098c9a
Use the sha256(1) program (or sha256sum, or shasum -a 256, or whatever your particular Unix calls the SHA-256 checksum generator) to verify the source code’s integrity.
# sha256 tarsnap-autoconf-1.0.35.tgz
SHA256 (tarsnap-autoconf-1.0.35.tgz) = 6c9f6756bc43bc225b842f7e3a0ec7204e0cf606e10559d27704e1cc33098c9a
The checksum in the signature file and the checksum you compute match. You have valid source code, and can proceed.
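If you'd rather not compare 64 hex digits by eye, the comparison can be scripted. What follows is only a sketch: the function names are mine, not anything Tarsnap ships, and it assumes you feed it checksum text that gpg has already verified (for example, the saved output of the gpg command above), not the raw download.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of Tarsnap.

# Print the hash recorded for file name $2 in the checksum text file $1.
# Expects BSD-style lines: SHA256 (name) = hexdigest
expected_hash() {
    awk -v name="$2" '$1 == "SHA256" && $2 == "(" name ")" { print $4 }' "$1"
}

# Print the SHA-256 of file $1 using whichever tool this Unix provides.
actual_hash() {
    if command -v sha256 >/dev/null 2>&1; then
        sha256 -q "$1"
    elif command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d' ' -f1
    else
        shasum -a 256 "$1" | cut -d' ' -f1
    fi
}

# Succeed only if a hash was recorded for $2 and it matches the file on disk.
hashes_match() {
    want=$(expected_hash "$1" "$2")
    have=$(actual_hash "$2")
    [ -n "$want" ] && [ "$want" = "$have" ]
}

# Usage, assuming verified.txt holds gpg's verified output:
#   hashes_match verified.txt tarsnap-autoconf-1.0.35.tgz && echo "source is good"
```

Run it from the directory holding the tarball, since the file name doubles as the lookup key in the checksum text.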
Extract the source code.
# tar -xf tarsnap-autoconf-1.0.35.tgz
# cd tarsnap-autoconf-1.0.35
# ./configure
...
configure: creating ./config.status
config.status: creating Makefile
config.status: creating config.h
config.status: executing depfiles commands
#
If the configure script ends any way other than this, you’re on Linux and didn’t install the necessary development packages. The libraries alone won’t suffice; you must have the development versions.
If configure completed, run
# make all install clean
Tarsnap is now ready to use.
Start by creating a Tarsnap key for this machine and attaching it to your Tarsnap account. Here I create a key for my machine www.
# tarsnap-keygen --keyfile /root/tarsnap.key --user mwlucas@michaelwlucas.com --machine pestilence
Enter tarsnap account password:
#
I now have a tarsnap key file. /root/tarsnap.key looks like this:
# START OF TARSNAP KEY FILE
dGFyc25hcAAAAAAAAAAzY6MEAAAAAAEAALG8Ix2yYMu+TN6Pj7td2EhjYlGCGrRRknJQ8AeY
uJsctXIEfurQCOQN5eZFLi8HSCCLGHCMRpM40E6Jc6rJExcPLYkVQAJmd6auGKMWTb5j9gOr
SeCCEsUj3GzcTaDCLsg/O4dYjl6vb/he9bOkX6NbPomygOpBHqcMOUIBm2eyuOvJ1d9R+oVv
...
This machine is now registered and ready to go.
This key is important. If your machine is destroyed and you need access to your remote backup, you will need this key! Before you proceed, back it up somewhere other than the machine you’re backing up. There’s lots of advice out there on how to back up private keys. Follow it.
Now let’s store some backups in the cloud. I’m going to play with my /etc/ directory, because it’s less than 3MB. Start by backing up a single directory.
# tarsnap -c -f wwwetctest etc/
Directory /usr/local/tarsnap-cache created for "--cachedir /usr/local/tarsnap-cache"
                                       Total size  Compressed size
All archives                              1996713           382896
  (unique data)                           1946025           366495
This archive                              1996713           382896
New data                                  1946025           366495
Nothing seems to happen on the local system. Let’s check and be sure that there’s a backup out in the cloud:
# tarsnap --list-archives
wwwetctest
I then went into /etc and did some cleanup, removing files that shouldn’t have ever been there. This stuff grows in /etc on any long-lived system.
# tarsnap -c -f wwwetctest-20140716-1508 etc/
                                       Total size  Compressed size
All archives                              3986206           765446
  (unique data)                           2120798           403833
This archive                              1989493           382550
New data                                   174773            37338
# tarsnap --list-archives
wwwetctest
wwwetctest-20140716-1508
Note that the new data stored for this archive (37,338 bytes compressed) is much smaller than the archive itself (382,550 bytes compressed). Tarsnap only stored the differences between the two backups.
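Typing a timestamp into every archive name gets old fast, and I did say I wanted this automated. Here's a rough sketch of how the naming could be scripted. The function names and the TARSNAP override variable are my own inventions for illustration; only the tarsnap -c -f invocation comes from the commands above.

```shell
#!/bin/sh
# Sketch only: archive_name, backup_dir, and TARSNAP are hypothetical names.

# Build an archive name such as wwwetctest-20140716-1508 from the prefix $1.
archive_name() {
    printf '%s-%s\n' "$1" "$(date +%Y%m%d-%H%M)"
}

# Create a timestamped archive of directory $2 (relative to /) under prefix $1.
# Set TARSNAP=echo for a dry run that prints the arguments instead of
# contacting the Tarsnap service.
backup_dir() {
    name=$(archive_name "$1")
    (cd / && "${TARSNAP:-tarsnap}" -c -f "$name" "$2")
}

# Usage:
#   backup_dir wwwetctest etc/
```

Because the suffix is year-month-day-hour-minute, the archive names sort in creation order, which comes in handy once you start rotating them.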
If you want more detail about your listed backups, add -v to see the creation date. Add a second -v to see the command used to create the archive.
# tarsnap --list-archives -vv
wwwetctest 2014-07-16 15:02:41 tarsnap -c -f wwwetctest etc/
wwwetctest-20140716-1508 2014-07-16 15:09:38 tarsnap -c -f wwwetctest-20140716-1508 etc/
Let’s pretend that I need a copy of my backup. Here I extract the newest backup into /tmp/etc.
# cd /tmp
# tarsnap -x -f wwwetctest-20140716-1508
Just for my own amusement, I’ll extract the older backup as well and compare the contents.
# cd /tmp
# tarsnap -x -f wwwetctest
The files I removed during my cleanup are now present.
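If you want to see exactly which files differ, extract each archive into its own directory rather than on top of each other, then compare the trees. A minimal sketch, assuming two scratch directories of your choosing; the function name and the /tmp/old and /tmp/new paths are hypothetical.

```shell
#!/bin/sh
# Sketch only: only_in_first is a hypothetical helper, not a Tarsnap feature.

# List regular files that exist under directory $1 but not under directory $2.
# Paths are printed relative to $1. Pass $2 as an absolute path, or relative
# to the directory you run this from.
only_in_first() {
    ( cd "$1" && find . -type f ) | while IFS= read -r path; do
        [ -e "$2/$path" ] || printf '%s\n' "${path#./}"
    done | sort
}

# Usage:
#   mkdir /tmp/old /tmp/new
#   (cd /tmp/old && tarsnap -x -f wwwetctest)
#   (cd /tmp/new && tarsnap -x -f wwwetctest-20140716-1508)
#   only_in_first /tmp/old /tmp/new
```

Anything it prints is a file that was in the older backup and is gone from the newer one.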
What about rotating backups? I now have two backups. The second one is a differential backup against the first. If I blow away the first backup, what happens to the second one?
# tarsnap -d -f wwwetctest
                                       Total size  Compressed size
All archives                              1989493           382550
  (unique data)                           1938805           366149
This archive                              1996713           382896
Deleted data                               181993            37684
It doesn’t look like it deleted very much data: only the 37,684 compressed bytes unique to the first archive. And indeed, a check of the remaining archive shows that all my files are there.
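Since deleting an old archive doesn't damage the others, rotation comes down to picking which names to delete. Here's one possible sketch. It assumes archive names end in a sortable timestamp like wwwetctest-20140716-1508; the function name and the keep count are my own illustration, not anything built into Tarsnap.

```shell
#!/bin/sh
# Sketch only: archives_to_prune is a hypothetical helper.

# Read archive names on stdin, one per line. Print every name that starts
# with "$1-" except the newest $2 of them. Relies on the YYYYMMDD-HHMM
# suffix so that a plain text sort is also a date sort.
archives_to_prune() {
    awk -v p="$1-" 'index($0, p) == 1' | sort -r | tail -n +"$(( $2 + 1 ))"
}

# Usage, keeping the seven newest wwwetctest archives:
#   tarsnap --list-archives | archives_to_prune wwwetctest 7 |
#   while IFS= read -r name; do
#       tarsnap -d -f "$name"
#   done
```

Archives without the timestamp suffix, like my first wwwetctest, are left alone, so anything named by hand has to be deleted by hand.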
And now, the hard part: what do I need to back up? That’s a whole separate class of problem…
Michael, on OpenBSD there is a port of tarsnap. It simplifies installation, if one trusts the ports tree (which uses a SHA256 checksum for the source and is available via public AnonCVS servers that provide fingerprints).
http://ports.su/sysutils/tarsnap
Josh, Michael,
Tarsnap is also in ports and pkgs for FreeBSD…but I agree it’s a good practice to verify the signature anyway.
Hi Michael,
I noticed a small error on your site, maybe it’s intentional to save space.
The keygen command actually requires double dashes on all of the arguments:
usage: tarsnap-keygen --keyfile key-file --user user-name --machine machine-name
So yours would be this:
# tarsnap-keygen --keyfile /root/tarsnap.key --user mwlucas@michaelwlucas.com --machine pestilence
Best,
sean
Sean — aargh! WordPress smoothed away my nice double dashes. Thanks for pointing that out, I’ll figure out how to make WP show what I typed.