Storage is a pain. I can spend lots of money to solve this problem, or I can find less expensive alternatives. I’ve been using diskless servers lately, all served off of a big OpenSolaris machine. (Why OpenSolaris? It has a newer ZFS than FreeBSD.) Performance is mediocre on NFSv2/3, and I want faster. One obvious thing to try is iSCSI.
iSCSI requires targets (servers) and initiators (clients). For this test I’m using OpenSolaris as the target and FreeBSD as the initiator. For testing, both machines are running on ESXi, on older hardware that’s been removed from production.
ZFS's built-in iSCSI sharing (the shareiscsi property) uses OpenSolaris's older iSCSI framework. COMSTAR has all kinds of advantages, so install COMSTAR instead:
target# pkg install storage-server SUNWiscsit
Reboot. Then make sure that you’ve disabled the older iSCSI service, and enable the new COMSTAR STMF service.
target# svcadm disable iscsitgt
target# svcadm enable stmf
target# svcs -a | grep -i stmf
online Jun_01 svc:/system/stmf:default
Now that we have the software, create a ZFS volume (zvol) to share over iSCSI. Here I create a 10GB zvol, rpool/iscsitest.
target# zfs create -V 10g rpool/iscsitest
Then create a SCSI logical unit (LU) backed by the zvol.
target# sbdadm create-lu /dev/zvol/rdsk/rpool/iscsitest
Created the following LU:
GUID                              DATA SIZE      SOURCE
--------------------------------  -------------  ------------------------------
600144f096b2c40000004c10e52c0001  10737352704    /dev/zvol/rdsk/rpool/iscsitest
Does STMF see this device?
target# stmfadm list-lu
LU Name: 600144F096B2C40000004C10E52C0001
target# stmfadm list-lu -v
LU Name: 600144F096B2C40000004C10E52C0001
Operational Status: Online
Provider Name : sbd
Alias : /dev/zvol/rdsk/rpool/iscsitest
View Entry Count : 0
Yep, it’s in there.
Now we need access control, so we can restrict iSCSI disks to particular hosts. Create the host group "testgroup."
target# stmfadm create-hg testgroup
To add a host to this host group, I need to know how the client initiator will identify itself. I've set up a FreeBSD amd64 -current initiator with an iSCSI-enabled kernel. Use iscontrol(8) to manage the iSCSI initiator. You can find a discussion of iSCSI on FreeBSD here.
initiator# iscontrol -vt target
I-: cmd=0x3 len=398
SessionType=Normal
InitiatorName=iqn.2005-01.il.ac.huji.cs::initiator.example.com
TargetName=(null)
MaxBurstLength=131072
HeaderDigest=None,CRC32C
DataDigest=None,CRC32C
MaxRecvDataSegmentLength=65536
ErrorRecoveryLevel=0
DefaultTime2Wait=0
DefaultTime2Retain=0
DataPDUInOrder=Yes
DataSequenceInOrder=Yes
MaxOutstandingR2T=1
MaxConnections=1
FirstBurstLength=65536
InitialR2T=Yes
ImmediateData=Yes
T-: cmd=0x23 len=0
0x0203: Not found
The key line is InitiatorName. This is how the initiator identifies itself to the target.
target# stmfadm add-hg-member -g testgroup iqn.2005-01.il.ac.huji.cs::initiator.example.com
target# stmfadm list-hg -v testgroup
Host Group: testgroup
Member: iqn.2005-01.il.ac.huji.cs::initiator.example.com
Now add a view, so that our host group can see this disk.
target# stmfadm add-view -h testgroup 600144F096B2C40000004C10E52C0001
target# stmfadm list-view -l 600144F096B2C40000004C10E52C0001
View Entry: 0
Host group : testgroup
Target group : All
LUN : 0
Check from our client:
initiator# iscontrol -d targetaddress=target
TargetName=iqn.1986-03.com.sun:02:210bcdba-c89c-6f81-f584-a0de390ae632
TargetAddress=192.0.2.1,1
The client can see the server’s iSCSI disk. Huzzah! Make a config file, /usr/local/etc/iscsi.conf:
testdisk {
	TargetName = iqn.1986-03.com.sun:02:210bcdba-c89c-6f81-f584-a0de390ae632
	TargetAddress = target:3260,1 # target hostname, port, and portal group tag
}
Now attach the disk:
initiator# iscontrol -c /usr/local/etc/iscsi.conf -n testdisk
iscontrol[2841]: running
iscontrol[2841]: (pass1:iscsi0:0:0:0): tagged openings now 0
iscontrol[2841]: cam_open_btl: no passthrough device found at 1:0:1
iscontrol[2841]: cam_open_btl: no passthrough device found at 1:0:2
iscontrol[2841]: cam_open_btl: no passthrough device found at 1:0:3
iscontrol: supervise starting main loop
da1 at iscsi0 bus 0 scbus1 target 0 lun 0
da1: <SUN COMSTAR 1.0> Fixed Direct Access SCSI-5 device
da1: 10239MB (20971392 512 byte sectors: 255H 63S/T 1305C)
initiator# ls /dev/da*
/dev/da0 /dev/da0s1a /dev/da0s1d /dev/da0s1f
/dev/da0s1 /dev/da0s1b /dev/da0s1e /dev/da1
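As a quick sanity check, the sizes reported at each layer agree. Here's a small Python sketch using only the numbers printed above: the LU size from sbdadm matches the sector count FreeBSD's da(4) driver reports for da1.

```python
# Cross-check the sizes reported by sbdadm on the target and by
# FreeBSD's da(4) driver on the initiator.
lu_bytes = 10737352704   # DATA SIZE printed by "sbdadm create-lu"
sectors = 20971392       # sector count printed for da1
sector_size = 512        # bytes per sector, per the da1 probe line

assert sectors * sector_size == lu_bytes
print(lu_bytes // (1024 * 1024))  # → 10239, the "10239MB" da1 reports
```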
The disk shows up. Let's try to use it. Slice the disk:
initiator# fdisk -BI /dev/da1
******* Working on device /dev/da1 *******
fdisk: invalid fdisk partition table found
fdisk: Class not found
If a blank ZFS-backed iSCSI disk started off with a valid slice table, I’d be shocked. Now, add a label.
initiator# disklabel -w /dev/da1s1
We now have a valid label, and need to add a partition to the label.
initiator# disklabel -e /dev/da1s1
The default disklabel includes a single partition, a, that fills the entire disk. It’s of type “unused.” Change that to 4.2BSD, save and exit, and check your work.
initiator# disklabel /dev/da1s1
# /dev/da1s1:
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a: 20964746       16    4.2BSD        0     0     0
  c: 20964762        0    unused        0     0       # "raw" part, don't edit
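The label arithmetic checks out: c spans the whole slice, and a is the same slice minus the standard 16-sector label offset. A quick Python check with the numbers above:

```python
# The disklabel numbers: "c" covers the whole slice, and "a" is the
# slice minus the 16-sector offset reserved for the label.
c_size = 20964762   # sectors in the whole slice ("c" partition)
a_offset = 16       # "a" starts after the label area
a_size = 20964746   # sectors in "a"

assert a_offset + a_size == c_size
```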
Now create a file system, using -U to get soft updates:
initiator# newfs -U /dev/da1s1a
initiator# mount /dev/da1s1a /mnt/
But will this really work? Let's do some real work:
initiator# tar cfC - /usr/ports/ . | tar xpfC - /mnt
Yes, it works.
Performance? I'm running both servers in VMware ESXi. The physical hardware was used in production for a few years, then shifted to testing. All traffic goes to and from the same, older, disk. My FreeBSD client is amd64 -current, with all the default debugging options turned on. With all of this, I expected performance to be abominable. And my expectations were fulfilled: I ran the tar command under time(1), and got:
5.982u 41.868s 46:19.45 1.7% 71+4062k 176023+4213io 7pf+0w
Remember, a ports tree contains thousands of little files, and several large files in /usr/ports/distfiles, for a total size of 651MB. Real throughput was about 240kB/second, or very roughly 2Mbps.
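The throughput figure follows directly from the numbers above. A quick Python check:

```python
# 651MB copied in 46 minutes, 19.45 seconds of wall time (from time(1)).
total_kb = 651 * 1024
elapsed_s = 46 * 60 + 19.45

kb_per_s = total_kb / elapsed_s
print(round(kb_per_s))                 # → 240 (kB/second)
print(round(kb_per_s * 8 / 1000, 1))   # → 1.9 (megabits/second)
```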
I’ll be trying this on more current hardware, with the client and server on different hardware, and expect better results.