OpenSolaris iSCSI ZFS COMSTAR target, FreeBSD initiator

Storage is a pain.  I can spend lots of money to solve this problem, or I can find less expensive alternatives.  I’ve been using diskless servers lately, all served off of a big OpenSolaris machine.  (Why OpenSolaris?  It has a newer ZFS than FreeBSD.)  Performance is mediocre on NFSv2/3, and I want faster.  One obvious thing to try is iSCSI.

iSCSI requires targets (servers) and initiators (clients).  For this test I’m using OpenSolaris as the target and FreeBSD as the initiator.  For testing, both machines are running on ESXi, on older hardware that’s been removed from production.

ZFS's built-in iSCSI sharing uses OpenSolaris' older iSCSI framework.  COMSTAR, the newer framework, has all kinds of advantages, so install COMSTAR:

target# pkg install storage-server SUNWiscsit

Reboot.  Then make sure that you’ve disabled the older iSCSI service, and enable the new COMSTAR STMF service.

target# svcadm disable iscsitgt

target# svcadm enable stmf

target# svcs -a | grep -i stmf
online         Jun_01   svc:/system/stmf:default

Now that we have the software, create a ZFS volume (zvol) to share with iSCSI.  Here I create a 10GB volume, rpool/iscsitest.

target# zfs create -V 10g rpool/iscsitest
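The new volume won't show up as a mountable filesystem, but zfs list will confirm it exists at the requested size (output elided):

target# zfs list rpool/iscsitest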

Then create a SCSI logical unit (LU) backed by the volume's raw device node.

target# sbdadm create-lu /dev/zvol/rdsk/rpool/iscsitest
Created the following LU:

GUID                              DATA SIZE            SOURCE
--------------------------------  -------------------  ------------------------------
600144f096b2c40000004c10e52c0001  10737352704          /dev/zvol/rdsk/rpool/iscsitest

Does STMF see this device?

target# stmfadm list-lu
LU Name: 600144F096B2C40000004C10E52C0001

target# stmfadm list-lu -v
LU Name: 600144F096B2C40000004C10E52C0001
Operational Status: Online
Provider Name     : sbd
Alias             : /dev/zvol/rdsk/rpool/iscsitest
View Entry Count  : 0

Yep, it’s in there.

Now we need access control, so we can restrict iSCSI disks to particular hosts.  Create the host group “testgroup.”

target# stmfadm create-hg testgroup
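stmfadm can confirm the new, still-empty group:

target# stmfadm list-hg
Host Group: testgroup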

To add a host to this host group, I need to know how the client initiator will identify itself.  I’ve set up a FreeBSD amd64-current initiator with an iSCSI-enabled kernel.  Use iscontrol(8) to manage the iSCSI initiator.

initiator# iscontrol -vt target

I-: cmd=0x3 len=398
SessionType=Normal
InitiatorName=iqn.2005-01.il.ac.huji.cs::initiator.example.com
TargetName=(null)
MaxBurstLength=131072
HeaderDigest=None,CRC32C
DataDigest=None,CRC32C
MaxRecvDataSegmentLength=65536
ErrorRecoveryLevel=0
DefaultTime2Wait=0
DefaultTime2Retain=0
DataPDUInOrder=Yes
DataSequenceInOrder=Yes
MaxOutstandingR2T=1
MaxConnections=1
FirstBurstLength=65536
InitialR2T=Yes
ImmediateData=Yes
T-: cmd=0x23 len=0
0x0203: Not found

The key line is InitiatorName.  This is how the initiator identifies itself to the target.  Add it to the host group:

target# stmfadm add-hg-member -g testgroup iqn.2005-01.il.ac.huji.cs::initiator.example.com

target# stmfadm list-hg -v testgroup
Host Group: testgroup
        Member: iqn.2005-01.il.ac.huji.cs::initiator.example.com

Now add a view, so that our host group can see this disk.

target# stmfadm add-view -h testgroup 600144F096B2C40000004C10E52C0001

target# stmfadm list-view -l 600144F096B2C40000004C10E52C0001
View Entry: 0
    Host group   : testgroup
    Target group : All
    LUN          : 0
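One target-side note: a view makes the LU visible, but initiators log in to an iSCSI target, and COMSTAR needs one of those as well.  My test machine already had one (it’s the iqn.1986-03.com.sun:02 name you’ll see below).  If yours doesn’t, I believe this is the incantation, assuming the stock service name; itadm prints the auto-generated IQN of the new target:

target# svcadm enable -r svc:/network/iscsi/target:default

target# itadm create-target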

Check from our client:

initiator# iscontrol -d targetaddress=target
TargetName=iqn.1986-03.com.sun:02:210bcdba-c89c-6f81-f584-a0de390ae632
TargetAddress=192.0.2.1,1

The client can see the server’s iSCSI disk.  Huzzah!  Make a config file, /usr/local/etc/iscsi.conf:

testdisk {
        TargetName      = iqn.1986-03.com.sun:02:210bcdba-c89c-6f81-f584-a0de390ae632
        TargetAddress   = target:3260,1 # target hostname, port, and portal group tag
}
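iscontrol generated the InitiatorName we saw earlier on its own; if you’d rather pin it explicitly so it always matches the host group entry, I believe you can set it in the same block (check iscsi.conf(5) on your version):

testdisk {
        InitiatorName   = iqn.2005-01.il.ac.huji.cs::initiator.example.com
        TargetName      = iqn.1986-03.com.sun:02:210bcdba-c89c-6f81-f584-a0de390ae632
        TargetAddress   = target:3260,1
}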

Now connect to it:

initiator# iscontrol -c /usr/local/etc/iscsi.conf -n testdisk
iscontrol[2841]: running
iscontrol[2841]: (pass1:iscsi0:0:0:0):  tagged openings now 0
iscontrol[2841]: cam_open_btl: no passthrough device found at 1:0:1
iscontrol[2841]: cam_open_btl: no passthrough device found at 1:0:2
iscontrol[2841]: cam_open_btl: no passthrough device found at 1:0:3
iscontrol: supervise starting main loop
da1 at iscsi0 bus 0 scbus1 target 0 lun 0
da1: <SUN COMSTAR 1.0> Fixed Direct Access SCSI-5 device
da1: 10239MB (20971392 512 byte sectors: 255H 63S/T 1305C)

initiator# ls /dev/da*
/dev/da0        /dev/da0s1a     /dev/da0s1d     /dev/da0s1f
/dev/da0s1      /dev/da0s1b     /dev/da0s1e     /dev/da1
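camcontrol(8) gives another view of the same device; you should see something like:

initiator# camcontrol devlist
<SUN COMSTAR 1.0>                at scbus1 target 0 lun 0 (pass1,da1)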

The disk shows up.  Let’s try to use it.  Slice the disk:

initiator# fdisk -BI /dev/da1
******* Working on device /dev/da1 *******
fdisk: invalid fdisk partition table found
fdisk: Class not found

If a blank ZFS-backed iSCSI disk started off with a valid slice table, I’d be shocked.  Now, add a label.

initiator# disklabel -w /dev/da1s1

We now have a valid label, and need to add a partition to the label.

initiator# disklabel -e /dev/da1s1

The default disklabel includes a single partition, a, that fills the entire disk.  It’s of type “unused.”  Change that to 4.2BSD, save and exit, and check your work.

initiator# disklabel /dev/da1s1
# /dev/da1s1:
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a: 20964746       16    4.2BSD        0     0     0
  c: 20964762        0    unused        0     0         # "raw" part, don't edit

Now create a file system, using -U to get soft updates:

initiator# newfs -U /dev/da1s1a

initiator# mount /dev/da1s1a /mnt/
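Note that the iSCSI session won’t exist at boot until iscontrol runs, so a permanent /etc/fstab entry needs noauto; you’d mount it by hand or from an rc script after iscontrol attaches the disk.  A sketch, assuming the disk keeps the same device name:

/dev/da1s1a     /mnt    ufs     rw,noauto       0       0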

But will this really work?  Let’s do some real work:

initiator# tar cfC - /usr/ports/ . | tar xpfC - /mnt

Yes, it works.
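For a quick-and-dirty sequential write number you could also time a simple dd; the block size and count here are arbitrary:

initiator# dd if=/dev/zero of=/mnt/ddtest bs=1m count=1024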

Performance?  I’m running both servers in VMware ESXi.  The physical hardware was used in production for a few years, and then shifted to testing.  All traffic goes to and from the same, older, disk.  My FreeBSD client is amd64 -current, with all the default debugging options turned on.  With all of this, I expected performance to be abominable.  And my expectations were fulfilled: I ran the tar command with time(1), and got:

5.982u 41.868s 46:19.45 1.7%    71+4062k 176023+4213io 7pf+0w

Remember, a ports tree contains thousands of little files, plus several large files in /usr/ports/distfiles, for a total of 651MB.  651MB in 46 minutes and 19 seconds works out to about 240kB/second, or very roughly 2Mb/s.

I’ll be trying this on more current hardware, with the client and server on different hardware, and expect better results.