I woke up today to find this on a console:
panic: _mtx_lock_sleep: recursed on non-recursive mutex iscsi-io @ /usr/src/sys/modules/iscsi/initiator/../../../dev/iscsi/initiator/isc_sm.c:324
The initiator is a FreeBSD-current amd64 from 8 May 2011. The iSCSI target is an inexpensive iomega NAS. Other hosts attached to this iSCSI NAS have also had errors, though. The errors clear when I reboot the NAS.
Unfortunately, the FreeBSD box is a diskless system, so dumps aren’t exactly simple. I heard some rumours at the FreeBSD BSDCan devsummit about a network dump facility coming soon, but that’s the future.
How to fix this?
I attended the High Performance FreeBSD Clusters talk at BSDCan 2011. The presenter’s team had originally used FreeBSD servers, then tried OpenSolaris to get better performance. They had OpenSolaris problems, but found that they could not access the bug information without a support contract. They’re now moving towards FreeBSD with EIT, and are happier.
I intend to learn from their mistakes, and replace the iomega with a FreeBSD EIT server. I’ll keep the iomega for, say, a central ports and packages NFS server, where a reboot won’t impact my uptime.
Why bother to blog this? So that the next poor bugger who gets this panic message gets at least one search engine hit.