[OmniOS-discuss] zfs/zpool commands have stopped responding

Ryan Kohler kohlerr at ics.uci.edu
Fri Jan 18 11:27:33 EST 2013


What is the output of:

cfgadm -la

On 1/18/2013 7:46 AM, Paul Jochum wrote:
> Thanks Eric.  Any suggestions on how to convert Target 41 into either an sd
> device or cXtXdX type name?  I assume that this is a single device (such as a
> single disk), but maybe my assumption is incorrect.   I have tried the
> following in the /dev/rdsk directory:
>
> ls -al | grep "/pci@0,0/pci8086,25f8@4/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0"
>
> but this returns 1008 devices, which only narrows it down to c15tXd0, where X
> ranges from 0 to 47.
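
One way to narrow the 1008 matches further is to filter the same `ls -l` listing on the target number as well as the HBA path. The sketch below is only an illustration. It assumes each disk appears under the HBA as a child node named `sd@<target in hex>,<lun>` (so target 41 would be `sd@29,0`), which has not been confirmed for this driver. The sample listing, including the `c15`/`c16` names and the `sd@` nodes, is hypothetical and not taken from this system, and `disks_for_target` is a made-up helper name.

```python
import re

# Hypothetical sample of `ls -l /dev/rdsk` output. The controller numbers and
# sd@ node names are assumed for illustration; they were not posted in the thread.
SAMPLE = """\
lrwxrwxrwx 1 root root 92 Jan 18 08:00 c15t40d0s0 -> ../../devices/pci@0,0/pci8086,25f8@4/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0/sd@28,0:a,raw
lrwxrwxrwx 1 root root 92 Jan 18 08:00 c15t41d0s0 -> ../../devices/pci@0,0/pci8086,25f8@4/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0/sd@29,0:a,raw
lrwxrwxrwx 1 root root 92 Jan 18 08:00 c16t41d0s0 -> ../../devices/pci@0,0/pci8086,25f9@6/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0/sd@29,0:a,raw
"""

HBA_PATH = "/pci@0,0/pci8086,25f8@4/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0"


def disks_for_target(ls_output, hba_path, target):
    """Return sorted cXtYdZ names whose symlink points under hba_path at target."""
    # Assumption: the child node is named sd@<target in hex>,<lun>.
    node = re.compile(re.escape(hba_path) + r"/sd@%x,[0-9a-f]+:" % target)
    names = set()
    for line in ls_output.splitlines():
        link, sep, dest = line.partition(" -> ")
        if not sep or not node.search(dest):
            continue
        # The last field before " -> " is the link name, e.g. c15t41d0s0.
        match = re.match(r"c\d+t\d+d\d+", link.split()[-1])
        if match:
            names.add(match.group(0))  # drop the slice suffix
    return sorted(names)


print(disks_for_target(SAMPLE, HBA_PATH, 41))  # prints ['c15t41d0']
```

On the live system the listing would come from `ls -l /dev/rdsk` instead of a sample string. If the `sd@` naming assumption does not hold, the pattern needs adjusting to whatever the symlink targets actually look like.
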
>
> thanks,
>
> Paul
>
> On 01/18/2013 09:03 AM, Eric Sproul wrote:
>> ZFS is probably unable to make any progress because the hardware is
>> busy freaking out. :)  Target 41 seems like it is misbehaving and
>> causing the HBA some indigestion.  If you can identify target 41's
>> physical location, you could try pulling that device.
>>
>> On Fri, Jan 18, 2013 at 9:29 AM, Paul Jochum
>> <paul.jochum at alcatel-lucent.com>  wrote:
>>> Hi All:
>>>
>>>      I have an OmniOS server running, under which zfs/zpool commands have
>>> stopped responding.  Any activity that involves the external JBOD storage
>>> seems to hang.  Is there a way to kill/reset this other than rebooting the
>>> server?
>>>
>>> Here is some background:
>>>
>>> when logging in, the following is displayed:
>>>
>>> OmniOS 5.11     omnios-79686dc  2012.03.06
>>>
>>> uname -a
>>>
>>> SunOS lss-bkup301 5.11 omnios-eae537b i86pc i386 i86pc Solaris
>>>
>>> hardware:
>>>
>>> server - SUN x4250
>>> LSI SAS HBAs (I believe there are 5 in the system), they are all of the same
>>> type (LSI SAS 9200-8e)
>>> a combination of SUN J4400 JBODs, and one DataOnStorage DNS-1660D
>>>
>>> What I am seeing:
>>>
>>> The system is responding to input (I can log in, view files in the root
>>> pool, perform commands as long as they are restricted to the root pool,
>>> etc.)
>>> an "ls" command on filesystems on the external JBODs hangs
>>> zpool status commands on the external JBOD hang
>>> the following is coming out on the console, rolling about every 10-30
>>> seconds:
>>>
>>> Jan 18 08:20:06 lss-bkup301 scsi: WARNING:
>>> /pci@0,0/pci8086,25f9@6/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0 (mpt4):
>>>
>>> Jan 18 08:20:06 lss-bkup301     Disconnected command timeout for Target 41
>>>
>>> I am seeing a lot of messages in /var/adm/messages, but they all seem to be
>>> around the following:
>>>
>>> Jan 18 08:22:38 lss-bkup301 scsi: [ID 365881 kern.info]
>>> /pci@0,0/pci8086,25f9@6/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0 (mpt4):
>>> Jan 18 08:22:38 lss-bkup301     Log info 0x31130000 received for target 41.
>>> Jan 18 08:22:38 lss-bkup301     scsi_status=0x0, ioc_status=0x8048,
>>> scsi_state=0xc
>>> Jan 18 08:22:38 lss-bkup301 scsi: [ID 243001 kern.warning] WARNING:
>>> /pci@0,0/pci8086,25f9@6/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0 (mpt4):
>>> Jan 18 08:22:38 lss-bkup301     mpt_handle_event_sync: IOCStatus=0x8000,
>>> IOCLogInfo=0x31111000
>>> Jan 18 08:22:38 lss-bkup301 scsi: [ID 243001 kern.warning] WARNING:
>>> /pci@0,0/pci8086,25f9@6/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0 (mpt4):
>>> Jan 18 08:22:38 lss-bkup301     mpt_handle_event: IOCStatus=0x8000,
>>> IOCLogInfo=0x31111000
>>> Jan 18 08:22:38 lss-bkup301 scsi: [ID 365881 kern.info]
>>> /pci@0,0/pci8086,25f8@4/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0 (mpt2):
>>> Jan 18 08:22:38 lss-bkup301     Log info 0x31111000 received for target 41.
>>> Jan 18 08:22:38 lss-bkup301     scsi_status=0x0, ioc_status=0x804b,
>>> scsi_state=0xc
>>> Jan 18 08:22:38 lss-bkup301 scsi: [ID 243001 kern.warning] WARNING:
>>> /pci@0,0/pci8086,25f8@4/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0 (mpt2):
>>> Jan 18 08:22:38 lss-bkup301     SAS Discovery Error on port 4.
>>> DiscoveryStatus is DiscoveryStatus is |Unaddressable device found|
>>> Jan 18 08:22:39 lss-bkup301 scsi: [ID 243001 kern.warning] WARNING:
>>> /pci@0,0/pci8086,25f9@6/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0 (mpt4):
>>> Jan 18 08:22:39 lss-bkup301     SAS Discovery Error on port 4.
>>> DiscoveryStatus is DiscoveryStatus is |Unaddressable device found|
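
To see at a glance which HBA instances are reporting which targets, the messages can be tallied by driver instance and target number. This is a rough sketch, assuming the layout shown above (a header ending in `(mptN):` followed by one or more message lines). `count_target_events` is a hypothetical helper name, and the sample is a short excerpt of the messages quoted above.

```python
import re
from collections import Counter

# Short excerpt of the /var/adm/messages lines quoted in this thread.
LOG = """\
Jan 18 08:22:38 lss-bkup301 scsi: [ID 365881 kern.info]
/pci@0,0/pci8086,25f9@6/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0 (mpt4):
Jan 18 08:22:38 lss-bkup301     Log info 0x31130000 received for target 41.
Jan 18 08:22:38 lss-bkup301     scsi_status=0x0, ioc_status=0x8048,
scsi_state=0xc
Jan 18 08:22:38 lss-bkup301 scsi: [ID 365881 kern.info]
/pci@0,0/pci8086,25f8@4/pci111d,801c@0/pci111d,801c@4/pci1000,3150@0 (mpt2):
Jan 18 08:22:38 lss-bkup301     Log info 0x31111000 received for target 41.
"""

INSTANCE = re.compile(r"\((mpt\d+)\):")   # driver instance at the end of a header
TARGET = re.compile(r"[Tt]arget (\d+)")   # target number inside a message line


def count_target_events(text):
    """Count messages that mention a target, keyed by (instance, target)."""
    counts = Counter()
    instance = None
    for raw in text.splitlines():
        line = raw.lstrip("> ")  # tolerate email quote markers
        header = INSTANCE.search(line)
        if header:
            instance = header.group(1)  # remember which HBA is talking
            continue
        hit = TARGET.search(line)
        if hit and instance is not None:
            counts[(instance, int(hit.group(1)))] += 1
    return counts


for (inst, tgt), n in sorted(count_target_events(LOG).items()):
    print(inst, "target", tgt, "messages:", n)
```

Run against the full log, a tally like this would show whether target 41 is the only target being reported and whether it shows up on more than one mpt instance, as it does in the excerpt (mpt2 and mpt4).
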
>>>
>>> Thank you for looking at this, and I appreciate any help that can be
>>> provided.  Please let me know if there is any additional information that
>>> would help to diagnose this.
>>>
>>> Paul
>>>
>>>
>>> _______________________________________________
>>> OmniOS-discuss mailing list
>>> OmniOS-discuss at lists.omniti.com
>>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>>>