[OmniOS-discuss] Re: multipath problem when replacing a failed SAS drive

Kevin Swab Kevin.Swab at ColoState.EDU
Wed Oct 30 18:13:01 UTC 2013


The problem drive is currently configured as a hot-spare (it replaced
the old hot-spare, which kicked in when the original drive failed), but
I'll remove it from the pool and do some testing with it and report back...

Thanks!
Kevin

On 10/30/2013 12:02 PM, Johan Kragsterman wrote:
> Hi, Kevin!
> 
> What if you replace the drive with one of the hot spares? I mean, leave the
> hot spare in its current slot, and configure it to replace the
> problematic drive. Then you will find out whether the backplane has a bad
> port or not. Always start by trying to narrow it down.
> 
> Rgrds Johan
> 
> 
> 
> -----"OmniOS-discuss" <omnios-discuss-bounces at lists.omniti.com> wrote: -----
> To: omnios-discuss at lists.omniti.com
> From: Kevin Swab
> Sent by: "OmniOS-discuss"
> Date: 2013.10.30 18:38
> Subject: [OmniOS-discuss] multipath problem when replacing a failed SAS drive
> 
> Hello,
> 
> I'm running OmniOS r151006p on the following system:
> 
> - Supermicro X8DT6 board, Xeon E5606 CPU, 48GB ram
> - Supermicro SC847 chassis, 36 drive bays, SAS expanders, LSI 9211-8i
> controller
> - 34 x Toshiba 3TB SAS drives (MG03SCA300) in one pool w/ 16 mirrored sets
> + 2 hot spares
> 
> 'mpathadm list lu' showed all drives as having two paths to the controller.
> 
> Yesterday, one of the drives failed and was replaced.  The new drive is
> only showing one path in mpathadm, and errors have started showing up
> periodically in /var/adm/messages:
> 
> 
> 
> # mpathadm list lu /dev/rdsk/c1t5000039478CA7150d0
> mpath-support:  libmpscsi_vhci.so
>         /dev/rdsk/c1t5000039478CA7150d0s2
>                 Total Path Count: 1
>                 Operational Path Count: 1
> 
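The path-count check shown above can be scripted across many drives. What follows is a minimal Python sketch, not part of the original message: it parses text in the format that `mpathadm list lu` printed above and reports logical units with fewer paths than expected. The function name `find_degraded_lus` and the default of two expected paths are assumptions for illustration; the sample text is the output quoted above.

```python
import re

TOTAL_RE = re.compile(r"Total Path Count:\s*(\d+)")
OPER_RE = re.compile(r"Operational Path Count:\s*(\d+)")


def find_degraded_lus(mpathadm_output, expected_paths=2):
    """Return (device, total, operational) for each LU below expected_paths.

    Hypothetical helper: it only understands the three-line layout
    (device path, total count, operational count) seen in this thread.
    """
    degraded = []
    device = total = None
    for raw in mpathadm_output.splitlines():
        line = raw.strip()
        if line.startswith("/dev/rdsk/"):
            # Start of a new logical-unit record.
            device, total = line, None
            continue
        m = TOTAL_RE.match(line)
        if m:
            total = int(m.group(1))
            continue
        m = OPER_RE.match(line)
        if m and device is not None and total is not None:
            operational = int(m.group(1))
            if min(total, operational) < expected_paths:
                degraded.append((device, total, operational))
            device = None
    return degraded


SAMPLE = """\
mpath-support:  libmpscsi_vhci.so
        /dev/rdsk/c1t5000039478CA7150d0s2
                Total Path Count: 1
                Operational Path Count: 1
"""

if __name__ == "__main__":
    for dev, total, oper in find_degraded_lus(SAMPLE):
        print(f"{dev}: total={total} operational={oper}")
```

On a live system the input would presumably come from running `mpathadm list lu` with no device argument; that invocation is not shown in the thread, so treat it as an assumption.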
> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
> /pci at 0,0/pci8086,3410 at 9/pci1000,3020 at 0 (mpt_sas0):
> Oct 30 09:30:22 hagler  mptsas_handle_event_sync: IOCStatus=0x8000,
> IOCLogInfo=0x31120101
> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
> /pci at 0,0/pci8086,3410 at 9/pci1000,3020 at 0 (mpt_sas0):
> Oct 30 09:30:22 hagler  mptsas_handle_event: IOCStatus=0x8000,
> IOCLogInfo=0x31120101
> Oct 30 09:30:22 hagler scsi: [ID 365881 kern.info]
> /pci at 0,0/pci8086,3410 at 9/pci1000,3020 at 0 (mpt_sas0):
> Oct 30 09:30:22 hagler  Log info 0x31120101 received for target 89.
> Oct 30 09:30:22 hagler  scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
> /pci at 0,0/pci8086,3410 at 9/pci1000,3020 at 0 (mpt_sas0):
> Oct 30 09:30:22 hagler  mptsas_handle_event_sync: IOCStatus=0x8000,
> IOCLogInfo=0x31120101
> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
> /pci at 0,0/pci8086,3410 at 9/pci1000,3020 at 0 (mpt_sas0):
> Oct 30 09:30:22 hagler  mptsas_handle_event: IOCStatus=0x8000,
> IOCLogInfo=0x31120101
> Oct 30 09:30:22 hagler scsi: [ID 365881 kern.info]
> /pci at 0,0/pci8086,3410 at 9/pci1000,3020 at 0 (mpt_sas0):
> Oct 30 09:30:22 hagler  Log info 0x31120101 received for target 89.
> Oct 30 09:30:22 hagler  scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
> /pci at 0,0/pci8086,3410 at 9/pci1000,3020 at 0 (mpt_sas0):
> Oct 30 09:30:22 hagler  mptsas_handle_event_sync: IOCStatus=0x8000,
> IOCLogInfo=0x31120101
> Oct 30 09:30:22 hagler scsi: [ID 365881 kern.info]
> /pci at 0,0/pci8086,3410 at 9/pci1000,3020 at 0 (mpt_sas0):
> Oct 30 09:30:22 hagler  Log info 0x31120101 received for target 89.
> Oct 30 09:30:22 hagler  scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
> /pci at 0,0/pci8086,3410 at 9/pci1000,3020 at 0 (mpt_sas0):
> Oct 30 09:30:22 hagler  mptsas_handle_event: IOCStatus=0x8000,
> IOCLogInfo=0x31120101
> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
> /pci at 0,0/pci8086,3410 at 9/pci1000,3020 at 0 (mpt_sas0):
> Oct 30 09:30:22 hagler  mptsas_handle_event_sync: IOCStatus=0x8000,
> IOCLogInfo=0x31120101
> Oct 30 09:30:22 hagler scsi: [ID 365881 kern.info]
> /pci at 0,0/pci8086,3410 at 9/pci1000,3020 at 0 (mpt_sas0):
> Oct 30 09:30:22 hagler  Log info 0x31120101 received for target 89.
> Oct 30 09:30:22 hagler  scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
> /pci at 0,0/pci8086,3410 at 9/pci1000,3020 at 0 (mpt_sas0):
> Oct 30 09:30:22 hagler  mptsas_handle_event: IOCStatus=0x8000,
> IOCLogInfo=0x31120101
> 
> 
> 
> The error messages refer to target 89, which I can confirm corresponds
> to the missing path for my replacement drive using "lsiutil":
> 
> 
> 
> # lsiutil -p 1 16
> 
> LSI Logic MPT Configuration Utility, Version 1.63, June 4, 2009
> 
> 1 MPT Port found
> 
>      Port Name         Chip Vendor/Type/Rev    MPT Rev  Firmware Rev  IOC
>  1.  mpt_sas0          LSI Logic SAS2008 03      200      0d000100     0
> 
> SAS2008's links are 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G
> 
>  B___T     SASAddress     PhyNum  Handle  Parent  Type
> [ ... cut ... ]
>  0  89  5000039478ca7152    17     0059    0032   SAS Target
>  0  90  5000039478ca7153    17     005a    000a   SAS Target
> [ ... cut ... ]
> 
> 
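The two rows above share all but the last hex digit with the WWN in the device name c1t5000039478CA7150d0, which is what ties targets 89 and 90 to the replacement drive. The Python sketch below makes that matching explicit. It rests on an assumption that the thread does not confirm in general: that a drive's SAS port addresses differ from its WWN only in the lowest hex digit. `ports_for_wwn` is a hypothetical helper name, and the sample rows are the two quoted above.

```python
import re

# Matches the start of an lsiutil "option 16" row: bus, target, SAS address.
ROW_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+([0-9a-fA-F]{16})\b")


def ports_for_wwn(lsiutil_output, wwn):
    """Return (target, sas_address) rows that appear to belong to wwn.

    Assumption (not confirmed in the thread): port addresses equal the
    drive WWN except for the lowest hex digit, so that digit is ignored.
    """
    base = int(wwn, 16) >> 4
    matches = []
    for line in lsiutil_output.splitlines():
        m = ROW_RE.match(line)
        if m and int(m.group(3), 16) >> 4 == base:
            matches.append((int(m.group(2)), m.group(3).lower()))
    return matches


SAMPLE_ROWS = """\
 B___T     SASAddress     PhyNum  Handle  Parent  Type
 0  89  5000039478ca7152    17     0059    0032   SAS Target
 0  90  5000039478ca7153    17     005a    000a   SAS Target
"""

if __name__ == "__main__":
    for target, address in ports_for_wwn(SAMPLE_ROWS, "5000039478CA7150"):
        print(f"target {target}: {address}")
```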
> 
> When I ask "lsiutil" to rescan the bus, I see the following error when
> it gets to target 89:
> 
> 
> 
> # lsiutil -p 1 8
> 
> LSI Logic MPT Configuration Utility, Version 1.63, June 4, 2009
> 
> 1 MPT Port found
> 
>      Port Name         Chip Vendor/Type/Rev    MPT Rev  Firmware Rev  IOC
>  1.  mpt_sas0          LSI Logic SAS2008 03      200      0d000100     0
> 
> SAS2008's links are 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G
> 
>  B___T___L  Type       Vendor   Product          Rev
> [ ... cut ... ]
> ScsiIo to Bus 0 Target 89 failed, IOCStatus = 004b (IOC Terminated)
>  0  90   0  Disk       TOSHIBA  MG03SCA300       0108  5000039478ca7153    17
> [ ... cut ... ]
> 
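The ioc_status=0x804b in the log lines and the "004b (IOC Terminated)" reported by lsiutil appear to be the same status code. In the MPI 2.0 headers, bit 0x8000 of IOCStatus is a flag meaning that log info is available, and the remaining bits (mask 0x7FFF) carry the status itself; 0x004B is the SCSI "IOC Terminated" status. The Python sketch below shows that decoding. The lookup table is deliberately partial, covering only the values seen in this thread, and `decode_ioc_status` is a hypothetical helper name.

```python
# Values from the MPI 2.0 header definitions.
IOCSTATUS_FLAG_LOG_INFO_AVAILABLE = 0x8000
IOCSTATUS_MASK = 0x7FFF

# Partial table: only the status codes that occur in this thread.
STATUS_NAMES = {
    0x0000: "Success",
    0x004B: "SCSI IOC Terminated",
}


def decode_ioc_status(raw):
    """Split a raw IOCStatus word into (status, has_log_info, name)."""
    status = raw & IOCSTATUS_MASK
    has_log_info = bool(raw & IOCSTATUS_FLAG_LOG_INFO_AVAILABLE)
    name = STATUS_NAMES.get(status, "unknown (not in this partial table)")
    return status, has_log_info, name


if __name__ == "__main__":
    for raw in (0x804B, 0x8000, 0x004B):
        status, has_log_info, name = decode_ioc_status(raw)
        print(f"0x{raw:04x}: status=0x{status:04x} "
              f"log_info={has_log_info} ({name})")
```

Read this way, the IOCStatus=0x8000 in the mptsas_handle_event lines is a success status with the log-info flag set, and the accompanying IOCLogInfo=0x31120101 is the value that would need vendor documentation to interpret further.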
> 
> 
> This problem has happened to me once before on a similar system.  At
> that time, I tried reseating the drive and tried several different
> replacement drives; all had the same issue.  I even tried rebooting the
> system, and that didn't help.
> 
> Does anyone know how I can clear this issue up?  I'd be happy to provide
> any additional information that might be helpful.
> 
> TIA,
> Kevin
> 
> 
> 
> -- 
> -------------------------------------------------------------------
> Kevin Swab                          UNIX Systems Administrator
> ACNS                                Colorado State University
> Phone: (970)491-6572                Email: Kevin.Swab at ColoState.EDU
> GPG Fingerprint: 7026 3F66 A970 67BD 6F17  8EB8 8A7D 142F 2392 791C
> 


