[OmniOS-discuss] How bad are these controller / io errors??
steve at linuxsuite.org
Tue Aug 13 15:20:22 UTC 2013
Howdy!
This is a SuperMicro JBOD with SATA disks. I am aware of the issues with
running SATA disks behind SAS, but I was wondering just how serious these
kinds of errors are. A scrub of the pool completes without noticeable
problems. I did a lot of stress testing earlier and could not get a failure.
Disabling NCQ on the controller was necessary.
What is the practical risk to data?
See below info for iostat / syslog
thanx - steve
syslog info
kern.warning<4>: Aug 13 10:39:10 dfs1 scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci8086,340d@6/pci1000,3080@0 (mpt_sas0):
kern.warning<4>: Aug 13 10:39:10 dfs1 #011mptsas_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120303
kern.warning<4>: Aug 13 10:39:10 dfs1 scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci8086,340d@6/pci1000,3080@0 (mpt_sas0):
kern.warning<4>: Aug 13 10:39:10 dfs1 #011mptsas_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120436
kern.warning<4>: Aug 13 10:39:10 dfs1 scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci8086,340d@6/pci1000,3080@0 (mpt_sas0):
kern.warning<4>: Aug 13 10:39:10 dfs1 #011mptsas_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31120303
kern.warning<4>: Aug 13 10:39:10 dfs1 scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci8086,340d@6/pci1000,3080@0 (mpt_sas0):
Blah Blah...
kern.warning<4>: Aug 13 10:39:10 dfs1 #011mptsas_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31120436
kern.info<6>: Aug 13 10:39:11 dfs1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,340d@6/pci1000,3080@0 (mpt_sas0):
kern.info<6>: Aug 13 10:39:11 dfs1 #011Log info 0x31120303 received for target 13.
kern.info<6>: Aug 13 10:39:11 dfs1 #011scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
kern.info<6>: Aug 13 10:39:11 dfs1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,340d@6/pci1000,3080@0 (mpt_sas0):
kern.info<6>: Aug 13 10:39:11 dfs1 #011Log info 0x31120303 received for target 13.
kern.info<6>: Aug 13 10:39:11 dfs1 #011scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
kern.info<6>: Aug 13 10:39:11 dfs1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,340d@6/pci1000,3080@0 (mpt_sas0):
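
The status words in the log above are packed bit fields, so it can help to
split them apart before searching for their meaning. Here is a rough Python
sketch that does this. The field layout is an assumption on my part, based on
the LSI MPI 2.0 headers as used by the Linux mpt2sas/mpt3sas drivers (bus type
in the top 4 bits, then originator, code, and a 16-bit subcode; 0x8000 in the
IOC status as a "log info available" flag). The function names are made up for
this example. Check the layout against your own driver source before relying
on it.

```python
# Assumed layout (from the LSI MPI 2.0 headers, not confirmed for mpt_sas):
#   IOCLogInfo: [31:28] bus type, [27:24] originator, [23:16] code, [15:0] subcode
#   IOCStatus:  bit 15 = log info available, bits [14:0] = status code
IOCSTATUS_FLAG_LOG_INFO_AVAILABLE = 0x8000
IOCSTATUS_MASK = 0x7FFF


def decode_loginfo(loginfo):
    """Split a 32-bit IOCLogInfo word into its assumed fields."""
    return {
        "bus_type": (loginfo >> 28) & 0xF,
        "originator": (loginfo >> 24) & 0xF,
        "code": (loginfo >> 16) & 0xFF,
        "subcode": loginfo & 0xFFFF,
    }


def decode_iocstatus(status):
    """Separate the log-info flag from the status code."""
    return {
        "log_info_available": bool(status & IOCSTATUS_FLAG_LOG_INFO_AVAILABLE),
        "status": status & IOCSTATUS_MASK,
    }


if __name__ == "__main__":
    # Values copied from the syslog excerpt.
    for value in (0x31120303, 0x31120436):
        f = decode_loginfo(value)
        print("IOCLogInfo=0x%08x: bus_type=0x%x originator=0x%x code=0x%02x subcode=0x%04x"
              % (value, f["bus_type"], f["originator"], f["code"], f["subcode"]))
    for value in (0x8000, 0x804B):
        s = decode_iocstatus(value)
        print("IOCStatus=0x%04x: log_info_available=%s status=0x%04x"
              % (value, s["log_info_available"], s["status"]))
```

Under that assumed layout, both log info words share bus type 0x3, originator
0x1 and code 0x12, and differ only in the subcode (0x0303 vs 0x0436). The
ioc_status of 0x804b splits into the flag bit plus status code 0x004b, which I
believe the MPI headers name SCSI_IOC_TERMINATED, but treat that name as
unverified here.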
Output of iostat -En
Looks like "Hard Errors" and "No Device" correspond. What do "Transport
Errors" and "Recoverable" mean? I see no evidence of data corruption or
loss. Does ZFS deal with and recover from these errors in a good, safe way?
c5t5000C500489947A8d0 Soft Errors: 0 Hard Errors: 2 Transport Errors: 11
Vendor: ATA Product: ST3000DM001-9YN1 Revision: CC4H Serial No: W1F0AAMA
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 2 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
c5t5000C500525EB2B9d0 Soft Errors: 0 Hard Errors: 5 Transport Errors: 46
Vendor: ATA Product: ST3000DM001-9YN1 Revision: CC4H Serial No: W1F0QM5H
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 5 Recoverable: 0
Illegal Request: 5 Predictive Failure Analysis: 0
c5t5000C50045561CEAd0 Soft Errors: 0 Hard Errors: 1 Transport Errors: 7
Vendor: ATA Product: ST3000DM001-9YN1 Revision: CC4H Serial No: W1F09G4Q
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 1 Recoverable: 0
Illegal Request: 1 Predictive Failure Analysis: 0
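
To compare the counters across drives without reading them by eye, the
`iostat -En` output can be parsed into a per-device table. This is only a
sketch in Python: the parsing rules (a device line is any line containing
"Soft Errors:", and the Vendor and Size lines are skipped) are assumptions
that fit the output pasted above, and `parse_iostat_en` is a hypothetical
helper name, not part of any tool.

```python
import re

# Sample copied from the iostat -En output above.
SAMPLE = """\
c5t5000C500489947A8d0 Soft Errors: 0 Hard Errors: 2 Transport Errors: 11
Vendor: ATA Product: ST3000DM001-9YN1 Revision: CC4H Serial No: W1F0AAMA
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 2 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
c5t5000C500525EB2B9d0 Soft Errors: 0 Hard Errors: 5 Transport Errors: 46
Vendor: ATA Product: ST3000DM001-9YN1 Revision: CC4H Serial No: W1F0QM5H
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 5 Recoverable: 0
Illegal Request: 5 Predictive Failure Analysis: 0
c5t5000C50045561CEAd0 Soft Errors: 0 Hard Errors: 1 Transport Errors: 7
Vendor: ATA Product: ST3000DM001-9YN1 Revision: CC4H Serial No: W1F09G4Q
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 1 Recoverable: 0
Illegal Request: 1 Predictive Failure Analysis: 0
"""

# "Name With Spaces: <integer>" pairs, e.g. "Hard Errors: 2".
COUNTER = re.compile(r"([A-Z][A-Za-z ]*?): (\d+)")


def parse_iostat_en(text):
    """Return {device: {counter_name: value}} for iostat -En style text."""
    devices = {}
    current = None
    for line in text.splitlines():
        # Vendor and Size lines carry no error counters; skip them.
        if line.startswith(("Vendor:", "Size:")):
            continue
        # Assumption: each device block starts with the "Soft Errors:" line.
        if "Soft Errors:" in line:
            current = devices.setdefault(line.split()[0], {})
        if current is None:
            continue
        for name, value in COUNTER.findall(line):
            current[name.strip()] = int(value)
    return devices


devices = parse_iostat_en(SAMPLE)

if __name__ == "__main__":
    for name, c in devices.items():
        print("%s hard=%d transport=%d no_device=%d illegal_request=%d" % (
            name, c["Hard Errors"], c["Transport Errors"],
            c["No Device"], c["Illegal Request"]))
```

On the three drives shown, the parsed "Hard Errors" count equals the
"No Device" count for every device, which matches the observation above.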