[OmniOS-discuss] Drives goes offline in Zpool
Ram Chander
ramquick at gmail.com
Sat Mar 23 04:00:19 EDT 2013
Hi,
I have a Dell MD1200 connected to two heads (Dell R710). Each head has a
PERC H800 card, and the drives are configured as RAID 0 (virtual disks) on
the RAID controller.
One of the drives crashed and was replaced by a spare. Resilvering was
triggered but fails to complete because drives go offline. I have to
reboot the head (R710) to bring the drives back online. This has happened
repeatedly: the resilver hung at 4% done, and after a reboot it hung again
at 27% done, and so on.
The issue happens with both Solaris 11.1 and OmniOS.
It's a 100 TB pool with 69 TB used. The data is critical and I can't afford
to lose it.
Can I recover the data in any way (at least partially)?
I have verified there is no hardware issue with the H800 and have also
upgraded its firmware. The issue happens with both heads.
Current OS: Solaris 11.1
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@12,0 (sd26):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@c,0 (sd20):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@18,0 (sd32):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@1c,0 (sd36):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@1b,0 (sd35):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@1e,0 (sd38):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@19,0 (sd33):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@1d,0 (sd37):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@27,0 (sd47):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@26,0 (sd46):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
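All of the warnings above carry the same timestamp and sit behind the same controller path, which suggests the devices dropped together rather than one at a time. A minimal sketch of how such messages could be grouped by timestamp follows, assuming Python 3. The function name `gone_by_time` and the two-line sample are illustrative only, not part of any Solaris tool.

```python
import re
from collections import defaultdict

# Two entries copied from the log above; a real run would read the
# system log file instead (its path is not assumed here).
SAMPLE = """\
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@12,0 (sd26):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@c,0 (sd20):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
"""

TIMESTAMP = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s")
INSTANCE = re.compile(r"\((sd\d+)\):")


def gone_by_time(text):
    """Group sd instances that reported 'Device is gone' by timestamp."""
    gone = defaultdict(list)
    pending = None  # sd instance named by the most recent WARNING line
    for line in text.splitlines():
        match = INSTANCE.search(line)
        if match:
            pending = match.group(1)
        if "Device is gone" in line and pending is not None:
            stamp = TIMESTAMP.match(line)
            if stamp:
                gone[stamp.group(1)].append(pending)
            pending = None
    return dict(gone)


if __name__ == "__main__":
    for stamp, devices in gone_by_time(SAMPLE).items():
        print(stamp, len(devices), "devices:", ", ".join(devices))
```

The parser keys only on the `(sdNN):` suffix and the timestamp, so it does not depend on how the device path is wrapped in the log.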
# zpool status -v
  pool: test
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Mar 20 19:13:40 2013
    27.4T scanned out of 69.6T at 183M/s, 67h11m to go
    2.43T resilvered, 39.32% done
config:

        NAME                        STATE     READ WRITE CKSUM
        test                        DEGRADED     0     0     0
          raidz1-0                  DEGRADED     0     0     0
            c8t0d0                  ONLINE       0     0     0
            c8t1d0                  DEGRADED     0     0     0
            c8t2d0                  DEGRADED     0     0     0
            c8t3d0                  ONLINE       0     0     0
            spare-4                 DEGRADED     0     0     0
              12459181442598970150  UNAVAIL      0     0     0
              c8t45d0               DEGRADED     0     0     0  (resilvering)
          raidz1-1                  ONLINE       0     0     0
            c8t5d0                  ONLINE       0     0     0
            c8t6d0                  ONLINE       0     0     0
            c8t7d0                  ONLINE       0     0     0
            c8t8d0                  ONLINE       0     0     0
            c8t9d0                  ONLINE       0     0     0
          raidz1-3                  DEGRADED     0     0     0
            c8t12d0                 ONLINE       0     0     0
            c8t13d0                 ONLINE       0     0     0
            c8t14d0                 ONLINE       0     0     0
            c8t15d0                 DEGRADED     0     0     0
            c8t16d0                 ONLINE       0     0     0
            c8t17d0                 ONLINE       0     0     0
            c8t18d0                 ONLINE       0     0     0
            c8t19d0                 ONLINE       0     0     0
            c8t20d0                 DEGRADED     0     0     0
            c8t21d0                 DEGRADED     0     0     0
            spare-10                DEGRADED     0     0     0
              c8t22d0               DEGRADED     0     0     0
              c8t47d0               DEGRADED     0     0     0  (resilvering)
            c8t23d0                 ONLINE       0     0     0
          raidz1-4                  DEGRADED     0     0     0
            c8t24d0                 DEGRADED     0     0     0
            c8t25d0                 ONLINE       0     0     0
            c8t26d0                 ONLINE       0     0     0
            c8t27d0                 ONLINE       0     0     0
            c8t28d0                 ONLINE       0     0     0
            c8t29d0                 DEGRADED     0     0     0
            c8t30d0                 ONLINE       0     0     0
          raidz1-5                  DEGRADED     0     0     0
            spare-0                 DEGRADED     0     0     5
              c8t31d0               DEGRADED     0     0     0
              c8t46d0               DEGRADED     0     0     0  (resilvering)
            c8t32d0                 ONLINE       0     0     0
            c8t33d0                 ONLINE       0     0     0
            c8t34d0                 ONLINE       0     0     0
            c8t35d0                 DEGRADED     0     0     0
            c8t36d0                 DEGRADED     0     0     0
            c8t37d0                 ONLINE       0     0     0
          raidz1-6                  DEGRADED     0     0     0
            c8t38d0                 DEGRADED     0     0     0
            c8t39d0                 ONLINE       0     0     0
            c8t40d0                 DEGRADED     0     0     0
            c8t41d0                 DEGRADED     0     0     0
            c8t42d0                 ONLINE       0     0     0
            c8t43d0                 ONLINE       0     0     0
            c8t44d0                 ONLINE       0     0     0
        spares
          c8t45d0                   INUSE
          c8t46d0                   INUSE
          c8t47d0                   INUSE
device details:

        c8t1d0                DEGRADED  scrub/resilver needed
        status: ZFS detected errors on this device.
                The device is missing some data that is recoverable.

        c8t2d0                DEGRADED  scrub/resilver needed
        status: ZFS detected errors on this device.
                The device is missing some data that is recoverable.

        12459181442598970150  UNAVAIL   was /dev/dsk/c2t4d0s0
        status: ZFS detected errors on this device.
                The device was missing.

        c8t45d0               DEGRADED  scrub/resilver needed
        status: ZFS detected errors on this device.
                The device is missing some data that is recoverable.

        c8t15d0               DEGRADED  scrub/resilver needed
        status: ZFS detected errors on this device.
                The device is missing some data that is recoverable.

        c8t20d0               DEGRADED  scrub/resilver needed
        status: ZFS detected errors on this device.
                The device is missing some data that is recoverable.

        c8t21d0               DEGRADED  scrub/resilver needed
        status: ZFS detected errors on this device.
                The device is missing some data that is recoverable.

        c8t22d0               DEGRADED  scrub/resilver needed
        status: ZFS detected errors on this device.
                The device is missing some data that is recoverable.
                The device is missing some data that is recoverable.
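Since a raidz1 vdev can survive the loss of only one member device, it may help to see at a glance how many members of each vdev are not ONLINE. The following is a rough sketch, assuming Python 3, that tallies leaf-device states per top-level vdev from `zpool status` text. The helper name `tally_vdev_states` and the shortened sample are illustrative, and the parsing is whitespace-token based, so it makes no assumption about indentation.

```python
import re

# Shortened excerpt of the config section above (not the full pool).
SAMPLE = """\
NAME STATE READ WRITE CKSUM
test DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
c8t0d0 ONLINE 0 0 0
c8t1d0 DEGRADED 0 0 0
spare-4 DEGRADED 0 0 0
12459181442598970150 UNAVAIL 0 0 0
c8t45d0 DEGRADED 0 0 0 (resilvering)
raidz1-1 ONLINE 0 0 0
c8t5d0 ONLINE 0 0 0
spares
c8t45d0 INUSE
"""

# Leaf devices: cXtYdZ names, or a bare numeric GUID for a missing disk.
LEAF = re.compile(r"^(c\d+t\d+d\d+|\d{10,})$")


def tally_vdev_states(text):
    """Return {vdev_name: {state: count}} for leaf devices under each vdev."""
    groups = {}
    current = None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        name = parts[0]
        if name == "spares":
            break  # the spares list is not part of any data vdev
        if name.startswith(("raidz", "mirror")):
            current = groups.setdefault(name, {})
        elif current is not None and len(parts) >= 2 and LEAF.match(name):
            state = parts[1]
            current[state] = current.get(state, 0) + 1
    return groups


if __name__ == "__main__":
    for vdev, states in tally_vdev_states(SAMPLE).items():
        print(vdev, states)
```

Note that DEGRADED here means "scrub/resilver needed" with recoverable data, per the device details, so a count of DEGRADED members is a warning sign rather than proof of data loss. Members inside a `spare-N` group are counted individually under their parent vdev.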