[OmniOS-discuss] Pool degraded
Kevin Swab
Kevin.Swab at ColoState.EDU
Tue Apr 8 20:22:46 UTC 2014
Hello, and sorry for accidentally failing to "reply-all" on your first
message...
The man page seems misleading or incomplete on the subject of
"autoreplace" and spares. Setting 'autoreplace=on' should cause your
hot spare to kick in during a drive failure - with over 1100 spindles
running ZFS here, we've had the "opportunity" to test it many times! ;-)
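For reference, the relevant commands are just standard zpool syntax
(the pool name 'tank' and the device name below are placeholders):

# zpool add tank spare c1t5000XXXXXXXXXXXXd0
# zpool set autoreplace=on tank
# zpool get autoreplace tank

That adds the spare to the pool, turns on autoreplace, and verifies the
property took effect.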
I couldn't find any authoritative references for this, but here are a
few unauthoritative ones:
http://my.safaribooksonline.com/book/operating-systems-and-server-administration/solaris/9780137049639/managing-storage-pools/ch02lev1sec7
http://stanley-huang.blogspot.com/2009/09/how-to-set-autoreplace-in-zfs-pool.html
http://www.datadisk.co.uk/html_docs/sun/sun_zfs_cs.htm
Hope this helps,
Kevin
On 04/08/2014 01:09 PM, Alexander Lesle wrote:
> Hello Kevin Swab and List,
>
> On April 08, 2014, 20:17 <Kevin Swab> wrote in [1]:
>
>> Instead of a 'zpool remove ...', you want to do a 'zpool detach ...' to
>> get rid of the old device.
>
> That's it.
> 'zpool detach ...' "removes" the broken device from the pool.
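> For my pool that should be (device name taken from the 'zpool status'
> output quoted below, just to illustrate):
>
> # zpool detach pool_ripley c1t5000CCA22BEEF6A3d0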
>
>> If you turn the 'autoreplace' property on
>> for the pool, the spare will automatically kick in the next time a drive
>> fails...
>
> Are you sure? Because man zpool tells me otherwise:
>
> ,-----[ man zpool ]-----
> |
> | autoreplace=on | off
> |
> | Controls automatic device replacement. If set to "off",
> | device replacement must be initiated by the administra-
> | tor by using the "zpool replace" command. If set to
> | "on", any new device, found in the same physical loca-
> | tion as a device that previously belonged to the pool,
> | is automatically formatted and replaced. The default
> | behavior is "off". This property can also be referred to
> | by its shortened column name, "replace".
> |
> `-------------------
>
> I understand it to mean that when I pull out a device and put a new
> device in the _same_ case slot, ZFS resilvers onto it and drops the old
> one automatically.
> When the property is off, I have to use the command 'zpool replace ...',
> which is what I have done.
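> As I read the man page, the general form in that case is:
>
> # zpool replace [-f] <pool> <old-device> [<new-device>]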
>
> But in my case, the spare device was already in the case and _assigned
> to_ this pool.
> So the 'Hot Spares' section says:
> ,-----[ man zpool ]-----
> |
> | ZFS allows devices to be associated with pools as "hot
> | spares". These devices are not actively used in the pool,
> | but when an active device fails, it is automatically
> | replaced by a hot spare.
> |
> `-------------------
>
> Or have I misunderstood?
>
>> On 04/08/2014 12:13 PM, Alexander Lesle wrote:
>>> Hello All,
>>>
>>> I have a pool with mirrors and one spare.
>>> Now my pool is degraded, and I thought that OmniOS/ZFS would activate
>>> the spare itself and start resilvering.
>>>
>>> # zpool status -x
>>> pool: pool_ripley
>>> state: DEGRADED
>>> status: One or more devices could not be opened. Sufficient replicas exist for
>>> the pool to continue functioning in a degraded state.
>>> action: Attach the missing device and online it using 'zpool online'.
>>> see: http://illumos.org/msg/ZFS-8000-2Q
>>> scan: resilvered 84K in 0h0m with 0 errors on Sun Mar 23 15:09:08 2014
>>> config:
>>>
>>> NAME STATE READ WRITE CKSUM
>>> pool_ripley DEGRADED 0 0 0
>>> mirror-0 DEGRADED 0 0 0
>>> c1t5000CCA22BC16BC5d0 ONLINE 0 0 0
>>> c1t5000CCA22BEEF6A3d0 UNAVAIL 0 0 0 cannot open
>>> mirror-1 ONLINE 0 0 0
>>> c1t5000CCA22BC8D31Ad0 ONLINE 0 0 0
>>> c1t5000CCA22BF612C4d0 ONLINE 0 0 0
>>> .
>>> .
>>> .
>>>
>>> spares
>>> c1t5000CCA22BF5B9DEd0 AVAIL
>>>
>>> But nothing happened.
>>> OK, so I did it myself:
>>> # zpool replace -f pool_ripley c1t5000CCA22BEEF6A3d0 c1t5000CCA22BF5B9DEd0
>>> Resilvering started immediately.
>>>
>>> # zpool status -x
>>> pool: pool_ripley
>>> state: DEGRADED
>>> status: One or more devices could not be opened. Sufficient replicas exist for
>>> the pool to continue functioning in a degraded state.
>>> action: Attach the missing device and online it using 'zpool online'.
>>> see: http://illumos.org/msg/ZFS-8000-2Q
>>> scan: resilvered 1.53T in 3h12m with 0 errors on Sun Apr 6 17:48:51 2014
>>> config:
>>>
>>> NAME STATE READ WRITE CKSUM
>>> pool_ripley DEGRADED 0 0 0
>>> mirror-0 DEGRADED 0 0 0
>>> c1t5000CCA22BC16BC5d0 ONLINE 0 0 0
>>> spare-1 DEGRADED 0 0 0
>>> c1t5000CCA22BEEF6A3d0 UNAVAIL 0 0 0 cannot open
>>> c1t5000CCA22BF5B9DEd0 ONLINE 0 0 0
>>> mirror-1 ONLINE 0 0 0
>>> c1t5000CCA22BC8D31Ad0 ONLINE 0 0 0
>>> c1t5000CCA22BF612C4d0 ONLINE 0 0 0
>>> .
>>> .
>>> .
>>> spares
>>> c1t5000CCA22BF5B9DEd0 INUSE currently in use
>>>
>>> After resilvering I powered off, unplugged the broken HDD from
>>> case slot 1 and moved the spare from slot 21 to slot 1.
>>> The pool is still degraded, and I can't remove the broken HDD:
>>>
>>> # zpool remove pool_ripley c1t5000CCA22BEEF6A3d0
>>> cannot remove c1t5000CCA22BEEF6A3d0: only inactive hot spares,
>>> cache, top-level, or log devices can be removed
>>>
>>> What can I do to throw out the broken HDD, tell ZFS that the spare is
>>> now a member of mirror-0, and remove it from the spare list?
>>> Why didn't the spare device jump in automatically and resilver the
>>> pool?
>>>
>>> Thanks.
>>>
>
>
--
-------------------------------------------------------------------
Kevin Swab UNIX Systems Administrator
ACNS Colorado State University
Phone: (970)491-6572 Email: Kevin.Swab at ColoState.EDU
GPG Fingerprint: 7026 3F66 A970 67BD 6F17 8EB8 8A7D 142F 2392 791C