[OmniOS-discuss] Pool degraded

Alexander Lesle groups at tierarzt-mueller.de
Fri Apr 11 08:39:05 UTC 2014


Hello Kevin Swab and List,

Thanks, Kevin, your contribution helps me.

It would be nice if someone from Illumos or OmniOS could confirm this
and update the zpool(1m) man page.
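For the record, here is the sequence that resolved it for me, as a
sketch (pool and device names are from my status output below):

```shell
# The failed disk was already replaced by the named hot spare:
#   zpool replace -f pool_ripley c1t5000CCA22BEEF6A3d0 c1t5000CCA22BF5B9DEd0
# Once the resilver finishes, detach the failed device; ZFS then
# promotes the spare to a permanent member of mirror-0 and drops it
# from the spares list:
zpool detach pool_ripley c1t5000CCA22BEEF6A3d0
```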

On April, 08 2014, 22:22 <Kevin Swab> wrote in [1]:

> The man page seems misleading or incomplete on the subject of
> "autoreplace" and spares.  Setting 'autoreplace=on' should cause your
> hot spare to kick in during a drive failure - with over 1100 spindles
> running ZFS here, we've had the "opportunity" to test it many times! ;-)

> I couldn't find any authoritative references for this, but here are a
> few unauthoritative ones:

> http://my.safaribooksonline.com/book/operating-systems-and-server-administration/solaris/9780137049639/managing-storage-pools/ch02lev1sec7

> http://stanley-huang.blogspot.com/2009/09/how-to-set-autoreplace-in-zfs-pool.html

> http://www.datadisk.co.uk/html_docs/sun/sun_zfs_cs.htm

> Hope this helps,
> Kevin

> On 04/08/2014 01:09 PM, Alexander Lesle wrote:
>> Hello Kevin Swab and List,
>> 
>> On April, 08 2014, 20:17 <Kevin Swab> wrote in [1]:
>> 
>>> Instead of a 'zpool remove ...', you want to do a 'zpool detach ...' to
>>> get rid of the old device.
>> 
>> That's it:
>> 'zpool detach ...' "removes" the broken device from the pool.
>> 
>>> If you turn the 'autoreplace' property on
>>> for the pool, the spare will automatically kick in the next time a drive
>>> fails...
>> 
>> Are you sure? The zpool man page tells me otherwise:
>> 
>> ,-----[ man zpool ]-----
>> |
>> | autoreplace=on | off
>> | 
>> |          Controls automatic device replacement. If set to  "off",
>> |          device  replacement must be initiated by the administra-
>> |          tor by using the "zpool  replace"  command.  If  set  to
>> |          "on",  any  new device, found in the same physical loca-
>> |          tion as a device that previously belonged to  the  pool,
>> |          is  automatically  formatted  and  replaced. The default
>> |          behavior is "off". This property can also be referred to
>> |          by its shortened column name, "replace".
>> |
>> `-------------------
>> 
>> I understand it this way: when I pull out a device and put a new
>> device into the _same_ case slot, ZFS resilvers and drops the old
>> one automatically.
>> When the property is off, I have to use the command zpool replace ... ...
>> which is what I did.
>> 
>> But in my case, the spare device was already in the case and _named
>> for_ this pool.
>> So the 'Hot Spares' section says:
>> ,-----[ man zpool ]-----
>> |
>> | ZFS allows devices to  be  associated  with  pools  as  "hot
>> | spares".  These  devices  are not actively used in the pool,
>> | but  when  an  active  device  fails,  it  is  automatically
>> | replaced  by  a hot spare.
>> |
>> `-------------------
>> 
>> Or have I misunderstood?
>> 
>>> On 04/08/2014 12:13 PM, Alexander Lesle wrote:
>>>> Hello All,
>>>>
>>>> I have a pool with mirrors and one spare.
>>>> Now my pool is degraded, and I thought that OmniOS/ZFS would
>>>> activate the spare itself and start a resilver.
>>>>
>>>> # zpool status -x
>>>>   pool: pool_ripley
>>>>  state: DEGRADED
>>>> status: One or more devices could not be opened.  Sufficient replicas exist for
>>>>         the pool to continue functioning in a degraded state.
>>>> action: Attach the missing device and online it using 'zpool online'.
>>>>    see: http://illumos.org/msg/ZFS-8000-2Q
>>>>   scan: resilvered 84K in 0h0m with 0 errors on Sun Mar 23 15:09:08 2014
>>>> config:
>>>>
>>>>         NAME                       STATE     READ WRITE CKSUM
>>>>         pool_ripley                DEGRADED     0     0     0
>>>>           mirror-0                 DEGRADED     0     0     0
>>>>             c1t5000CCA22BC16BC5d0  ONLINE       0     0     0
>>>>             c1t5000CCA22BEEF6A3d0  UNAVAIL      0     0     0  cannot open
>>>>           mirror-1                 ONLINE       0     0     0
>>>>             c1t5000CCA22BC8D31Ad0  ONLINE       0     0     0
>>>>             c1t5000CCA22BF612C4d0  ONLINE       0     0     0
>>>>           .
>>>>           .
>>>>           .
>>>>
>>>>         spares
>>>>           c1t5000CCA22BF5B9DEd0    AVAIL
>>>>
>>>> But nothing happened.
>>>> OK, then I did it myself:
>>>> # zpool replace -f pool_ripley c1t5000CCA22BEEF6A3d0 c1t5000CCA22BF5B9DEd0
>>>> Resilvering started immediately.
>>>>
>>>> # zpool status -x
>>>>   pool: pool_ripley
>>>>  state: DEGRADED
>>>> status: One or more devices could not be opened.  Sufficient replicas exist for
>>>>         the pool to continue functioning in a degraded state.
>>>> action: Attach the missing device and online it using 'zpool online'.
>>>>    see: http://illumos.org/msg/ZFS-8000-2Q
>>>>   scan: resilvered 1.53T in 3h12m with 0 errors on Sun Apr  6 17:48:51 2014
>>>> config:
>>>>
>>>>         NAME                         STATE     READ WRITE CKSUM
>>>>         pool_ripley                  DEGRADED     0     0     0
>>>>           mirror-0                   DEGRADED     0     0     0
>>>>             c1t5000CCA22BC16BC5d0    ONLINE       0     0     0
>>>>             spare-1                  DEGRADED     0     0     0
>>>>               c1t5000CCA22BEEF6A3d0  UNAVAIL      0     0     0  cannot open
>>>>               c1t5000CCA22BF5B9DEd0  ONLINE       0     0     0
>>>>           mirror-1                   ONLINE       0     0     0
>>>>             c1t5000CCA22BC8D31Ad0    ONLINE       0     0     0
>>>>             c1t5000CCA22BF612C4d0    ONLINE       0     0     0
>>>>          .
>>>>          .
>>>>          .
>>>>         spares
>>>>           c1t5000CCA22BF5B9DEd0      INUSE     currently in use
>>>>
>>>> After the resilver I powered off, unplugged the broken HDD from
>>>> case slot 1, and moved the spare from slot 21 to slot 1.
>>>> The pool is still degraded, and I can't remove the broken HDD:
>>>>
>>>> # zpool remove pool_ripley c1t5000CCA22BEEF6A3d0
>>>> cannot remove c1t5000CCA22BEEF6A3d0: only inactive hot spares,
>>>> cache, top-level, or log devices can be removed
>>>>
>>>> What can I do to throw out the broken HDD, tell ZFS that the spare
>>>> is now a member of mirror-0, and remove it from the spare list?
>>>> Why doesn't the spare device kick in automatically and resilver
>>>> the pool?
>>>>
>>>> Thanks.
>>>>
>> 
>> 
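Based on Kevin's experience above (the man page text reads narrower
than this), enabling the autoreplace property should let a named spare
kick in on the next drive failure. A hedged sketch, using my pool name:

```shell
# Check the current setting:
zpool get autoreplace pool_ripley

# Enable it; per Kevin's report, a configured hot spare should then
# be attached automatically when a drive fails:
zpool set autoreplace=on pool_ripley
```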


-- 
Best Regards
Alexander
April, 11 2014
........
[1] mid:53445A96.8090902 at ColoState.EDU
........


