[OmniOS-discuss] ZFS Questions. Is a bug ?
Narayan Desai
narayan.desai at gmail.com
Mon Sep 2 02:07:36 UTC 2013
As far as we've been able to tell, zfs replace is a one-way street; once
you start the replace, there doesn't seem to be a way to cancel it until it
has completed.
Also, resilvers appear to start from scratch any time anything about the
pool changes. Do you have a drive that is flapping offline and coming back,
or something like that? Are you getting any messages in /var/adm/messages
about disk devices?
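Something along these lines should show per-device error counters and any
recent disk-related log entries (just a sketch; the grep patterns and tail
lengths are arbitrary, adjust to taste):

  # per-device soft/hard/transport error counters
  iostat -En

  # recent disk/transport complaints in the system log
  grep -i -e scsi -e sata -e retryable -e reset /var/adm/messages | tail -50

  # any error events logged by FMA
  fmdump -e | tail -20

A drive that keeps dropping off and re-attaching will usually show up in at
least one of those.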
Considering the dire appearance of that pool, you might also try boosting
resilver priority. We found the tunables described here:
http://my2ndhead.blogspot.com/2011/03/adjusting-zfs-resilvering-speed.html
to work well for improving overall resilver performance (at the cost of
latency on pending IO requests from clients), ymmv.
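For reference, the gist of that post (going from memory, so double-check
against the article itself) is to bump a couple of kernel tunables live
with mdb; roughly:

  # record the current values first
  echo "zfs_resilver_delay/D" | mdb -k
  echo "zfs_resilver_min_time_ms/D" | mdb -k

  # drop the per-I/O resilver delay and let each txg spend more time
  # resilvering (0t = decimal); the 5000 is just an example value
  echo "zfs_resilver_delay/W0t0" | mdb -kw
  echo "zfs_resilver_min_time_ms/W0t5000" | mdb -kw

These settings don't persist across a reboot, and client I/O latency will
suffer while they're in effect, so put back the values the first two
commands printed once the resilver finishes.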
-nld
On Fri, Aug 30, 2013 at 1:26 PM, "Daniel D. Gonçalves" <
daniel at dgnetwork.com.br> wrote:
> My ZFS pool has been resilvering in a loop for over a month: one resilver
> finishes and, a few minutes later, another one starts.
> The replace command never finishes either. I replaced a device three days
> ago and it still has not completed:
>   mirror-3       DEGRADED      0     0    28
>     c17t20d1     ONLINE        0     0    28
>     replacing-1  DEGRADED     28     0     0
>       c17t22d1   UNAVAIL       0     0     0  cannot open
>       c17t13d1   ONLINE        0     0    28  (resilvering)
>
>
> In the mirror below, I would like to remove all the devices with status
> UNAVAIL and redo the replace onto a good device, but the OFFLINE, REMOVE
> and DETACH commands do not work:
>   mirror-1       DEGRADED     28     0     0
>     c17t24d1     ONLINE        0     0    28  (resilvering)
>     replacing-1  UNAVAIL       0     0     0  insufficient replicas
>       c17t22d1   UNAVAIL       0     0     0  cannot open
>       c17t12d1   UNAVAIL       0     0     0  cannot open
>       c17t21d1   UNAVAIL       0     0     0  cannot open
>
> My entire POOL:
> pool: STORAGE01
> state: DEGRADED
> status: One or more devices is currently being resilvered. The pool will
> continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
> scan: resilver in progress since Fri Aug 30 14:42:42 2013
> 530G scanned out of 18.4T at 227M/s, 23h1m to go
> 62.1G resilvered, 2.80% done
> config:
>
> NAME             STATE     READ WRITE CKSUM
> STORAGE01        DEGRADED     14     0    16
>   mirror-0       ONLINE        0     0     0
>     c17t15d1     ONLINE        0     0     0
>     c17t19d1     ONLINE        0     0     0
>   mirror-1       DEGRADED     28     0     0
>     c17t24d1     ONLINE        0     0    28  (resilvering)
>     replacing-1  UNAVAIL       0     0     0  insufficient replicas
>       c17t22d1   UNAVAIL       0     0     0  cannot open
>       c17t12d1   UNAVAIL       0     0     0  cannot open
>       c17t21d1   UNAVAIL       0     0     0  cannot open
>   mirror-2       ONLINE        0     0     0
>     c17t18d1     ONLINE        0     0     0  (resilvering)
>     c17t17d1     ONLINE        0     0     0  (resilvering)
>   mirror-3       DEGRADED      0     0    32
>     c17t20d1     ONLINE        0     0    32
>     replacing-1  DEGRADED     32     0     0
>       c17t22d1   UNAVAIL       0     0     0  cannot open
>       c17t13d1   ONLINE        0     0    32  (resilvering)
>   mirror-5       ONLINE        0     0     0
>     c17t25d1     ONLINE        0     0     0
>     c17t27d1     ONLINE        0     0     0
>   mirror-6       ONLINE        0     0     0
>     c17t26d1     ONLINE        0     0     0
>     c17t28d1     ONLINE        0     0     0
>   mirror-7       ONLINE        0     0     0
>     c17t29d1     ONLINE        0     0     0
>     c17t31d1     ONLINE        0     0     0
>   mirror-8       ONLINE        0     0     0
>     c17t32d1     ONLINE        0     0     0
>     c17t30d1     ONLINE        0     0     0
>   mirror-9       ONLINE        0     0     0
>     c17t23d1     ONLINE        0     0     0
>     c17t14d1     ONLINE        0     0     0
> logs
>   mirror-4       ONLINE        0     0     0
>     c14t1d0      ONLINE        0     0     0
>     c14t3d0      ONLINE        0     0     0
> cache
>   c14t4d0        ONLINE        0     0     0
>
>
> I need urgent help to solve this. I believe it is a bug in ZFS.
>
> Thanks,
>
> Daniel
>
> On 22/08/2013 17:42, Saso Kiselkov wrote:
>
>> On 8/22/13 9:20 PM, "Daniel D. Gonçalves" wrote:
>>
>>> Thanks Saso,
>>>
>>> To stop the RESILVER, which device do I set to OFFLINE?
>>>
>> The one that says 'resilvering'. But beware that this means the pool
>> might not have full fault tolerance.
>>
>>> I do not know how the device "c17t33d1" ended up in
>>> MIRROR-11/REPLACING-1; how do I remove it from there?
>>>
>> If you can, let it run to completion before attempting any further
>> manipulation. The pool seems to be in quite an unhappy state anyway, so
>> better not compound the situation by doing more changes. Let the thing
>> resync back up, find the files that have the data errors in them ("zpool
>> status -v" I think), restore them or delete them and then post a new
>> "zpool status" to the list - then we'll see what can be done.
>>
>> Above all, be patient if you don't want to lose your data.
>>
>> Cheers,
>>
>