[OmniOS-discuss] ZFS crash/reboot loop

Sun Jul 12 19:21:07 UTC 2015

First action:
If you can mount the pool read-only, update your backup.

Then:
I would expect a single bad disk to be the cause of the problem, 
triggered by a write command. I would first check the system and fault 
logs or the SMART values for hints about a bad disk. If there is a 
suspicious disk, remove it and retry a regular import.
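
A rough sketch of that first check, assuming the usual illumos
`iostat -En` summary line ("c1t0d0 Soft Errors: 0 Hard Errors: 0
Transport Errors: 0"). The function name suspect_disks is only for
illustration, and the field positions are an assumption about that
output format:

```shell
#!/bin/sh
# Sketch: print the device name of every disk whose `iostat -En`
# summary line reports hard or transport errors.
# Assumed line layout (fields 7 and 10 are the two counters):
#   c1t0d0  Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
suspect_disks() {
    awk '/Soft Errors:/ && ($7 > 0 || $10 > 0) { print $1 }'
}

# Typical use on the affected host (not executed here):
#   iostat -En | suspect_disks
#   fmadm faulty     # faults already diagnosed by the fault manager
#   fmdump -e        # error reports logged by the fault manager
```

Soft errors are left out on purpose because they are often non-zero on
healthy disks; adjust the condition if you want to see them as well.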

If there is no hint:
The next thing I would try is a pool export. Then create a script that 
imports the pool and immediately cancels the scrub (hoping that the 
cancel is faster than the crash). Then check the logs during some pool 
activity.
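
Such a script could look roughly like this. It is a minimal sketch: the
function name import_and_stop_scrub is made up, and it assumes the pool
was exported before and is imported read-write, since the scrub
reportedly cannot be cancelled on a read-only pool:

```shell
#!/bin/sh
# Sketch: import a pool and cancel its scrub right away, so the
# cancel has a chance to land before the scrub triggers the crash.
import_and_stop_scrub() {
    pool=$1
    # Regular (read-write) import; give up if it fails.
    zpool import "$pool" || return 1
    # Stop the scrub that restarts automatically after import.
    zpool scrub -s "$pool"
}

# Usage after a clean `zpool export poolname` (not executed here):
#   import_and_stop_scrub poolname
```

Keeping both commands in one function avoids the delay of typing the
second one by hand after the import returns.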

If this does not help, I would remove all data disks and boot up.
Then hot-plug disk by disk, checking that each one is detected properly, 
and check the logs. Your pool remains offline until enough disks come 
back. Adding disk by disk and checking the logs should help to find a 
bad disk that triggers the crash.
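
The per-disk check can be sketched as a small helper that you run after
every hot-plug. The name check_after_hotplug and the choice of 20 log
lines are arbitrary, and the exact output depends on your controller
and release:

```shell
#!/bin/sh
# Sketch: run once after each disk is hot-plugged to see whether the
# disk was detected and whether new errors were logged.
check_after_hotplug() {
    echo "== attachment points =="
    cfgadm -al            # is the new disk listed and configured?
    echo "== importable pools =="
    zpool import          # no pool name: only lists pools and their state
    echo "== recent error reports =="
    fmdump -e | tail -20  # newest error reports from the fault manager
}

# Usage (not executed here):
#   check_after_hotplug
```

Running `zpool import` without a pool name does not import anything, so
the pool stays offline while you add disks.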

The next option is to try a pool import with one disk left out each 
time, rotating through the disks. As long as there are no writes, 
missing disks are not a problem for ZFS (you may need to clear errors).

Last option:
Try the import on another server (to rule out a mainboard, power, HBA 
or backplane problem). Remove all disks and run a nondestructive or 
SMART test on another machine.

Gea

On 12.07.2015 20:43, Derek Yarnell wrote:
>> The on-going scrub automatically restarts, apparently even in read-only
>> mode.  You should 'zpool scrub -s poolname' ASAP after boot (if you can)
>> to stop the ongoing scrub.
> We have tried to stop the scrub but it seems you can not cancel a scrub
> when the pool is mounted readonly.
>