[OmniOS-discuss] issue importing zpool on S11.1 from omniOS LUNs
Stephan Budach
stephan.budach at jvm.de
Wed Jan 25 22:41:19 UTC 2017
Hi Richard,
On 25.01.17 at 20:27, Richard Elling wrote:
> Hi Stephan,
>
>> On Jan 25, 2017, at 5:54 AM, Stephan Budach <stephan.budach at JVM.DE> wrote:
>>
>> Hi guys,
>>
>> I have been trying to import a zpool, based on a 3-way mirror provided
>> by three omniOS boxes via iSCSI. This zpool had been working
>> flawlessly until some random reboot of the S11.1 host; since then,
>> S11.1 has been unable to import it.
>>
>> This zpool consists of three 108TB LUNs, each backed by a raidz2
>> zvol… yeah, I know, we shouldn't have done that in the first place,
>> but performance was not the primary goal here, as this is a
>> backup/archive pool.
>>
>> When issuing a zpool import, it says this:
>>
>> root@solaris11atest2:~# zpool import
>>   pool: vsmPool10
>>     id: 12653649504720395171
>>  state: DEGRADED
>> status: The pool was last accessed by another system.
>> action: The pool can be imported despite missing or damaged devices.  The
>>         fault tolerance of the pool may be compromised if imported.
>>    see: http://support.oracle.com/msg/ZFS-8000-EY
>> config:
>>
>>         vsmPool10                                  DEGRADED
>>           mirror-0                                 DEGRADED
>>             c0t600144F07A3506580000569398F60001d0  DEGRADED  corrupted data
>>             c0t600144F07A35066C00005693A0D90001d0  DEGRADED  corrupted data
>>             c0t600144F07A35001A00005693A2810001d0  DEGRADED  corrupted data
>>
>> device details:
>>
>>         c0t600144F07A3506580000569398F60001d0  DEGRADED  scrub/resilver needed
>>         status: ZFS detected errors on this device.
>>                 The device is missing some data that is recoverable.
>>
>>         c0t600144F07A35066C00005693A0D90001d0  DEGRADED  scrub/resilver needed
>>         status: ZFS detected errors on this device.
>>                 The device is missing some data that is recoverable.
>>
>>         c0t600144F07A35001A00005693A2810001d0  DEGRADED  scrub/resilver needed
>>         status: ZFS detected errors on this device.
>>                 The device is missing some data that is recoverable.
>>
>> However, when actually running zpool import -f vsmPool10, the system
>> starts to perform a lot of writes on the LUNs and iostat reports an
>> alarming increase in h/w errors:
>>
>> root@solaris11atest2:~# iostat -xeM 5
>>                  extended device statistics       ---- errors ---
>> device   r/s    w/s  Mr/s  Mw/s wait actv svc_t  %w  %b s/w h/w trn tot
>> sd0      0.0    0.0   0.0   0.0  0.0  0.0   0.0   0   0   0   0   0   0
>> sd1      0.0    0.0   0.0   0.0  0.0  0.0   0.0   0   0   0   0   0   0
>> sd2      0.0    0.0   0.0   0.0  0.0  0.0   0.0   0   0   0  71   0  71
>> sd3      0.0    0.0   0.0   0.0  0.0  0.0   0.0   0   0   0   0   0   0
>> sd4      0.0    0.0   0.0   0.0  0.0  0.0   0.0   0   0   0   0   0   0
>> sd5      0.0    0.0   0.0   0.0  0.0  0.0   0.0   0   0   0   0   0   0
>>                  extended device statistics       ---- errors ---
>> device   r/s    w/s  Mr/s  Mw/s wait actv svc_t  %w  %b s/w h/w trn tot
>> sd0     14.2  147.3   0.7   0.4  0.2  0.1   2.0   6   9   0   0   0   0
>> sd1     14.2    8.4   0.4   0.0  0.0  0.0   0.3   0   0   0   0   0   0
>> sd2      0.0    4.2   0.0   0.0  0.0  0.0   0.0   0   0   0  92   0  92
>> sd3    157.3   46.2   2.1   0.2  0.0  0.7   3.7   0  14   0  30   0  30
>> sd4    123.9   29.4   1.6   0.1  0.0  1.7  10.9   0  36   0  40   0  40
>> sd5    142.5   43.0   2.0   0.1  0.0  1.9  10.2   0  45   0  88   0  88
>>                  extended device statistics       ---- errors ---
>> device   r/s    w/s  Mr/s  Mw/s wait actv svc_t  %w  %b s/w h/w trn tot
>> sd0      0.0  234.5   0.0   0.6  0.2  0.1   1.4   6  10   0   0   0   0
>> sd1      0.0    0.0   0.0   0.0  0.0  0.0   0.0   0   0   0   0   0   0
>> sd2      0.0    0.0   0.0   0.0  0.0  0.0   0.0   0   0   0  92   0  92
>> sd3      3.6   64.0   0.0   0.5  0.0  4.3  63.2   0  63   0 235   0 235
>> sd4      3.0   67.0   0.0   0.6  0.0  4.2  60.5   0  68   0 298   0 298
>> sd5      4.2   59.6   0.0   0.4  0.0  5.2  81.0   0  72   0 406   0 406
>>                  extended device statistics       ---- errors ---
>> device   r/s    w/s  Mr/s  Mw/s wait actv svc_t  %w  %b s/w h/w trn tot
>> sd0      0.0  234.8   0.0   0.7  0.4  0.1   2.2  11  10   0   0   0   0
>> sd1      0.0    0.0   0.0   0.0  0.0  0.0   0.0   0   0   0   0   0   0
>> sd2      0.0    0.0   0.0   0.0  0.0  0.0   0.0   0   0   0  92   0  92
>> sd3      5.4   54.4   0.0   0.3  0.0  2.9  48.5   0  67   0 384   0 384
>> sd4      6.0   53.4   0.0   0.3  0.0  4.6  77.7   0  87   0 519   0 519
>> sd5      6.0   60.8   0.0   0.3  0.0  4.8  72.5   0  87   0 727   0 727
>
> h/w errors are a classification of other errors. The full error list
> is available from "iostat -E" and will be important in tracking this
> down.
>
> A better, more detailed analysis can be gleaned from the "fmdump -e"
> ereports that should be associated with each h/w error. However, there
> are dozens of possible causes for these, so we don't have enough info
> here yet to fully understand what happened.
> — richard
>
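For reference, I take the suggested commands to be roughly these (with
-n added for readable device names and -V for verbose ereports):

   # per-device error counters behind iostat's s/w, h/w and trn columns
   iostat -En

   # FMA error reports (ereports) recorded alongside those counters
   fmdump -eV
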
Well… I can't provide you with the output of fmdump -e just yet: due to
some fancy keyboard layout issue I currently cannot type a '-' on the
console, and I can't get in via ssh either (authentication succeeds,
but I never reach a shell, which may be due to the running zpool
import). I can confirm, however, that a plain fmdump shows nothing at
all. I could just reset the S11.1 host after removing the zpool.cache
file, so that the system won't try to import the zpool right away upon
restart, as sketched below…
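
A minimal sketch of that, assuming the cache file is in its default
S11 location:

   # move the cache aside so ZFS won't auto-import vsmPool10 at boot
   mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bak
   reboot
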
…plus I might get the chance to fix the keyboard layout after the
reboot, but that's another issue…
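
Once the box is back up, I would also retry the import read-only
before forcing another read-write attempt; if I understand the
readonly property correctly, that should avoid the flood of writes
seen above (the -R altroot is just a precaution to keep the mounts out
of the way):

   # forced, read-only import under an alternate root
   zpool import -o readonly=on -R /a -f vsmPool10
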
Thanks,
Stephan