[OmniOS-discuss] Hung ZFS Pool
Brian Hechinger
wonko at 4amlunch.net
Wed Dec 9 15:20:15 UTC 2015
I cannot ^C out of the touch.
wonko at basket1:/export/home/wonko$ ps -ef | grep touch
root 2459 2447 0 08:12:09 ? 0:00 touch /zoom/hi
root 2050 2049 0 Dec 07 ? 0:00 touch hi
root 2049 1 0 Dec 07 ? 0:00 sudo touch hi
Also, kill -9 doesn’t touch them.
the only thing in messages is:
Dec 7 14:31:56 basket1 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-HC, TYPE: Error, VER: 1, SEVERITY: Major
Dec 7 14:31:56 basket1 EVENT-TIME: Mon Dec 7 14:31:56 EST 2015
Dec 7 14:31:56 basket1 PLATFORM: X8DTL, CSN: 1234567890, HOSTNAME: basket1
Dec 7 14:31:56 basket1 SOURCE: zfs-diagnosis, REV: 1.0
Dec 7 14:31:56 basket1 EVENT-ID: 585f9fa2-4a84-4184-8c87-c2f9c600e1a1
Dec 7 14:31:56 basket1 DESC: The ZFS pool has experienced currently unrecoverable I/O
Dec 7 14:31:56 basket1 failures. Refer to http://illumos.org/msg/ZFS-8000-HC for more information.
Dec 7 14:31:56 basket1 AUTO-RESPONSE: No automated response will be taken.
Dec 7 14:31:56 basket1 IMPACT: Read and write I/Os cannot be serviced.
Dec 7 14:31:56 basket1 REC-ACTION: Make sure the affected devices are connected, then run
Dec 7 14:31:56 basket1 'zpool clear'.
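For anyone who wants to pull the fields out of an fmd report like the one above, for example to key alerts off the message ID, here is a minimal Python sketch. The function name `parse_fmd` and the regular expressions are assumptions made for illustration, not part of fmd or any OmniOS tool. The sample holds only the single-line fields from the excerpt; wrapped continuation lines such as the second line of DESC are simply ignored.

```python
import re

# Single-line fields copied from the messages excerpt above.
SAMPLE = """\
Dec 7 14:31:56 basket1 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-HC, TYPE: Error, VER: 1, SEVERITY: Major
Dec 7 14:31:56 basket1 EVENT-TIME: Mon Dec 7 14:31:56 EST 2015
Dec 7 14:31:56 basket1 SOURCE: zfs-diagnosis, REV: 1.0
Dec 7 14:31:56 basket1 EVENT-ID: 585f9fa2-4a84-4184-8c87-c2f9c600e1a1
Dec 7 14:31:56 basket1 IMPACT: Read and write I/Os cannot be serviced.
"""

# Syslog prefix: month, day, time, hostname, then an optional "fmd: [ID ...]" tag.
PREFIX = re.compile(r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+(?:fmd: \[[^\]]*\]\s+)?")

# "KEY: value" pairs; a value runs until the next ", KEY: " or the end of the line.
PAIR = re.compile(r"([A-Z][A-Z-]*): (.*?)(?=, [A-Z][A-Z-]*: |$)")


def parse_fmd(text):
    """Return a dict of the upper-case fields found in an fmd syslog report."""
    fields = {}
    for line in text.splitlines():
        body = PREFIX.sub("", line)
        for key, value in PAIR.findall(body):
            fields[key] = value
    return fields


if __name__ == "__main__":
    for key, value in parse_fmd(SAMPLE).items():
        print(f"{key}: {value}")
```

The message ID (here ZFS-8000-HC) is the part worth matching on, since it is what the illumos.org/msg URL in the report is built from.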
I can definitely share a kernel coredump, that’s not a problem. Just need to schedule a time to shut down all the VMs first.
Maybe later tonight.
-brian
> On Dec 9, 2015, at 10:16 AM, Dan McDonald <danmcd at omniti.com> wrote:
>
>
>> On Dec 9, 2015, at 8:14 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
>>
>> So read access appears to be ok. Writes are totally boned, however. That touch just hangs forever.
>>
>> So what do I need to do to provide you all with the information you need to diagnose this?
>
> Do you literally have a touch process hanging right now? Or is it something you can ^C out of?
>
> Does anything stand out in /var/adm/messages? Maybe the kernel is complaining about something there.
>
> My final inclination is heavy-handed:
>
> - Make sure you have at least one process stuck on writing to that filesystem.
>
> - "reboot -d" and take a kernel coredump
>
> Unless you have sensitive information, a kernel coredump you can share would be the best thing to do.
>
>
> Dan
>
> p.s. I'm at the Dr. the rest of the day starting in 90 mins, pardon any latency.