[OmniOS-discuss] device probe related command timeouts
John Barfield
john.barfield at bissinc.com
Wed Jan 4 20:57:59 UTC 2017
Hi Joshua,
As requested:
root@PRD-GIP-cpls-san1:/export/home/jbarfield# pstack 18919
18919: sudo diskinfo
fee8e665 pollsys (8047b50, 2, 0, 0)
fee264af pselect (6, 8089c30, 8089c50, fef06320, 0, 0) + 1bf
fee267b8 select (6, 8089c30, 8089c50, 0, 0, 0) + 8e
08055f64 sudo_execute (807c960, 8047ce8, 41e, c0) + 4ba
080606f8 run_command (807c960, 807c9c0, 8047d88, 805dff9) + 4c
0805e03f main (8047d7c, fef076a8, 8047db4, 8054cc3, 2, 8047dc0) + 681
08054cc3 _start (2, 8047e7c, 8047e81, 0, 8047e8a, 8047e95) + 83
root@PRD-GIP-cpls-san1:/export/home/jbarfield# mdb -k
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc pcplusmp scsi_vhci zfs sata sd ip hook neti sockfs arp usba uhci stmf stmf_sbd md lofs mpt_sas random idm ipc nfs ptm crypto cpc kvm ufs logindmux nsmb smbsrv ]
> 0t18919::pid2proc | ::ps -f
S PID PPID PGID SID UID FLAGS ADDR NAME
R 18919 18022 18919 18022 0 0x5a016000 ffffff24c389d000 sudo diskinfo
> 0t18919::pid2proc | ::walk thread | ::findstack -v
stack pointer for thread ffffff24c5bf4500: ffffff010a032c50
[ ffffff010a032c50 _resume_from_idle+0xf4() ]
ffffff010a032c80 swtch+0x141()
ffffff010a032d10 cv_wait_sig_swap_core+0x1b9(ffffff26a30f47fa, ffffff26a30f47c0, 0)
ffffff010a032d30 cv_wait_sig_swap+0x17(ffffff26a30f47fa, ffffff26a30f47c0)
ffffff010a032d60 cv_timedwait_sig_hrtime+0x35(ffffff26a30f47fa, ffffff26a30f47c0, ffffffffffffffff)
ffffff010a032e20 poll_common+0x554(8047b50, 2, 0, 0)
ffffff010a032ec0 pollsys+0xe7(8047b50, 2, 0, 0)
ffffff010a032f10 _sys_sysenter_post_swapgs+0x149()
>
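For anyone repeating these steps on several stuck processes, the two captures can be scripted. The following is a minimal sketch, not a tested tool: the function names are made up for illustration, it assumes pstack, pgrep and mdb are on the PATH and that it runs as root, and it relies on mdb reading dcmds from standard input. Since 18919 above is the sudo wrapper, which appears to be just waiting in pollsys, the sketch also walks the direct children (found with pgrep -P) in case the diskinfo child is the process that is actually stuck.

```shell
#!/bin/sh
# Sketch: capture user and kernel stacks for a pid and its direct children.
# Function names are hypothetical; pstack, pgrep and mdb are the stock tools.

# Print the mdb dcmd pipeline for one pid.
# The 0t prefix tells mdb the number is decimal.
kernel_stack_dcmd() {
    case "$1" in
        ''|*[!0-9]*) echo "not a pid: $1" >&2; return 1 ;;
    esac
    printf '0t%s::pid2proc | ::walk thread | ::findstack -v\n' "$1"
}

# Dump the userland stack (pstack) and the in-kernel stacks (mdb -k) for one pid.
dump_stacks() {
    echo "=== pid $1 ==="
    pstack "$1"
    kernel_stack_dcmd "$1" | mdb -k
}

# Dump the given pid, then every direct child reported by pgrep -P.
dump_with_children() {
    dump_stacks "$1"
    for child in $(pgrep -P "$1"); do
        dump_stacks "$child"
    done
}
```

After sourcing the file in a root shell, "dump_with_children 18919" would print both stacks for the sudo process and then for each of its children.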
On 1/4/17, 2:52 PM, "Joshua M. Clulow" <josh at sysmgr.org> wrote:
On 4 January 2017 at 12:29, John Barfield <john.barfield at bissinc.com> wrote:
> I’ve got a SAN that seems to time out on any hardware-probing command such as “format” or “diskinfo”, although prtconf seems to work.
>
> Does anyone happen to have a DTrace one-liner or maybe a kstat command I can run to see why they’re hanging and what they’re hanging on?
I would start by running "pstack" with the pid of one of the stuck
processes. That will give you the part of the user program which is
stuck. Then, I would get the in-kernel state of the stuck threads;
e.g., looking at my bash process:
asgard # echo $$
45435
asgard # ps -fp 45435
UID PID PPID C STIME TTY TIME CMD
root 45435 45433 0 20:47:17 pts/3 0:00 -bash
asgard # mdb -k
Loading modules: [ unix genunix specfs ... ]
> 0t45435::pid2proc | ::ps -f
S PID PPID PGID SID UID FLAGS ADDR NAME
R 45435 45433 45435 45435 0 0x4a014000 ffffff1b14d33048 -bash
> 0t45435::pid2proc | ::walk thread | ::findstack -v
stack pointer for thread ffffff03f776c080: ffffff0011b57c10
[ ffffff0011b57c10 _resume_from_idle+0x112() ]
ffffff0011b57c40 swtch+0x141()
ffffff0011b57cd0 cv_wait_sig_swap_core+0x1b9(ffffff1b14d33108, ...)
ffffff0011b57cf0 cv_wait_sig_swap+0x17(ffffff1b14d33108, ...)
ffffff0011b57da0 waitid+0x315(7, 0, ffffff0011b57e30, f)
ffffff0011b57eb0 waitsys32+0x36(7, 0, 8047750, f)
ffffff0011b57f10 sys_syscall32+0x123()
That might tell us where in the storage subsystem you're getting stuck.
Cheers.
--
Joshua M. Clulow
UNIX Admin/Developer
http://blog.sysmgr.org