[OmniOS-discuss] kernel panic - anon_decref
Saso Kiselkov
skiselkov.ml at gmail.com
Fri Nov 15 16:17:46 UTC 2013
On 11/15/13, 5:39 AM, wuffers wrote:
> So I'm adding VMware hosts (ESXi 5.5) to my OmniOS ZFS SAN, which is
> already hosting some volumes for our Windows 2012 Hyper-V
> infrastructure, running over SRP and InfiniBand. In VMware, I had
> uninstalled the default Mellanox 1.9.7 drivers and installed the older
> 1.6.1 drivers along with OFED 1.8.2. I had no issues adding the new
> initiator to the target group, and creating a new host group and view
> for the host - after which the volume automagically showed up as expected.
>
> I formatted using VMFS5, and started creating a VM, attaching an ISO and
> loading up Windows Server 2012 R2. Somewhere during the install, I had
> my first kernel panic and I had to reboot the SAN as it was during
> business hours (couldn't wait for the dump to finish). Later that night
> I reproduced the issue (just loading up VMs, and trying out a VMware
> converter job) and was able to get a proper dump (which is now sitting
> in my /var/crash/unknown, ~7GB).
>
> Screenshots:
> http://i.imgur.com/nGakKyS.png?1
> http://i.imgur.com/wIx0g6J.png?1
>
>
> TIME                           UUID                                 SUNW-MSG-ID
> Nov 14 2013 22:13:46.926077000 a4432472-983c-ca82-a231-d1b468a3a91a SUNOS-8000-KL
>
> TIME                 CLASS                                             ENA
> Nov 14 22:13:46.8830 ireport.os.sunos.panic.dump_available             0x0000000000000000
> Nov 14 22:12:33.1029 ireport.os.sunos.panic.dump_pending_on_device     0x0000000000000000
>
> nvlist version: 0
> version = 0x0
> class = list.suspect
> uuid = a4432472-983c-ca82-a231-d1b468a3a91a
> code = SUNOS-8000-KL
> diag-time = 1384485226 890408
> de = fmd:///module/software-diagnosis
> fault-list-sz = 0x1
> fault-list = (array of embedded nvlists)
> (start fault-list[0])
> nvlist version: 0
> version = 0x0
> class = defect.sunos.kernel.panic
> certainty = 0x64
> asru = sw:///:path=/var/crash/unknown/.a4432472-983c-ca82-a231-d1b468a3a91a
> resource = sw:///:path=/var/crash/unknown/.a4432472-983c-ca82-a231-d1b468a3a91a
> savecore-succcess = 1
> dump-dir = /var/crash/unknown
> dump-files = vmdump.0
> os-instance-uuid = a4432472-983c-ca82-a231-d1b468a3a91a
> panicstr = anon_decref: slot count 0
> panicstack = fffffffffbb2fa18 () | genunix:anon_free+74
> () | genunix:segvn_free+242 () | genunix:seg_free+30 () |
> genunix:segvn_unmap+cde () | genunix:as_free+e7 () | genunix:relvm+220
> () | genunix:proc_exit+454 () | genunix:exit+15 () | genunix:rexit+18 ()
> | unix:brand_sys_sysenter+1c9 () |
> crashtime = 1384482703
> panic-time = Thu Nov 14 21:31:43 2013 EST
> (end fault-list[0])
>
> fault-status = 0x1
> severity = Major
> __ttl = 0x1
> __tod = 0x5285916a 0x3732d048
>
> While getting the Hyper-V hosts up on IB and SRP I had issues with the
> Windows hosts but never with the SAN box, and they have now been running
> stable for 3+ months until the kernel panic today. I saw some other
> anon_decref bugs, but those were in 2007-2008 and have already been
> rolled into OmniOS. I'm pretty sure I was on the original r151006, and
> I have now updated to the latest r151006y in hopes it's already taken
> care of. I'll try other things to see if I can reproduce on the latest
> build.
>
> In the meantime, does anyone want to take a look at the dump?
So, just to clarify: is the SAN box itself also running inside VMware,
or is it on real hardware?
Cheers,
--
Saso