[OmniOS-discuss] kernel panic - anon_decref

Saso Kiselkov skiselkov.ml at gmail.com
Fri Nov 15 16:17:46 UTC 2013


On 11/15/13, 5:39 AM, wuffers wrote:
> So I'm adding VMware hosts (ESXi 5.5)  to my OmniOS ZFS SAN, which are
> already hosting some volumes for our Windows 2012 Hyper-V
> infrastructure, running over SRP and Infiniband. In VMware, I had
> uninstalled the default Mellanox 1.9.7 drivers and installed the older
> 1.6.1 drivers along with OFED 1.8.2. I had no issues adding the new
> initiator to the target group, and creating a new host group and view
> for the host - after which the volume automagically showed up as expected.
> 
> I formatted using VMFS5, and started creating a VM, attaching an ISO and
> loading up Windows Server 2012 R2. Somewhere during the install, I had
> my first kernel panic and I had to reboot the SAN as it was during
> business hours (couldn't wait for the dump to finish). Later that night
> I reproduced the issue (just loading up VMs, and trying out a VMware
> converter job) and was able to get a proper dump (which is now sitting
> in my /var/crash/unknown, ~7GB).
> 
> Screenshots:
> http://i.imgur.com/nGakKyS.png?1
> http://i.imgur.com/wIx0g6J.png?1
> 
> 
> TIME                           UUID                                 SUNW-MSG-ID
> Nov 14 2013 22:13:46.926077000 a4432472-983c-ca82-a231-d1b468a3a91a SUNOS-8000-KL
> 
>   TIME                 CLASS                                         ENA
>   Nov 14 22:13:46.8830 ireport.os.sunos.panic.dump_available         0x0000000000000000
>   Nov 14 22:12:33.1029 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000
> 
> nvlist version: 0
>         version = 0x0
>         class = list.suspect
>         uuid = a4432472-983c-ca82-a231-d1b468a3a91a
>         code = SUNOS-8000-KL
>         diag-time = 1384485226 890408
>         de = fmd:///module/software-diagnosis
>         fault-list-sz = 0x1
>         fault-list = (array of embedded nvlists)
>         (start fault-list[0])
>         nvlist version: 0
>                 version = 0x0
>                 class = defect.sunos.kernel.panic
>                 certainty = 0x64
>                 asru = sw:///:path=/var/crash/unknown/.a4432472-983c-ca82-a231-d1b468a3a91a
>                 resource = sw:///:path=/var/crash/unknown/.a4432472-983c-ca82-a231-d1b468a3a91a
>                 savecore-succcess = 1
>                 dump-dir = /var/crash/unknown
>                 dump-files = vmdump.0
>                 os-instance-uuid = a4432472-983c-ca82-a231-d1b468a3a91a
>                 panicstr = anon_decref: slot count 0
>                 panicstack = fffffffffbb2fa18 () | genunix:anon_free+74
> () | genunix:segvn_free+242 () | genunix:seg_free+30 () |
> genunix:segvn_unmap+cde () | genunix:as_free+e7 () | genunix:relvm+220
> () | genunix:proc_exit+454 () | genunix:exit+15 () | genunix:rexit+18 ()
> | unix:brand_sys_sysenter+1c9 () |
>                 crashtime = 1384482703
>                 panic-time = Thu Nov 14 21:31:43 2013 EST
>         (end fault-list[0])
> 
>         fault-status = 0x1
>         severity = Major
>         __ttl = 0x1
>         __tod = 0x5285916a 0x3732d048
> 
> While getting the Hyper-V hosts up on IB and SRP I had issues with the
> Windows hosts but never with the SAN box, and they have now been running
> stable for 3+ months until the kernel panic today. I saw some other
> anon_decref bugs, but those were in 2007-2008 and have already been
> rolled into OmniOS. I'm pretty sure I was on the original r151006, and
> now am on the latest r151006y, in hopes it's already taken care of. I'll
> try other things to see if I can reproduce on the latest build.
> 
> In the meantime, does anyone want to take a look at the dump?

So, just to clarify: is the SAN box itself also running inside VMware,
or is it on real hardware?
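
For anyone reading the quoted fault report, the wrapped panicstack line
and the raw epoch fields are easier to follow once split out. Below is a
minimal sketch in Python that only reformats values copied from the
report above. It is not a substitute for opening the dump in mdb, and
the helper names (split_frames, to_utc) are made up for illustration.

```python
from datetime import datetime, timezone

# panicstack value copied from the quoted report, with the mail line
# wrapping undone.
PANICSTACK = (
    "fffffffffbb2fa18 () | genunix:anon_free+74 () | "
    "genunix:segvn_free+242 () | genunix:seg_free+30 () | "
    "genunix:segvn_unmap+cde () | genunix:as_free+e7 () | "
    "genunix:relvm+220 () | genunix:proc_exit+454 () | "
    "genunix:exit+15 () | genunix:rexit+18 () | "
    "unix:brand_sys_sysenter+1c9 () |"
)


def split_frames(panicstack):
    """Split a '|'-separated panicstack string into one entry per frame."""
    frames = []
    for part in panicstack.split("|"):
        frame = part.replace("()", "").strip()
        if frame:  # skip the empty piece after the trailing '|'
            frames.append(frame)
    return frames


def to_utc(epoch_seconds):
    """Convert an epoch-seconds field (e.g. crashtime) to a UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


if __name__ == "__main__":
    # Frames are listed in the order they appear in the report.
    for depth, frame in enumerate(split_frames(PANICSTACK)):
        print(f"{depth:2d}  {frame}")

    # crashtime and the first word of __tod, both taken from the report.
    print("crashtime:", to_utc(1384482703).isoformat())
    print("__tod[0]: ", to_utc(int("5285916a", 16)).isoformat())
```

The first word of __tod decodes to the same value as the seconds part of
diag-time, and crashtime lines up with the reported panic-time once
shifted from UTC to EST.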

Cheers,
-- 
Saso


