[OmniOS-discuss] kernel panic "kernel heap corruption detected" when creating zero eager disks

Thu Mar 26 21:15:39 UTC 2015

On Thu, Mar 26, 2015 at 1:05 PM, Dan McDonald <danmcd at omniti.com> wrote:

>
> WRITE_SAME is one of the four VAAI primitives.  Nexenta wrote this code
> for NS, and upstreamed two of them:
>
> WRITE_SAME is "hardware assisted erase".
>
> UNMAP is "hardware assisted freeing".
>
> Those are in upstream illumos.
>
> ATS is atomic-test-and-set or "hardware assisted fine-grained locking".
>
> XCOPY is "hardware assisted copying".
>
> These are in NexentaStor, and after being held back, were open-sourced,
> but not yet upstreamed.
>

Ahh, VAAI. Looking back at some prior discussions on this list, I suspect
this is a bigger bite to chew (although I'm sure many are anxiously
awaiting its upstreaming). I'm guessing Microsoft's ODX will also be
supported, since I understand that it is just an XCOPY. I see that FreeNAS
now has support for both VAAI and ODX. Are they porting code from the
various illumos distros (including the referenced Nexenta work on VAAI),
or is it their own implementation?

After some more reading to answer my own questions, I came across this
VMware blog post (
http://blogs.vmware.com/vsphere/2012/06/low-level-vaai-behaviour.html):

"The following provisioning tasks are accelerated by the use of the WRITE
SAME command:

Cloning operations for eagerzeroedthick target disks.
Allocating new file blocks for thin provisioned virtual disks.
Initializing previous unwritten file blocks for zerothick virtual disks."

I don't seem to have issues allocating smaller amounts of space, so I
suspect that using thin or lazy zero will work.

Secondly, it *might* just be the vSphere fat client. I found another
VMware KB (
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2058287)
which states that I cannot make a disk larger than 4TB, yet that
contradicts this properties dialog:

http://i.imgur.com/f9liqpR.png
(says maximum file size of 2TB in vSphere fat client)

versus:
http://i.imgur.com/6Ya3oH4.png
(says maximum file size 64TB in the vSphere web client)

The KB goes on to state, "Checking the size of the newly created or
expanded VMDK, you find that it is 4 TB." That is untrue in my case,
because the disk was allocated at 10TB and is using all of it. I don't
know how much to trust that KB, as it seems contradictory. Still, none of
this should cause a kernel panic like the one I hit.

Thirdly, it appears I can disable any of the VAAI primitives in the host
configuration, if all else fails (since we've determined that it is likely
caused by WRITE_SAME). Good read on this via the VAAI FAQ here (which shows
you how to check the properties via the ESX CLI):
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1021976
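
For reference, here is a rough sketch of how the per-primitive settings
could be checked or toggled from the ESX CLI. It only builds the command
strings; it does not run anything. The option paths and the esxcli syntax
(ESXi 5.x style) are my assumptions from VMware's VAAI documentation, not
something confirmed in this thread, so verify them on your host first.
UNMAP is left out because I am less sure which option governs it. The
helper names (show_command, set_command) are made up for illustration.

```python
# Mapping of VAAI primitives to ESXi advanced option paths.
# Assumption: these paths come from VMware's VAAI documentation,
# not from this thread; verify them on your host before use.
VAAI_OPTIONS = {
    "WRITE_SAME": "/DataMover/HardwareAcceleratedInit",
    "XCOPY": "/DataMover/HardwareAcceleratedMove",
    "ATS": "/VMFS3/HardwareAcceleratedLocking",
}


def show_command(primitive):
    """Return the esxcli command that displays the option's current value."""
    return (
        "esxcli system settings advanced list "
        f"--option {VAAI_OPTIONS[primitive]}"
    )


def set_command(primitive, enabled):
    """Return the esxcli command that enables (1) or disables (0) a primitive."""
    value = 1 if enabled else 0
    return (
        "esxcli system settings advanced set "
        f"--int-value {value} --option {VAAI_OPTIONS[primitive]}"
    )


if __name__ == "__main__":
    # Commands one might run on the host to inspect, then disable, WRITE_SAME.
    print(show_command("WRITE_SAME"))
    print(set_command("WRITE_SAME", enabled=False))
```

Keeping the mapping in one table makes it harder to disable the wrong
primitive by accident, since each setting name is tied to the SCSI command
it offloads.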

So here's what I will attempt to test:
- Create thin vmdk @ 10TB with vSphere fat client
- Create lazy zeroed vmdk @ 10TB with vSphere fat client
- Create eager zeroed vmdk @ 10TB with vSphere web client
- Create thin vmdk @ 10TB with vSphere web client
- Create lazy zeroed vmdk @ 10TB with vSphere web client

So it seems I do have alternatives
(disabling DataMover.HardwareAcceleratedInit, the setting that governs
WRITE_SAME, as a last resort; DataMover.HardwareAcceleratedMove governs
XCOPY instead).