[OmniOS-discuss] ZPOOL bug after upgrade to r151020

John Barfield john.barfield at bissinc.com
Mon Apr 10 02:27:40 UTC 2017


Thank you Dan.

Do you happen to have, or know the location of, a process document for building only ZFS?

I've rebuilt only NFS from illumos-gate in the past to resolve a bug, but I'm wondering how I would build and install only ZFS (if it's even possible).

There are two bugs affecting us at two different customer sites whose fixes didn't get into r151020, and I'm not sure that we can make it until r151022 is released.

Thanks for any advice

John Barfield

On Apr 9, 2017, at 6:55 PM, Dan McDonald <danmcd at omniti.com> wrote:


On Apr 7, 2017, at 8:26 PM, John Barfield <john.barfield at bissinc.com> wrote:

Greetings,

I just want to report that after a clean install of r151020 I found a bug whereby importing an older zpool from r151012 and running zpool upgrade causes an SSD cache device's size to be reported incorrectly (only 1 out of 4 devices in this instance).

The cache device size is 93 GB, but arcstat reported it to be 680 GB.

I confirmed this by monitoring zpool iostat -v, which reported the same size.
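
A check like the one described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the size strings are typed in by hand rather than parsed from real arcstat or zpool iostat output, the helper names and the 10% tolerance are hypothetical, and sizes are assumed to carry a single binary-unit suffix such as G or T.

```python
# Binary-unit suffixes, assumed to match the way sizes were reported.
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40, "P": 2**50}


def parse_size(text):
    """Convert a string such as '93G' or '1.5T' to a number of bytes."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def size_mismatch(reported, physical, tolerance=0.10):
    """Return True if reported differs from physical by more than tolerance."""
    rep = parse_size(reported)
    phys = parse_size(physical)
    return abs(rep - phys) > tolerance * phys


# Values from this report: ZFS showed 680G for a 93G device.
print(size_mismatch("680G", "93G"))  # True
print(size_mismatch("93G", "93G"))   # False
```

The physical size would come from a tool that reports the device correctly, such as format or diskinfo in this case.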

We've had a lot of weird I/O lockups that bring all of our NFS mounts to a screeching halt, and this was the only thing I could find to be out of the ordinary on the system. (The lockups are how I found the issue; we didn't notice it until a month after.)

CPU averaged about 1%, 20% of RAM was free, and no runaway processes were waiting on I/O. The problem was completely invisible, at least in my testing using several DTrace scripts from the net.

I can only assume that the incorrect size reporting caused the zpool to fill the cache drive up beyond its physical capacity during periods of heavy load.

I removed all cache devices and then added them back to the zpool. After that, all disks reported correctly again. format/diskinfo always reported the correct size, so the problem was specific to ZFS.
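
The remove-and-re-add workaround described above can be sketched as a small Python script. This is a sketch under assumptions, not the exact procedure that was run: the pool name tank and the device names are hypothetical placeholders, and the script defaults to a dry run that only prints the zpool remove and zpool add ... cache commands instead of executing them.

```python
import subprocess


def cache_readd_commands(pool, cache_devices):
    """Build the zpool commands that drop and then re-add each cache device."""
    commands = []
    # Remove every cache device from the pool first ...
    for dev in cache_devices:
        commands.append(["zpool", "remove", pool, dev])
    # ... then add each one back as a cache (L2ARC) device.
    for dev in cache_devices:
        commands.append(["zpool", "add", pool, "cache", dev])
    return commands


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical pool and device names; replace with real ones.
    run(cache_readd_commands("tank", ["c1t0d0", "c1t1d0"]))
```

Keeping the dry run as the default lets the command list be reviewed before anything touches a production pool.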

We're monitoring the NAS closely to see if the issues occur again.

The only thing I could find that might address the symptoms you see is this:

   https://illumos.org/issues/7504

Which didn't make it upstream in time to hit r151020.

You should forward this on to the illumos ZFS developers' list: zfs at lists.illumos.org.

Dan
