[OmniOS-discuss] ZFS/NFS Lockup when deleting or copying large files

John Barfield john.barfield at bissinc.com
Wed Mar 29 13:46:57 UTC 2017


zilstat showed all zeros.

In fact I emailed Richard Elling about it to be sure that I was using it properly.

iostat, prstat, & arcstat stayed very busy the whole time though.

I'll pay more attention next time.

John Barfield

On Mar 29, 2017, at 1:22 AM, Knut Erik Sørvik <kes at teknograd.no> wrote:

Hi

I've had similar things happen, and in our case I think this might be related:

https://github.com/openzfs/openzfs/pull/214

A simple zpool iostat on the pool, with a frequent update interval, showed that the write columns were all zeroes for an extended time, causing the clients to believe the storage was gone.
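
As a rough illustration of that check, here is a minimal Python sketch that scans captured `zpool iostat <pool> 1` output for the longest run of samples with zero write operations. It assumes the default seven-column layout (pool name, alloc, free, read ops, write ops, read bandwidth, write bandwidth). The function name `longest_write_stall` and the pool name `tank` are made up for the example and do not come from this thread.

```python
def longest_write_stall(lines, pool):
    """Return the longest run of consecutive samples with zero write ops.

    Assumes each pool line has seven whitespace-separated fields:
    name, alloc, free, read ops, write ops, read bw, write bw.
    """
    longest = 0
    current = 0
    for line in lines:
        fields = line.split()
        # Skip headers, separators, and lines for other pools.
        if len(fields) != 7 or fields[0] != pool:
            continue
        if fields[4] == "0":  # write operations column
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


# Hypothetical capture, one sample per second (values are illustrative only).
sample = """\
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank        1.20T  2.30T     12    340  1.5M   40.1M
tank        1.20T  2.30T     15      0  1.9M       0
tank        1.20T  2.30T     11      0  1.4M       0
tank        1.20T  2.30T     14      0  1.7M       0
tank        1.20T  2.30T     13    512  1.6M   61.0M
""".splitlines()

stall = longest_write_stall(sample, "tank")
print(f"longest write stall: {stall} samples")
```

With a one-second interval, the returned count is roughly the stall length in seconds. Note that the first report from `zpool iostat` is an average since boot, so it is usually worth ignoring when looking for stalls.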

kes


On 28 Mar 2017, at 23:54, John Barfield <john.barfield at bissinc.com> wrote:

Greetings!


I have an issue with ZFS/NFS on OmniOS where copying or deleting large files causes the SAN to lock up, and we sometimes have to reboot it to recover. One customer was trying to copy a 400GB file and another tried to delete a 4TB vmdk from a VMware NFS vstore. In both situations the issue is reproducible, and in both I cannot find anything unusual happening on either of the boxes. That doesn’t mean much, though, because my kstat/dtrace knowledge for troubleshooting this type of behavior on illumos is limited.

If there is an illumos bug I’m missing or something that jumps out at ya, please let me know. Or, if you need more detailed information, I can detail both customer locations’ hardware, software, and operating conditions upon request.

The VMware customer is running on r151016 and the other customer is running on the latest release.

NFSv3 Only Clients

OS:
ESXi 5.5

All 64-bit Linux Clients
CentOS 5, 6, & 7
Some Older Debian 7 Boxes

I should also note that the VMware customer experiences this same issue on two different SANs with very different hardware configurations.

The one common thread is that they are all Sun servers: X4170 M2, X4270 M2, and X4540.

This thread is being posted to both the illumos email list and the omnios email list as well.

I’m looking, I guess, for a methodology to follow to trace this down and determine if it’s a bug or a misconfiguration either on the client or in ZFS or in the NFS server.

I can get someone access if so desired.

Have a great day!

John Barfield
Engineering and Stuff

M: +1 (214) 425-0783  O: +1 (214) 506-8354
john.barfield at bissinc.com

4925 Greenville Ave, Ste 900
Dallas, TX 75206

For Support Requests:
http://support.bissinc.com or support at bissinc.com

_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss at lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss