[OmniOS-discuss] ZFS/NFS Lockup when deleting or copying large files

Doug Hughes doug at will.to
Wed Mar 29 01:03:51 UTC 2017


There are quite a few things to look at.

One question: do you have deduplication turned on?

I have seen things like this (with or without dedup) as a result of
memory exhaustion. I happen to know that the x4540 can only hold about
32GB of RAM, so this seems like a possibility.

You might want to start collecting vmstat output (e.g. vmstat 5) and, if
you can reproduce this easily, watch it while the problem develops. If
this is the same thing that I've experienced, you'll go into desperation
swapping (de column > 0 = very bad) and then the machine will just fall
over. This seems to happen in particular with extremely heavy NFS
traffic, which uses up a lot of memory buffers that can't be freed fast
enough to prevent memory exhaustion.
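
As a rough sketch of how you could watch for that without staring at the
terminal: the Python filter below reads vmstat output and reports every
sample whose de column is above zero. It is not part of any toolkit; the
function names are made up for illustration, and it assumes the usual
vmstat layout in which a header line names the de column and each sample
line has its fields in the same order.

```python
import sys


def deficit_samples(lines):
    """Yield (line_number, de_value) for each vmstat sample with de > 0.

    Assumes a header line containing the column name "de"; vmstat
    reprints that header periodically, so it is re-read whenever seen.
    """
    de_idx = None
    for n, line in enumerate(lines, 1):
        fields = line.split()
        if "de" in fields:
            # Column-name header: remember where "de" sits.
            de_idx = fields.index("de")
            continue
        if de_idx is None or len(fields) <= de_idx:
            # Group header ("kthr memory page ...") or a short line.
            continue
        try:
            de = int(fields[de_idx])
        except ValueError:
            continue
        if de > 0:
            yield n, de


def main():
    # Report offending samples as they arrive on standard input.
    for n, de in deficit_samples(sys.stdin):
        print(f"line {n}: de={de} (memory deficit)", flush=True)
```

To use it, add a call to main() at the bottom, save it under any name
(say watch_de.py, a hypothetical filename), and run
vmstat 5 | python3 watch_de.py while reproducing the copy or delete.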

There are a lot of other contributed scripts in the DTraceToolkit that
may be useful too. It's venerable, but still useful.



On 3/28/2017 5:40 PM, John Barfield wrote:
>
> Greetings!
>
> I have an issue with ZFS/NFS on OmniOS where copying or deleting large
> files causes the SAN to lock up, and we sometimes have to reboot it
> to recover. One customer was trying to copy a 400GB file and
> another tried to delete a 4TB vmdk from a VMware NFS vstore. In both
> situations the issue is reproducible, and in both situations I cannot
> seem to find anything crazy happening on either of the boxes. That
> doesn’t mean much, though, because my kstat/dtrace knowledge for
> troubleshooting this type of behavior on illumos is limited.
>
>  
>
> If there is an illumos bug I’m missing or something that jumps out at
> you, please let me know. If you need more detailed information, I can
> detail both customer locations’ hardware, software, and operating
> conditions upon request.
>
>  
>
> The VMware customer is running on r151016 and the other customer is
> running on the latest release.
>
>  
>
> _NFSv3-Only Clients_
>
> OS: ESXi 5.5
>
> _All 64-bit Linux Clients_
>
> CentOS 5, 6, and 7
>
> Some older Debian 7 boxes
>
>  
>
> I should also note that the VMware customer experiences this same
> issue on two different SANs with very different hardware
> configurations.
>
> The one common thread is that they are all Sun servers: X4170 M2,
> X4270 M2, and X4540.
>
>  
>
> This thread is being posted to both the illumos email list and the
> omnios email list as well.
>
>  
>
> I’m looking, I guess, for a methodology to follow to trace this down
> and determine if it’s a bug or a misconfiguration either on the client
> or in ZFS or in the NFS server.
>
>  
>
> I can get someone access if so desired.
>
>  
>
> Have a great day!
>
>  
>
> *John Barfield*