[OmniOS-discuss] Testing RSF-1 with zpool/nfs HA

Stephan Budach stephan.budach at JVM.DE
Thu Feb 18 08:58:39 UTC 2016


On 18.02.16 at 08:59, Dale Ghent wrote:
> Are you using NFS over TCP or UDP?
>
> If using it over TCP, I would expect the TCP connection to get momentarily unhappy when its connection stalls and packets might need to be retransmitted after the floating IP's new MAC address is asserted. Have you tried UDP instead?
>
> /dale
>
>
>> On Feb 18, 2016, at 1:13 AM, Stephan Budach <stephan.budach at JVM.DE> wrote:
>>
>> Hi,
>>
>> I have been test driving RSF-1 for the last week to accomplish the following:
>>
>> - cluster a zpool made up of 8 mirrored vdevs, which are based on 8 x 2 SSD mirrors via iSCSI from another OmniOS box
>> - export an NFS share from the above zpool via a VIP
>> - have RSF-1 provide the failover and VIP moving
>> - use the NFS share as a repository for my Oracle VM guests and vdisks
>>
>> The setup seems to work fine, but there is one issue I can't seem to solve. Whenever I fail over the zpool, any in-flight NFS data is stalled for an unpredictable time. Sometimes it takes not much longer than the "move" time of the resources, but sometimes it takes up to 5 minutes until the NFS client on my VM server comes alive again.
>>
>> So, when I issue a simple ls -l on the folder of the vdisks while the switchover is happening, the command sometimes completes in 18 to 20 seconds, but sometimes ls will just sit there for minutes.
>>
>> I wonder if there's anything I could do about that. I have already played with several timeouts, both NFS-wise and TCP-wise, but nothing seems to have any effect on this issue. Does anyone know some tricks to speed up the in-flight data?
>>
>> Thanks,
>> Stephan
Yes, NFS is using TCP for its connection and, naturally, this connection 
will hang as long as the connection is broken. However, once ping starts 
working again, all access to the NFS share from that moment on works. 
But if I issue an ls on the NFS share while ping is not yet 
responding, the whole NFS connection hangs until it starts working 
again, and that can take a lot of time.

I will try UDP instead of TCP and see if I can get better results with 
that.
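
To compare the two transports, the stall could be measured rather than 
eyeballed with ls. The following is only a minimal sketch, not something 
I have run against the cluster: the function names are made up for 
illustration, and it simply times repeated directory listings so that a 
long gap during the switchover shows up in the output.

```python
import os
import time


def timed_listing(path):
    """Return (entry_count, seconds) for one directory listing of path."""
    start = time.monotonic()
    # On a hard-mounted NFS share this call blocks while the server is unreachable.
    entries = os.listdir(path)
    return len(entries), time.monotonic() - start


def probe(path, count=5, interval=1.0):
    """List path `count` times, pausing `interval` seconds between runs.

    Prints one timestamped line per listing and returns the elapsed times.
    """
    results = []
    for i in range(count):
        n, elapsed = timed_listing(path)
        results.append(elapsed)
        print(f"{time.strftime('%H:%M:%S')} entries={n} took={elapsed:.2f}s")
        if i < count - 1:
            time.sleep(interval)
    return results
```

Running something like probe("/mnt/nfs_repo", count=600) on the VM server 
during a failover (the mount point here is a hypothetical path) would 
give one line per listing, and the slowest entry is the stall as seen by 
the client.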

Thanks,
Stephan

