[OmniOS-discuss] NFS Datastore vmware esxi failover

Saso Kiselkov skiselkov.ml at gmail.com
Fri Nov 8 17:36:07 UTC 2013


On 11/8/13, 4:17 PM, Matt Weiss wrote:
> I am working on a failover script using OmniOS as an NFS server.
> 
> According to VMware, if I mount an NFS datastore via its IP address,
> I should be able to move that IP between hosts and keep the mount;
> however, it is not working correctly.
> 
> For example:
> 
> On an ESXi instance (5.1U1) I mount the following NFS datastore:
> 172.16.50.100
> /tank/vmrep
> which appears as UUID 6c0c1d0d-928ef591 in /vmfs/volumes
> 
> 
> omni-rep1: 172.16.50.1
> omni-rep2: 172.16.50.2
> 
> I am using zrep to failover my zfs dataset.
> http://www.bolthole.com/solaris/zrep/zrep.documentation.html
> 
> Essentially, it puts the primary into read-only, does a zfs
> send/receive, then sets the secondary to rw.
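> 
> For reference, that sequence is roughly equivalent to these manual ZFS
> steps (a sketch only; zrep's actual snapshot bookkeeping is more
> involved, and the snapshot names here are made up):
> 
> # on the primary: freeze writes and send the final increment
> zfs set readonly=on tank/vmrep
> zfs snapshot tank/vmrep@failover
> zfs send -i tank/vmrep@last-sync tank/vmrep@failover | \
>     ssh 172.16.50.2 zfs receive -F tank/vmrep
> # on the secondary: make the received copy writable
> ssh 172.16.50.2 zfs set readonly=off tank/vmrep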
> 
> 
> To expose my dataset (tank/vmrep) I am using the sharenfs property of
> ZFS. I have created a virtual IP for this purpose.
> 
> #setnfsip.sh
> ipadm create-addr -T static -a 172.16.50.100/24 vmxnet3s0/nfs
> 
> #removenfsip.sh
> ipadm delete-addr vmxnet3s0/nfs
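> 
> After either script runs, the address object can be sanity-checked
> directly with ipadm, e.g.:
> 
> # shows the VIP's address object while it is plumbed,
> # and reports an error once it has been removed
> ipadm show-addr vmxnet3s0/nfs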
> 
> 
> So, when I want to fail over, I just do the following:
> 
> #!/bin/sh
> #zfs unshare tank/vmrep
> #sleep 5
> /scripts/removenfsip.sh
> sync
> sleep 5
> #zrep sync tank/vmrep
> #sleep 5
> #the following does the zfs snapshot/send/receive
> zrep failover tank/vmrep
> sleep 5
> #ssh 172.16.50.2 /usr/sbin/zfs share tank/vmrep
> #sleep 5
> ssh 172.16.50.2 /scripts/setnfsip.sh
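> 
> A quick way to confirm the export actually moved is to list the share
> table on the new master, or query it as an NFS client would:
> 
> ssh 172.16.50.2 /usr/sbin/share
> # or, from any NFS client:
> showmount -e 172.16.50.100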
> 
> 
> So far, so good: omni-rep2 is now exporting tank/vmrep over NFS, it
> has the 172.16.50.100 IP address, and it is an exact replica of
> omni-rep1.
> 
> The problem is that in ESXi the datastore goes inaccessible. I can fail
> back and the datastore comes back online fine. I can also mount the NFS
> datastore as a new one using the .100 IP on omni-rep2, so it is being
> exported properly.
> 
> According to the last paragraph of this
> 
> https://communities.netapp.com/community/netapp-blogs/getvirtical/blog/2011/09/28/nfs-datastore-uuids-how-they-work-and-what-changed-in-vsphere-5
> 
> 
> It should work; I have merely changed which host is answering on my
> datastore's IP address.
> 
> I know a guy named Saso did some iSCSI failover recently and noted it
> worked with NFS too. I am just wondering what I am missing here.

I haven't done NFS datastore failover from ESXi myself, but off the top
of my head, I guess what's going haywire here is that you're setting the
dataset read-only before moving it over. Don't do that. Simply tear down
the IP address, migrate the dataset, set up a new NFS share on the
target machine, and then reinstate the IP address at the target. ESXi
aggressively monitors the health of its datastores, and if it gets into
a state it can't deal with (e.g. a write to a datastore that refuses to
process it), it will offline the whole datastore and await administrator
intervention.
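
In script form, I mean something like this (an untested sketch reusing
your script paths; the key point is that the VIP only comes up once the
new master is writable and shared):

#!/bin/sh
# 1. VIP down first: clients see a dead server, never a share
#    that answers but refuses writes
/scripts/removenfsip.sh
# 2. migrate the dataset to the target
zrep failover tank/vmrep
# 3. make the target writable and shared before exposing it
ssh 172.16.50.2 /usr/sbin/zfs set readonly=off tank/vmrep
ssh 172.16.50.2 /usr/sbin/zfs share tank/vmrep
# 4. reinstate the VIP on the target last
ssh 172.16.50.2 /scripts/setnfsip.sh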

Don't worry about the datastore being offline for a while; ESXi will
hold VM writes, and the VMs themselves usually won't complain for up to
1-2 minutes (the default guest disk timeouts on Windows and Linux).
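
If you need more headroom than that, the guest timeouts can be raised:
on Linux it's the per-disk SCSI command timeout in sysfs, on Windows
the Disk TimeOutValue registry key. A sketch for a Linux guest,
assuming the disk is sda:

# check the current per-disk SCSI command timeout
# (typically 30 seconds by default)
cat /sys/block/sda/device/timeout
# raise it so the guest tolerates a longer datastore outage
echo 180 > /sys/block/sda/device/timeout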

-- 
Saso

