[OmniOS-discuss] OmniOS based redundant NFS

Mon Oct 2 17:39:07 UTC 2017

Thank you, Stephan,
I will certainly use custom GUIDs! And thanks for pointing out the failmode setting.
I have tried exporting physical disks with sbdadm, and it worked:
===
 sbdadm list-lu

Found 2 LU(s)

             GUID                    DATA SIZE           SOURCE
--------------------------------  -------------------  ----------------
600144f0dc4db979000059cd2e460001  146815668224         /dev/rdsk/c6t500000E117B07742d0
600144f0dc4db979000059cd2e4c0002  146815668224         /dev/rdsk/c7t500000E117A9AD12d0
===
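For the custom GUIDs, here is a minimal sketch of how a node name and slot
number might be encoded into the GUID, along the lines Stephan describes in the
quoted message below. It assumes that his "last 20 bytes" means the last 20 hex
digits (a 32-digit GUID holds only 16 bytes). The 12-digit GUID_PREFIX, the
node name stor1 and the "8 characters of name plus 2-digit slot" layout are
made up for illustration; only the leading 600144f is taken from the listing
above. The stmfadm command is printed, not run.

```shell
#!/bin/sh
# Sketch: build a 32-hex-digit LU GUID that embeds a node name and a slot number.
# GUID_PREFIX is a hypothetical 12-digit prefix (assumption, not from the thread).
GUID_PREFIX=${GUID_PREFIX:-600144f00000}

make_guid() {
    node=$1
    slot=$2   # plain decimal without leading zero, 0..99
    # 8 characters of node name (space-padded or truncated) + 2-digit slot = 10 ASCII bytes
    label=$(printf '%-8.8s%02d' "$node" "$slot")
    # hex-encode the label: 10 bytes become 20 hex digits
    hex=$(printf '%s' "$label" | od -An -tx1 | tr -d ' \n')
    guid="${GUID_PREFIX}${hex}"
    # refuse anything that is not exactly 32 digits (for example slot > 99)
    [ "${#guid}" -eq 32 ] || { echo "bad GUID length: $guid" >&2; return 1; }
    printf '%s\n' "$guid"
}

# Dry run: print the create-lu command with the custom guid property
guid=$(make_guid stor1 3) &&
    echo "stmfadm create-lu -p guid=$guid /dev/rdsk/c6t500000E117B07742d0"
```

With such a scheme, the label could be read back from the GUID shown on the
NFS head by decoding the last 20 hex digits as ASCII.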
I am thinking that on the target server I can either export many disks
as different LUNs in one target, or give each disk a separate target.
I think many LUNs in the same target would be easier to manage.
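
A rough sketch of the "many LUNs in one target" variant, assuming the LUs
already exist (as in the listing above) and that the target itself was created
beforehand, for example with itadm create-target. Each LU gets a view with its
own LUN number. The print_views helper is hypothetical and only prints the
stmfadm commands instead of running them.

```shell
#!/bin/sh
# Sketch: assign consecutive LUN numbers to a list of existing LU GUIDs.
# Dry run: the stmfadm commands are printed, not executed.
print_views() {
    lun=0
    for guid in "$@"; do
        # -n sets the LUN number; with no -t/-h given, the view is not
        # restricted to a particular target group or host group
        echo "stmfadm add-view -n $lun $guid"
        lun=$((lun + 1))
    done
}

print_views 600144f0dc4db979000059cd2e460001 600144f0dc4db979000059cd2e4c0002
```

Restricting the views to a target group or host group would be the natural
next step if more than one initiator can reach the target.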
Also, I've tested a setup where both layers, iSCSI target and NFS head,
are on the same machines, and the zpool used for NFS is composed of
imported iSCSI targets, even those local to this NFS server.
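
For such a setup, the manual failover discussed in the quoted messages below
could look roughly like this. It is only a dry-run sketch that prints the
commands in order. The pool name tank, the address object net0/nfs and the
address 192.0.2.10/24 are placeholders, and using the SMF services
network/nfs/server and network/iscsi/initiator for "stop NFS" and "stop
iscsiadm" is my assumption. The export -f, failmode=panic and NFS restart
steps follow Stephan's suggestions.

```shell
#!/bin/sh
# Dry-run sketch of the manual failover; every command is printed, not run.
# POOL, ADDROBJ and ADDR are hypothetical placeholders.
POOL=${POOL:-tank}
ADDROBJ=${ADDROBJ:-net0/nfs}
ADDR=${ADDR:-192.0.2.10/24}

release_cmds() {    # on the node giving up the service, if it is still alive
    echo "svcadm disable -t svc:/network/nfs/server:default"
    echo "zpool export -f $POOL"
    echo "svcadm disable -t svc:/network/iscsi/initiator:default"
    echo "ipadm delete-addr $ADDROBJ"
}

takeover_cmds() {   # on the node taking over
    echo "svcadm enable svc:/network/iscsi/initiator:default"
    echo "zpool import $POOL"
    # failmode is a persistent pool property; setting it once is enough
    echo "zpool set failmode=panic $POOL"
    echo "svcadm enable svc:/network/nfs/server:default"
    # restart nfsd on every service start, as Stephan suggests
    echo "svcadm restart svc:/network/nfs/server:default"
    echo "ipadm create-addr -T static -a $ADDR $ADDROBJ"
}

release_cmds
takeover_cmds
```

The shared IP is brought up last so that clients only reach the new head once
the pool is imported and NFS is running.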

-- 
  Sergey.
Regards,
Sergey Ivanov

On Sat, Sep 30, 2017 at 2:45 AM, Stephan Budach <stephan.budach at jvm.de> wrote:
> Hi Sergey,
>
> ----- Original Message -----
>> From: "sergey ivanov" <sergey57 at gmail.com>
>> To: "Stephan Budach" <stephan.budach at jvm.de>
>> CC: "omnios-discuss" <omnios-discuss at lists.omniti.com>
>> Sent: Friday, 29 September 2017 22:30:31
>> Subject: Re: [OmniOS-discuss] OmniOS based redundant NFS
>>
>> Thanks, Stephan.
>> I did a simple test creating LUs over physical disks for use as
>> iSCSI targets, and it worked well. I am going to directly connect 2
>> servers and export their disks as separate iSCSI targets. Or maybe
>> different LUNs in a target. And then on the active server start
>> initiator to get these targets, and combine them into a pool of 2-way
>> mirrors so that it stays degraded but working if one of the servers
>> dies.
>
> Well, different targets mean that you will be able to service single disks on one node without having to degrade the whole zpool, but only the affected vdevs. On the other hand, there is more complexity, since you will of course have quite a large number of iSCSI targets to log in to. This may be OK if the number doesn't get too high, but with hundreds of disks, I chose to use fewer targets with more LUNs.
>
> One thing to keep in mind is that stmfadm allows you to create the GUID to your liking. That is, you can freely choose the last 20 bytes to be anything you want. I used that to ASCII-encode the node name and slot into the GUID, so that it is displayed on my NFS heads when running format. This helps a lot in mapping the LUNs to drives.
>
>> So, manual failover for this configuration will be the following. If
>> the server to be disabled is still active, stop NFS, export zpool on
>> it, stop iscsiadm, release shared IP. On the other server: import
>> zpool and start NFS, activate shared IP.
>
> I am using the sharenfs properties of ZFS, but you will likely have to run zpool export -f <zpool> if you want to fail over the service, since the zpool is still busy. Also, you'd better set the zpool failmode to panic instead of wait, so that an issue triggers a reboot rather than keeping your NFS head waiting.
>
>> I read once that there are some tricks which keep clients from noticing
>> that the NFS server has changed underneath all mounts, but I never tried them.
>
> The only issue I came across was when I deliberately failed over the NFS service back and forth within too short a period, which caused the NFSd on the former primary node to re-use the TCP packet numbers, insisting on reusing its old NFS connections to the clients. I solved that by resetting the NFSd each time a service starts on any NFS head. The currently connected NFS clients are not affected by that, and this solved this particular issue for me.
>
> Cheers,
> Stephan
>
>> --
>>   Regards,
>>   Sergey Ivanov.
>>
>> Regards,
>> Sergey Ivanov
>>
>>
>> On Thu, Sep 28, 2017 at 12:49 AM, Stephan Budach
>> <stephan.budach at jvm.de> wrote:
>> > Hi Sergey,
>> >
>> > ----- Original Message -----
>> >> From: "sergey ivanov" <sergey57 at gmail.com>
>> >> To: "Stephan Budach" <stephan.budach at jvm.de>
>> >> CC: "omnios-discuss" <omnios-discuss at lists.omniti.com>
>> >> Sent: Wednesday, 27 September 2017 23:15:49
>> >> Subject: Re: [OmniOS-discuss] OmniOS based redundant NFS
>> >>
>> >> Thanks, Stephan!
>> >>
>> >> Please, explain "The reason to use two x two separate servers is,
>> >> that
>> >> the mirrored zpool's vdevs look the same on each NFS head".
>> >>
>> >> I understand that, if I want to have the same zpool based on iscsi
>> >> devices, I should not mix local disks with iscsi target disks.
>> >>
>> >> But I think I can have 2 computers, each exporting a set of local
>> >> disks as iscsi targets. And to have iscsi initiators on the same
>> >> computers importing these targets to build zpools.
>> >>
>> >> Also, looking at sbdadm, I think I can 'create-lu
>> >> /dev/rdsk/c0t0d3s2'.
>> >>
>> >> Ok, I think I had better try it and report how it goes.
>> >
>> > Actually, things can become quite complex. I'd like to reduce the
>> > "mental" involvement to the absolute minimum, mainly because we
>> > often faced a situation where something that had been running for
>> > a long time without problems would suddenly break. This is when
>> > people start… well, maybe not panicking, but having to recap
>> > what the current setup was like and what they had to do to tackle
>> > this.
>> >
>> > So, uniformity is a great help on such systems - at least
>> > for us. Technically, there is no issue with mixing local and
>> > remote iSCSI targets on the same node, which serves as an iSCSI
>> > target and an NFS head.
>> >
>> > Also, if one of the nodes really goes down, you will be losing
>> > your failover NFS head as well, maybe not a big deal and depending
>> > on your requirements okay. I do have such a setup as well,
>> > although only for an archive ZPOOL, where I can tolerate this
>> > reduced redundancy for the benefit of a more lightweight setup.
>> >
>> > Cheers,
>> > Stephan
>>