[OmniOS-discuss] Illumos/omnios ALUA support was:: Active-Active vSphere

Aaron Curry asc1111 at gmail.com
Sat Nov 29 19:58:52 UTC 2014


Johan,

It looks to me like your understanding of ALUA is correct.

Bottom line: ALUA is a way to present a LUN to a host while advertising all
possible paths. These "possible paths" include the local target HBAs as
well as any other target HBAs in the ALUA group. This is necessary in a
high availability storage solution!

Without ALUA, the host sees only the paths to the LUN on the pool owner.
Simply moving the pool to a secondary storage server and recreating the
LUNs/views there (no ALUA involved) will cause the host to see it as a new
storage device. That setup can reduce downtime during an outage or planned
maintenance, but any IO in flight at the time of the move will fail. The
whole point of highly available storage is that there is no interruption in
IO when a pool moves between storage servers. Keep in mind that a ZFS pool
can only be imported on one server at a time; importing it on two servers
at the same time will corrupt the pool.
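
To illustrate that last point, here is roughly what an attempted import
looks like on a second node while the primary still owns the pool (the pool
name is a made-up example and the exact message wording varies by release):

    # On the secondary node, while the primary still has the pool imported:
    zpool import tank
    #   cannot import 'tank': pool may be in use from other system
    #   use '-f' to import anyway
    # Forcing the import with -f while the other node is still alive is
    # exactly how you corrupt the pool, which is why cluster software has to
    # fence the old owner before it ever forces an import.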

I will respectfully disagree with Saso here and say that getting ALUA
itself up and running is not all that complicated (I'll qualify this
statement further down). The problem, as you pointed out, is the lack of
information on how to do it. I won't go into all the details here, but the
basics of getting up and running are as follows (a rough command sketch of
steps 3 through 6 follows the list):

1) Configure the stmf_proxy service along the lines of what
high-availability.com has documented here
<http://www.high-availability.com/FAQ/index.php?action=artikel&cat=3&id=62&artlang=en>.
Obviously there are some changes you have to make from that document, but
this is the most reliable information I have found on the matter.
2) Start the stmf_proxy service.
3) Create a target group with the same name on all storage nodes in your
cluster.
4) Add all target HBAs from all nodes to the target group on all servers.
So, if you have two servers with two target HBAs, the target group on both
servers will have four target group members.
5) Create a LUN and view. The stmf_proxy service will automagically
replicate the LUN/view information to the other nodes in the ALUA group.
6) Rescan the storage on the host.
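
To make steps 3 through 6 concrete, here is a rough command sketch for a
two-node FC setup. The target group name, WWNs, zvol path, and GUID are
placeholders, the stmf_proxy specifics in steps 1 and 2 still come from the
high-availability.com document, and a locally configured target usually has
to be taken offline before it can be added to a target group:

    # 3) Create the same-named target group on every node:
    stmfadm create-tg alua-tg
    # 4) Add all four target ports (two per node) to the group on both
    #    nodes. List local target port WWNs with: stmfadm list-target -v
    #    (offline a local target first with: stmfadm offline-target <wwn>)
    stmfadm add-tg-member -g alua-tg wwn.2100001B32AAAAAA   # node A, port 1
    stmfadm add-tg-member -g alua-tg wwn.2101001B32AAAAAA   # node A, port 2
    stmfadm add-tg-member -g alua-tg wwn.2100001B32BBBBBB   # node B, port 1
    stmfadm add-tg-member -g alua-tg wwn.2101001B32BBBBBB   # node B, port 2
    # 5) On the pool owner only, create the LU and its view; stmf_proxy
    #    replicates the LU/view information to the other node:
    stmfadm create-lu /dev/zvol/rdsk/tank/lun0
    stmfadm add-view -t alua-tg 600144F0XXXXXXXXXXXXXXXX   # GUID from create-lu
    # 6) Rescan on the ESXi host:
    esxcli storage core adapter rescan --all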

When all that is done, the host will see the LUN with all paths configured
in the ALUA group. The paths to the pool owner will show as active while
the paths to the other storage servers will show as "standby." The
multipathing software will recognize the standby paths for what they are
and not attempt to perform any IO on those paths... ever. Configuring your
multipathing software to use the standby paths will result in IO timing out
on a standby path and then retrying on an active path. This will make the
IO appear to be really slow, as Saso mentioned. However, with the
multipathing software I have worked with (VMware, Windows... I can't say
I've tried it with Linux), the round-robin setting recognizes a standby
path and simply ignores it. So, to get back to the original poster's
question: on ESXi, Round Robin works just fine, even with an ALUA
configuration.
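
For reference, this is roughly how you can check the path states and set
Round Robin on ESXi from the command line (the naa identifier below is a
placeholder for your LUN's device ID):

    # List the paths for the device; with the ALUA SATP, the paths to the
    # non-owning node should report a standby group state:
    esxcli storage nmp path list -d naa.600144f0xxxxxxxxxxxxxxxx
    # Select the Round Robin path selection policy for that device:
    esxcli storage nmp device set -d naa.600144f0xxxxxxxxxxxxxxxx --psp VMW_PSP_RR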

With all that said, while I don't think getting ALUA up and running is that
complicated (assuming you can find enough information to guide you through
the process), the problem is: where do you go from there? Highly
available storage is only good if a secondary node can take over when the
primary node fails. How do you manage that? How do the nodes in a cluster
monitor each other? How does a secondary node import a pool and change the
configured LUNs from standby to active (changing from standby to active is
not actually handled by ALUA)? And then what do you do when the primary
comes back online? How do you keep it from trying to take over? How do you
avoid split-brain scenarios? This is where the cluster management software
comes into play, and where things get really complicated. There are lots of
moving parts here to manage and coordinate. You can try to write all the
scripts yourself (a rough sketch of what a manual takeover involves follows
the list below), and I have seen references to a set of open-source
utilities that people have used with varying degrees of success. My
personal opinion is that if you truly need an HA storage solution, you are
much better off spending the money on a supported solution for the whole
stack. My preference for the FC HA storage stack:

Hardware: Supermicro Storage Bridge Bay system (purchased through a system
integrator who provides warranty/support). It's a little more expensive, but
definitely worth it.
OS: OmniOS. Great product with a great community. Plus there's a real,
for-profit company backing it and providing professional support if needed.
Cluster: RSF-1 (high-availability.com). This is the only supported ZFS
cluster software that I am aware of (Nexenta actually uses it in their HA
cluster solution). It works great and doesn't cost much when you compare it
to buying a brand-name storage box.
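
To tie this back to the takeover questions above, the manual core of a pool
move looks roughly like the sketch below (made-up pool name). This is the
part that RSF-1 or your own scripts have to automate, together with fencing
and the standby-to-active promotion, which is not a stock stmfadm operation:

    # On the old owner, if it is still reachable:
    zpool export tank
    # On the new owner (add -f only if the old owner is confirmed down and
    # fenced; forcing an import against a live node corrupts the pool):
    zpool import tank
    # Verify the zvol-backed LUs are online on the new owner:
    stmfadm list-lu -v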

There are, of course, other ways to implement redundant storage, such as
mirroring storage at the host level, but there are limitations to where
this can be used and it adds overhead to the host.

Sorry for this being so long. If you're still reading at this point... wow,
you must be really interested in this topic. For those who have made it
this far, the bottom line to the ALUA question is this: ALUA is only needed
in an HA setup. And, if you need HA and want to use OmniOS, then you're
much better off buying a license for RSF-1.

Aaron

On Sat, Nov 29, 2014 at 2:10 AM, Johan Kragsterman <
johan.kragsterman at capvert.se> wrote:

> So, since this subject seems very much to be about understanding ALUA,
> this might be a good opportunity to discuss the Illumos/OmniOS ALUA
> implementation..?
>
> So I'll change the subject, if you folks don't mind...?
>
> If you do mind, I can start a new thread...?
>
> I continue further down...
>
>
>
> -----"OmniOS-discuss" <omnios-discuss-bounces at lists.omniti.com> skrev:
> -----
> Till: Rune Tipsmark <rt at steait.net>
> Från: Saso Kiselkov
> Sänt av: "OmniOS-discuss"
> Datum: 2014-11-28 15:54
> Kopia: "omnios-discuss at lists.omniti.com" <omnios-discuss at lists.omniti.com>
> Ärende: Re: [OmniOS-discuss] Active-Active vSphere
>
> On 11/28/14 3:34 PM, Rune Tipsmark wrote:
> > Okay, I noticed ALUA support is not enabled by default on OmniOS. How
> > do I enable that?
>
> The process is complicated, not very well documented and it isn't meant
> for your scenario anyway. If you only have one storage node, then you
> don't need ALUA, even if you have multiple connections to the node. So
> no need to worry about it.
>
>
>
>
> When you say it is poorly documented, I suppose you mean ALUA on Illumos
> distributions?
>
> If you mean on Illumos distributions, I haven't found any documentation,
> apart from your (Saso's) patch, Feature #3775. If you know of any such
> documentation, please give us a link to it.
>
> There is some documentation from Nexenta and high-availability.com, but I
> find it difficult to use with OmniOS, since it is only Nexenta-focused (of
> course).
>
>
> One thing I don't understand is whether the Nexenta implementation of ALUA
> allows access to the LUN(s) through the "non-preferred path", i.e. through
> the node that doesn't own them? In my understanding, that would need a
> complicated interconnect between the head nodes, as well as LU pass-through
> handling on the "non-preferred path" node, wouldn't it? And I haven't seen
> that anywhere in Illumos/Nexenta, but of course, I might be wrong here...?
>
> If I'm right here, the Nexenta implementation of ALUA does not provide a
> connection to the LUN(s) through the "non-preferred path"; it only provides
> access after failover, when the "non-preferred path" has become the
> "preferred path"?
>
> If I'm wrong here and have failed to understand something, please enlighten
> me...
>
>
>
> I'm also curious about the NMC, the Nexenta Management Console. Is that
> also in illumos-gate, or is there any equivalent software...?
>
> If not, we would need to add and check the target group members on
> both (all) nodes manually, right?
>
>
> How is this related to your implementation of stmf-ha, Saso?
>
>
> Regards Johan
>
>
>
>
>
>
>
> Cheers,
> --
> Saso
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>