[OmniOS-discuss] ixgbe: breaking aggr on 10GbE X540-T2
Stephan Budach
stephan.budach at JVM.DE
Wed May 11 12:50:58 UTC 2016
Am 11.05.16 um 13:36 schrieb Stephan Budach:
> Am 09.05.16 um 20:43 schrieb Dale Ghent:
>>> On May 9, 2016, at 2:04 PM, Stephan Budach <stephan.budach at JVM.DE>
>>> wrote:
>>>
>>> Am 09.05.16 um 16:33 schrieb Dale Ghent:
>>>>> On May 9, 2016, at 8:24 AM, Stephan Budach <stephan.budach at JVM.DE>
>>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have a strange behaviour where OmniOS omnios-r151018-ae3141d
>>>>> will break the LACP aggr link on different boxes when Intel
>>>>> X540-T2s are involved. It first starts with a couple of link
>>>>> downs/ups on one port, and finally the link on that port
>>>>> negotiates to 1GbE instead of 10GbE, which then breaks the LACP
>>>>> channel on my Cisco Nexus for this connection.
>>>>>
>>>>> I have tried swapping and interchanging cables, and thus
>>>>> switchports, but to no avail.
>>>>>
>>>>> Anyone else noticed this and even better… knows a solution to this?
>>>> Was this an issue noticed only with r151018 and not with previous
>>>> versions, or have you only tried this with 018?
>>>>
>>>> By your description, I presume that the two ixgbe physical links
>>>> will stay at 10Gb and not bounce down to 1Gb if not LACP'd together?
>>>>
>>>> /dale
>>> I have noticed that on prior versions of OmniOS as well, but we only
>>> recently started deploying 10GbE LACP bonds, when we introduced our
>>> Nexus gear to our network. I will have to check whether both links
>>> stay at 10GbE when not configured as a LACP bond. Let me check that
>>> tomorrow and report back. As we're heading for a stretched DC, we
>>> are mainly configuring 2-way LACP bonds across our Nexus gear, so we
>>> don't actually have any single 10GbE connections, as they will all
>>> have to be connected to both DCs. This is achieved by using vPCs on
>>> our Nexus switches.
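For context, a 2-way bond like this is typically created on the OmniOS
side along the following lines; the link and aggr names here are
assumptions for illustration, not taken from the actual config:

    # create a two-port aggregation running LACP in active mode
    dladm create-aggr -L active -l ixgbe0 -l ixgbe1 aggr1
    # verify the LACP state of the member ports
    dladm show-aggr -L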
>> Provide as much detail as you can - whether you're using hw flow
>> control, whether both links act this way at the same time or
>> independently, and so on. Problems like this often boil down to a
>> very small and seemingly insignificant detail.
>>
>> I currently have ixgbe on the operating table for adding X550
>> support, so I can take a look at this; however, I don't have your
>> type of switches available to me, so LACP-specific testing is
>> something I can't do for you.
>>
>> /dale
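A quick way to see what each port has actually negotiated for flow
control is the flowctrl link property - assuming the ixgbe driver on
this release exposes it:

    # show configured and effective flow control for one port
    dladm show-linkprop -p flowctrl ixgbe0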
> I checked the ixgbe.conf files on each host and they are all still at
> the default settings, which include flow_control = 3; so they all
> have flow control enabled. As for the Nexus config, all of those
> ports are still standard ethernet ports, and modifications have only
> been made globally to the switch.
> I will now have to yank the one port on one of the hosts from the
> aggr and configure it as a standalone port. Then we will see whether
> it still gets the disconnects/reconnects and eventually the
> renegotiation to 1GbE instead of 10GbE. Since this only ever seems to
> hit the same port, I have never seen other ports of the affected
> aggrs acting up. I also seem to have noticed that it was always the
> "same" physical port, that is, the first port on the card (ixgbe0),
> but that might of course be a coincidence.
>
> Thanks,
> Stephan
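For reference, the stanza in question in /kernel/drv/ixgbe.conf should
look roughly like this (3 enables pause frames in both directions):

    #
    # flow_control
    # 0 - disable, 1 - receive only, 2 - transmit only, 3 - rx and tx
    #
    flow_control = 3;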
Ok, so we can likely rule out LACP as a generic cause of this issue…
After removing ixgbe0 from aggr1, I plugged it into an unused port on
my Nexus FEX and, lo and behold, here we go:
root@tr1206902:/root# tail -f /var/adm/messages
May 11 14:37:17 tr1206902 mac: [ID 435574 kern.info] NOTICE: ixgbe0 link
up, 1000 Mbps, full duplex
May 11 14:38:35 tr1206902 mac: [ID 486395 kern.info] NOTICE: ixgbe0 link
down
May 11 14:38:48 tr1206902 mac: [ID 435574 kern.info] NOTICE: ixgbe0 link
up, 10000 Mbps, full duplex
May 11 15:24:55 tr1206902 mac: [ID 486395 kern.info] NOTICE: ixgbe0 link
down
May 11 15:25:10 tr1206902 mac: [ID 435574 kern.info] NOTICE: ixgbe0 link
up, 10000 Mbps, full duplex
So, after less than an hour, we had the first link-cycle on ixgbe0,
this time on a port with no LACP config whatsoever. I will monitor
this for a while and see if we get more of those.
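For the record, pulling the port out of the bond and watching it comes
down to something like the following; the -t (temporary) flag and the
poll interval are my choices, not requirements:

    # temporarily drop ixgbe0 from the aggregation
    dladm remove-aggr -t -l ixgbe0 aggr1
    # poll negotiated state/speed/duplex while waiting for a flap
    while :; do date; dladm show-phys ixgbe0; sleep 10; done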
Thanks,
Stephan