[OmniOS-discuss] problems with 10g interfaces dropping off for a time and then coming back

Doug Hughes doug at will.to
Thu Mar 5 18:31:32 UTC 2015

I'm having an issue with r*12 with 10g Solarflare interfaces set up in an
aggregate simultaneously dropping for a while for no apparent reason and
then coming back. Oddly, I can see them leaving the port channel and
dropping on the switch side, but there are no log messages or anything on
the client side. They are 5162 cards, for what it's worth.

Has anybody else seen anything like this? Any idea why the host ports don't
seem to log any messages to that effect? I can see side effects of this on
the host. It only happens during moderate to heavy load. Interrupt
balancing looks ok (intrstat), and I watch vmstat, and then all of a sudden
the cs, interrupts and other markers drop precipitously (probably as a
result of a complete drop of network traffic), and it will stay that way
for a couple of minutes and then recover on its own. Sometimes it lasts up
to 30 minutes and then it just recovers, equally mysteriously. I can
sometimes fix it by toggling the interface on the switch.
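
Since the host logs nothing, one way to pin down when each drop starts would
be to record the interrupt or cs counts once per interval and flag samples
that fall far below the recent average. The following is only a rough sketch
of that idea, not something I have running: the function name find_drops, the
window of 5 samples and the 10% threshold are all arbitrary assumptions, and
the input is just a list of numbers collected however is convenient.

```python
from collections import deque


def find_drops(samples, window=5, ratio=0.1):
    """Return the indices of samples that fall below ratio * baseline.

    The baseline is the mean of the last `window` samples that were not
    themselves flagged, so an ongoing outage does not drag it down.
    `window` and `ratio` are illustrative defaults, not tuned values.
    """
    recent = deque(maxlen=window)
    drops = []
    for i, value in enumerate(samples):
        if len(recent) == window:
            baseline = sum(recent) / window
            if baseline > 0 and value < ratio * baseline:
                drops.append(i)
                continue  # do not fold outage samples into the baseline
        recent.append(value)
    return drops


if __name__ == "__main__":
    # Made-up interrupt counts per interval, purely for illustration.
    counts = [1000, 1100, 900, 1000, 1000, 50, 40, 1000, 30]
    print(find_drops(counts))
```

Pairing each flagged index with a wall-clock timestamp would make it possible
to line the drops up against the switch-side port-channel log.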

I have other hosts with the same hardware and driver but running Solaris 10
that don't exhibit this.
