[OmniOS-discuss] [discuss] Re: [networking] rge_intr troubles

Saso Kiselkov skiselkov.ml at gmail.com
Sun Oct 2 12:23:05 UTC 2016


Well, what I know so far is that after a fixed number of packets
(probably after filling up the RX ring), we get a storm of
RX_FIFO_OVERFLOW_INT | NO_RXDESC_INT interrupts, probably because the
adapter is telling us that it filled up the RX ring and we didn't let it
know that we dequeued the old packets. Unfortunately, for the life of
me, I can't figure out how we're supposed to let it know that. I've been
staring at the drivers (both ours and FreeBSD's) for hours and to me it's
all just a jumble of "DMA sync this, write reg that".
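
The generic model that many descriptor-ring NICs follow can be sketched as a plain C simulation. This is an illustration under stated assumptions, not code from rge or from the FreeBSD driver, and it has not been checked against the 8168G: the `DESC_OWN` flag, its value, the ring size, and every function name below are hypothetical. The idea is that each descriptor carries an ownership flag; the NIC clears it when it fills a slot, and the host "lets it know" by setting the flag again once the packet has been dequeued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 4            /* hypothetical ring size */
#define DESC_OWN   0x80000000u  /* hypothetical "owned by NIC" flag */

struct rx_desc {
    uint32_t flags;  /* DESC_OWN set: NIC may fill it; clear: host may read it */
    uint32_t len;    /* bytes the (simulated) NIC wrote */
};

struct rx_ring {
    struct rx_desc desc[RING_SLOTS];
    size_t next;     /* next slot the host expects the NIC to have filled */
};

/* Hand every slot to the NIC, as a driver would when it sets up the ring. */
static void ring_init(struct rx_ring *r)
{
    assert(r != NULL);
    for (size_t i = 0; i < RING_SLOTS; i++) {
        r->desc[i].flags = DESC_OWN;
        r->desc[i].len = 0;
    }
    r->next = 0;
}

/*
 * Simulated NIC side: fill the slot at *hw_next if the NIC owns it.
 * Returns 1 on success, 0 when no descriptor is available (the
 * condition a NO_RXDESC-style interrupt would report).
 */
static int nic_receive(struct rx_ring *r, size_t *hw_next, uint32_t len)
{
    struct rx_desc *d = &r->desc[*hw_next];

    if (!(d->flags & DESC_OWN))
        return 0;
    d->len = len;
    d->flags &= ~DESC_OWN;               /* slot now belongs to the host */
    *hw_next = (*hw_next + 1) % RING_SLOTS;
    return 1;
}

/*
 * Host side: dequeue every filled slot and give it back to the NIC.
 * A real driver would DMA-sync the descriptor and pass the packet up
 * the stack before re-arming it. Returns the number of slots reaped.
 */
static size_t ring_reap(struct rx_ring *r)
{
    size_t n = 0;

    while (n < RING_SLOTS && !(r->desc[r->next].flags & DESC_OWN)) {
        struct rx_desc *d = &r->desc[r->next];

        d->len = 0;
        d->flags |= DESC_OWN;            /* the handoff back to the NIC */
        r->next = (r->next + 1) % RING_SLOTS;
        n++;
    }
    return n;
}
```

In this model, filling all `RING_SLOTS` slots without calling `ring_reap()` makes the next `nic_receive()` fail, which would match the "storm after a fixed number of packets" symptom if the real hardware behaves the same way.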

-- 
Saso

On 10/2/16 3:40 AM, Garrett D'Amore wrote:
> Probably we should do something, like reap the descriptors. I am AFK, but the usual strategy is to treat these kinds of interrupts just like normal RX. After that you should ack the interrupt, of course.
> 
>> On Oct 1, 2016, at 6:31 PM, Saso Kiselkov <skiselkov.ml at gmail.com> wrote:
>>
>>> On 10/2/16 12:23 AM, Robert Mustacchi wrote:
>>>> On 10/1/16 15:15 , Saso Kiselkov wrote:
>>>>> On 10/1/16 11:45 PM, Dale Ghent wrote:
>>>>>
>>>>>> On Oct 1, 2016, at 3:36 PM, Saso Kiselkov <skiselkov.ml at gmail.com> wrote:
>>>>>>
>>>>>> So I'm playing around with a box that has an on-board Realtek NIC and
>>>>>> periodically, about once every 2-5 minutes, the network just goes out to
>>>>>> lunch and stops responding to ping or attempts to send anything from
>>>>>> the box. I noticed that while doing so, the box is getting floored by
>>>>>> interrupts from the NIC, so I see tons of rge_intr activity and one CPU
>>>>>> core receiving about 160000 interrupts per second (other cores are idle).
>>>>>
>>>>> One core getting all the interrupts is expected, as neither these chips nor the driver supports RSS.
>>>>>
>>>>> The key thing here is to see what rge_intr() is actually doing. It has two outcomes: either it identifies the interrupt type, processes it, and reports the interrupt to the DDI as claimed, or, if it doesn't identify the interrupt, it returns and reports it to the DDI as unclaimed.
>>>>>
>>>>> Knowing this info would be a good first step in figuring out what's going on.
>>>>
>>>> Gah, I'm an idiot, it's actually a bitmask of two things:
>>>>
>>>> RX_FIFO_OVERFLOW_INT | NO_RXDESC_INT
>>>>
>>>> Apparently, we don't give it enough rx descriptors. Trying to now figure
>>>> out where to change that...
>>>
>>> There'll always be cases where we don't have enough rx descriptors for
>>> devices. Presumably we shouldn't actually care about receiving that
>>> interrupt. Do you happen to have a specification for the device handy?
>>>
>>> Given that we're not doing anything with the NO_RXDESC_INT, we probably
>>> should just mask it on the device if possible.
>>
>> Just as a general FYI, I'm dealing with the 8168G version of the MAC.
>> FreeBSD does have a driver that supports it, but since the driver there
>> appears home-grown (similar to ours), trying to transplant it would be a
>> major undertaking. I'll try to identify the major differences between
>> the versions we support and the 8168G, but of course, this being
>> hardware, they are many and few of them make any logical sense.
>>
>> --
>> Saso
>>

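The handler contract described in the quoted thread (claim the interrupt when a known status bit is set, otherwise report it unclaimed), combined with the suggestion to treat the overflow bits like normal RX and then ack them, can be sketched as a small C simulation. The bit names `RX_FIFO_OVERFLOW_INT` and `NO_RXDESC_INT` come from the thread; their numeric values here are placeholders, and `RX_OK_INT`, `struct fake_nic`, `fake_rx_reap()`, `fake_intr()`, and the write-one-to-clear ack model are all assumptions made for illustration, not the rge driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Status bits: names as used in the thread, values are placeholders. */
#define RX_OK_INT             0x0001u  /* hypothetical "packet received" bit */
#define NO_RXDESC_INT         0x0010u
#define RX_FIFO_OVERFLOW_INT  0x0040u

/* Stand-ins for the claimed/unclaimed results a handler reports. */
enum intr_result { INTR_UNCLAIMED = 0, INTR_CLAIMED = 1 };

struct fake_nic {
    uint16_t isr;        /* simulated interrupt status register */
    unsigned rx_passes;  /* how many times the RX path has run */
};

/* Placeholder for the real work: reap descriptors and re-arm them. */
static void fake_rx_reap(struct fake_nic *nic)
{
    nic->rx_passes++;
}

/*
 * Simulated handler: overflow and out-of-descriptor conditions take
 * the same path as an ordinary receive, then the handled bits are
 * acknowledged so the device stops re-raising them.
 */
static enum intr_result fake_intr(struct fake_nic *nic)
{
    const uint16_t rx_mask =
        RX_OK_INT | NO_RXDESC_INT | RX_FIFO_OVERFLOW_INT;
    uint16_t status = nic->isr;

    if ((status & rx_mask) == 0)
        return INTR_UNCLAIMED;           /* nothing we recognize */

    fake_rx_reap(nic);                   /* same path as normal RX */

    /* Ack: model write-one-to-clear by clearing the bits we handled. */
    nic->isr &= (uint16_t)~(status & rx_mask);
    return INTR_CLAIMED;
}
```

If the status bits are never acked, or the RX path never frees descriptors, the device would be expected to keep asserting the same condition, which is one plausible reading of the interrupt storm described above.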