[OmniOS-discuss] Clues for tracking down why kernel memory isn't being released?
Richard Elling
richard.elling at richardelling.com
Thu Jul 16 18:34:37 UTC 2015
> On Jul 16, 2015, at 9:48 AM, Chris Siebenmann <cks at cs.toronto.edu> wrote:
>
> I wrote:
>> We have one ZFS-based NFS fileserver that persistently runs at a very
>> high level of non-ARC kernel memory usage that never seems to shrink.
>> On a 128 GB machine, mdb's ::memstat reports 95% memory usage by just
>> 'Kernel' while the ZFS ARC is only at about 21 GB (as reported by
>> 'kstat -m') although c_max should allow it to grow much bigger.
>>
>> According to ::kmastat, a *huge* amount of this memory appears to be
>> vanishing into allocated but not used kmem_alloc_131072 slab buffers:
>>
>>> ::kmastat
>> cache                            buf       buf       buf memory      alloc alloc
>> name                            size    in use     total in use    succeed  fail
>> ------------------------------ ----- --------- --------- ------ ---------- -----
>> [...]
>> kmem_alloc_131072               128K         6    613033  74.8G  196862991     0
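
As a quick sanity check on the row above, the arithmetic can be sketched in Python. The three constants are copied from the ::kmastat line; the function name is made up for illustration, and slab overhead is ignored, so this is only an approximation of what the "memory in use" column reports.

```python
# Figures copied from the kmem_alloc_131072 row of the ::kmastat output.
BUF_SIZE = 128 * 1024   # "buf size" column: 128K
BUFS_IN_USE = 6         # "buf in use" column
BUFS_TOTAL = 613033     # "buf total" column

GIB = 1024 ** 3


def cache_footprint(buf_size, in_use, total):
    """Return (bytes in live buffers, bytes in allocated-but-unused buffers).

    Hypothetical helper: it ignores slab overhead, so the result is only
    an approximation of the cache's real footprint.
    """
    live = buf_size * in_use
    idle = buf_size * (total - in_use)
    return live, idle


live, idle = cache_footprint(BUF_SIZE, BUFS_IN_USE, BUFS_TOTAL)
print(f"live buffers: {live / 1024:.0f} KiB")
print(f"idle buffers: {idle / GIB:.1f} GiB")
```

In other words, well under a megabyte of this cache is actually in use, and essentially all of the 74.8G is sitting in freed buffers waiting for a reap.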
>
> It turns out that the explanation for this is relatively simple, as
> is the workaround. Put simply: the OmniOS kernel does not actually
> free up these deallocated cache objects until the system is put under
> relatively strong memory pressure. Crucially, *the ZFS ARC does not
> create this memory pressure*; I think that you pretty much need a user
> level program allocating enough memory in order to trigger it, and I
> think the memory growth needs to happen relatively rapidly so that
> the kernel doesn't reclaim enough memory through lesser means (such as
> shrinking the ZFS ARC).
I don't think we will get much traction for ZFS pushing applications out of RAM.
There is a nuance here that can be difficult to resolve.
>
> (Specifically, you need to force kmem_reap() to be called. The primary
> path for this is if 'freemem' drops under 'lotsfree', which is only a few
> hundred MB on many systems. See usr/src/uts/common/os/vm_pageout.c in
> the OmniOS source repo.)
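
The freemem-versus-lotsfree condition described above can be watched from user level. The following is a rough Python sketch, assuming that both values are exposed as the freemem and lotsfree statistics of the unix:0:system_pages kstat and that 'kstat -p' prints one "module:instance:name:statistic value" pair per line. The function names are made up for illustration, and the comparison only mirrors the primary path mentioned above, not everything vm_pageout.c does.

```python
import subprocess


def parse_kstat_p(text):
    """Parse `kstat -p` output into a {"module:inst:name:stat": "value"} dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)   # key, then the rest of the line as value
        if len(parts) == 2:
            stats[parts[0]] = parts[1].strip()
    return stats


def reap_likely(stats, prefix="unix:0:system_pages:"):
    """True when freemem has dropped below lotsfree (both counted in pages).

    Assumption: the kstat names under `prefix` exist on the target system.
    """
    freemem = int(stats[prefix + "freemem"])
    lotsfree = int(stats[prefix + "lotsfree"])
    return freemem < lotsfree


def read_system_pages():
    """Run kstat on the live system (illumos only) and return the parsed dict."""
    out = subprocess.run(
        ["kstat", "-p", "unix:0:system_pages"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_kstat_p(out)
```

On a fileserver, something like reap_likely(read_system_pages()) could be polled to see whether the machine ever gets close to the threshold.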
>
> Since our fileservers are purely NFS fileservers and have a basically
> static level of user memory usage, they rarely or never rapidly use up
> enough memory to trigger this 'allocated but unused' reclaim[*].
>
> The good news is that it's easy enough these days to eat memory at the
> user level (you can do it with modern 64-bit scripting languages like
> Python, even at an interactive prompt). The bad news is that when we did
> this on the server in question we provoked a significant system stall at
> both the NFS server level and even the level of ssh logins and shells;
> this is clearly not something that we'd want to automate.
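
For reference, eating memory from Python can be sketched roughly as below. This is an illustration only, not the exact commands that were run on the server. The function name, chunk size, and assumed 4096-byte page size are all made up here, and as the message above warns, doing this for real on a production fileserver caused a significant stall.

```python
import time

PAGE = 4096  # assumed page size; only used to touch each page once


def eat_memory(total_bytes, chunk_bytes=64 * 1024 * 1024, pause=0.0):
    """Allocate about total_bytes in chunks and return the list holding them.

    Hypothetical helper. The memory stays allocated for as long as the
    returned list is referenced; drop the reference to release it.
    """
    chunks = []
    held = 0
    while held < total_bytes:
        size = min(chunk_bytes, total_bytes - held)
        buf = bytearray(size)
        # Write one byte per page so the pages are really backed by RAM.
        for off in range(0, size, PAGE):
            buf[off] = 1
        chunks.append(buf)
        held += size
        if pause:
            time.sleep(pause)  # slow the growth down if desired
    return chunks
```

A call such as hog = eat_memory(8 * 1024**3) would hold 8 GiB until 'del hog'. The pause parameter trades off the stall risk against the need for rapid growth that the message describes.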
>
> It's my personal opinion that there should be something in the kernel
> that automatically reaps drastically outsized kmem caches after a
> while. It's absurd that we've run for weeks with more than 70 GB of RAM
> sitting unused and an undersized ZFS ARC because of this.
kmem reaps can be very painful.
>
> - cks
> [*: interested parties can see how often cache reaping has been triggered
> with the following 'mdb -k' command:
> ::walk kmem_cache | ::printf "%4d %s\n" kmem_cache_t cache_reap cache_name
ugh. How about:
kstat -p :::reap
-- richard
>
> Even on this heavily used fileserver, up for 45 days, the reap count
> was *8*. Many of our other fileservers, with less usage, have reap
> counts of 0.
> ]
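
The output of 'kstat -p :::reap' is easy to post-process if you want per-cache reap counts in a script. Here is a minimal Python sketch, assuming the usual "module:instance:name:statistic value" line format. The function name is hypothetical, and the sample lines in the usage note are made up rather than taken from a real machine.

```python
def reap_counts(kstat_text):
    """Map cache name -> reap count from `kstat -p :::reap` style output."""
    counts = {}
    for line in kstat_text.splitlines():
        parts = line.split(None, 1)        # key, value
        if len(parts) != 2:
            continue
        fields = parts[0].split(":")       # module, instance, name, statistic
        if len(fields) != 4 or fields[3] != "reap":
            continue
        counts[fields[2]] = int(parts[1])
    return counts


# Made-up sample input, for illustration only.
sample = "unix:0:kmem_alloc_131072:reap\t8\nunix:0:kmem_alloc_8:reap\t8\n"
print(max(reap_counts(sample).values()))
```

Taking the maximum across caches gives one number per host that can be compared between fileservers.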
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss