[OmniOS-discuss] zfs pool 100% busy, disks less than 10%
Richard Elling
richard.elling at richardelling.com
Mon Nov 3 19:30:19 UTC 2014
On Nov 2, 2014, at 6:24 PM, Rune Tipsmark <rt at steait.net> wrote:
> Looking a bit more at these numbers, am I seeing twice the actual rate due to mirroring?
You need to know what you are measuring. iostat and zpool iostat measure the I/O from
the pool to the disk. This is often very different than an application to the file system (pool).
In practice, it is often so different that it is impractical to correlate the two.
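As a rough illustration (the pool name is taken from the output quoted below), these are the pool-to-disk views:

    zpool iostat -v pool04 1   # per-vdev and per-disk I/O as ZFS issues it
    iostat -xn 1               # per-device view from the OS, with service times and %b

Neither of these shows what an application asked the file system to do; they show what ZFS decided to send to the devices after compression, mirroring, aggregation, and metadata updates.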
> How does compression affect the numbers?
>
> Say I have 1 vdev mirrored and I see the pool writing 100 MB/sec; is that 50 MB/sec to each disk, but only 50 MB/sec total from the client side? And if I compress it at the same time at, say, a 1.50 ratio, will the pool show 100 MB/sec while the client actually writes 75 MB/sec?
Measure both and compare. Usually we measure application side bandwidth using a tool
such as fsstat.
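For example, something like this (the interval and the mount point are just illustrative) reports the application-side view at one-second intervals:

    fsstat zfs 1              # all ZFS file systems on the host
    fsstat /pool04/somefs 1   # a single mount point (path is hypothetical)

Comparing its read/write bandwidth against zpool iostat over the same interval shows how far apart the two layers are.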
-- richard
>
> Br,
> Rune
>
> -----Original Message-----
> From: Richard Elling [mailto:richard.elling at richardelling.com]
> Sent: Sunday, November 02, 2014 6:07 PM
> To: Rune Tipsmark
> Cc: Eric Sproul; omnios-discuss at lists.omniti.com
> Subject: Re: [OmniOS-discuss] zfs pool 100% busy, disks less than 10%
>
>
> On Oct 31, 2014, at 6:07 PM, Rune Tipsmark <rt at steait.net> wrote:
>
>> So I actually started storage vMotions on 3 hosts, 6 concurrent, and am
>> getting about 1 GB/sec. I guess I need more hosts to really push this; the disks are not more than 20-25% busy, so in theory I could push a bit more.
>>
>> I think this is resolved for now... CPU sitting at 30-40% usage while
>> moving 1 GB/sec.
>
> Yes, that seems about right.
> -- richard
>
>>
>> zpool iostat 1 (columns: alloc, free, read/write operations, read/write bandwidth)
>> pool04 396G 39.5T 9 15.9K 325K 1.01G
>> pool04 396G 39.5T 7 17.0K 270K 1.03G
>> pool04 396G 39.5T 12 17.4K 558K 1.10G
>> pool04 396G 39.5T 10 16.9K 442K 1.03G
>> pool04 397G 39.5T 6 16.9K 332K 1021M
>> pool04 397G 39.5T 1 16.3K 74.9K 1.01G
>> pool04 397G 39.5T 8 17.0K 433K 1.05G
>> pool04 397G 39.5T 20 17.1K 716K 1023M
>> pool04 397G 39.5T 11 18.3K 425K 1.14G
>> pool04 398G 39.5T 0 18.3K 65.9K 1.11G
>> pool04 398G 39.5T 16 17.9K 551K 1.06G
>> pool04 398G 39.5T 0 16.8K 105K 1.03G
>> pool04 398G 39.5T 1 18.2K 124K 1.11G
>> pool04 398G 39.5T 0 17.1K 45.9K 1.05G
>> pool04 399G 39.5T 6 17.3K 454K 1.08G
>> pool04 399G 39.5T 0 17.9K 0 1.06G
>> pool04 399G 39.5T 2 16.9K 116K 1.04G
>> pool04 399G 39.5T 2 18.8K 130K 1.09G
>> pool04 399G 39.5T 0 17.6K 0 1.03G
>> pool04 400G 39.5T 3 17.5K 155K 1.04G
>> pool04 400G 39.5T 0 17.6K 31.5K 1.03G
>>
>> -----Original Message-----
>> From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com]
>> On Behalf Of Rune Tipsmark
>> Sent: Friday, October 31, 2014 12:38 PM
>> To: Richard Elling; Eric Sproul
>> Cc: omnios-discuss at lists.omniti.com
>> Subject: Re: [OmniOS-discuss] zfs pool 100% busy, disks less than 10%
>>
>> Ok, makes sense.
>> What other kind of indicators can I look at?
>>
>> I get decent results from dd, but it still feels a bit slow...
>>
>> Compression (lz4) should not slow it down, right? The CPU is not doing much when copying data over, maybe 15% busy or so...
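A quick way to sanity-check the compression side of this is to ask ZFS what ratio it is actually achieving (pool name as in the earlier output; the child dataset is a placeholder):

    zfs get compression,compressratio pool04
    zfs get compressratio pool04/somefs    # per dataset, if finer detail is wanted

lz4 is generally cheap enough that it is rarely the bottleneck when the CPU is mostly idle.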
>>
>> Sync=always, block size 1M
>> 204800000000 bytes (205 GB) copied, 296.379 s, 691 MB/s
>> real 4m56.382s
>> user 0m0.461s
>> sys 3m12.662s
>>
>> Sync=disabled, block size 1M
>> 204800000000 bytes (205 GB) copied, 117.774 s, 1.7 GB/s
>> real 1m57.777s
>> user 0m0.237s
>> sys 1m57.466s
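For anyone wanting to reproduce a comparison like the one above, a minimal sketch would look something like this (the dataset name, file path, and count are made up; the original commands were not shown, and /dev/zero flatters lz4 because all-zero blocks compress away almost entirely):

    zfs set sync=always pool04/test
    time dd if=/dev/zero of=/pool04/test/ddfile bs=1M count=195313

    zfs set sync=disabled pool04/test
    time dd if=/dev/zero of=/pool04/test/ddfile bs=1M count=195313

A less compressible source (for example a pre-generated file of random data) gives numbers closer to what a real workload will see.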
>>
>> ... while doing this I was looking at my FIO cards. I think the reason is that the SLCs need more power to deliver higher performance; they are supposed to deliver 1.5 GB/sec but only deliver around 350 MB/sec each...
>>
>> Now looking for aux power cables and will retest...
>>
>> Br,
>> Rune
>>
>> -----Original Message-----
>> From: Richard Elling [mailto:richard.elling at richardelling.com]
>> Sent: Friday, October 31, 2014 9:03 AM
>> To: Eric Sproul
>> Cc: Rune Tipsmark; omnios-discuss at lists.omniti.com
>> Subject: Re: [OmniOS-discuss] zfs pool 100% busy, disks less than 10%
>>
>>
>> On Oct 31, 2014, at 7:14 AM, Eric Sproul <eric.sproul at circonus.com> wrote:
>>
>>> On Fri, Oct 31, 2014 at 2:33 AM, Rune Tipsmark <rt at steait.net> wrote:
>>>
>>>> Why is this pool showing near 100% busy when the underlying disks
>>>> are doing nothing at all....
>>>
>>> Simply put, it's just how the accounting works in iostat. It treats
>>> the pool like any other device, so if there is even one outstanding
>>> request to the pool, it counts towards the busy%. Keith W. from
>>> Joyent explained this recently on the illumos-zfs list:
>>> http://www.listbox.com/member/archive/182191/2014/10/sort/time_rev/page/3/entry/18:93/20141017161955:F3E11AB2-563A-11E4-8EDC-D0C677981E2F/
>>>
>>> The TL;DR is: if your pool has more than one disk in it, the
>>> pool-wide busy% is useless.
>>
>> FWIW, we use %busy as an indicator that we can ignore a device/subsystem when looking for performance problems. We don't use it as an indicator of problems. In other words, if the device isn't > 10% busy, forgetabouddit. If it is more busy, look in more detail at the meaningful performance indicators.
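A sketch of that workflow: skim the per-device lines of

    iostat -xn 1

ignore anything whose %b sits in the single digits, and for the busier devices read actv (outstanding I/Os), asvc_t (average service time, in ms), and wsvc_t (time spent queued before being issued, in ms) rather than %b itself.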
>> -- richard
>>
>> _______________________________________________
>> OmniOS-discuss mailing list
>> OmniOS-discuss at lists.omniti.com
>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>