[OmniOS-discuss] ZIL TXG commits happen very frequently - why?

Rune Tipsmark rt at steait.net
Wed Oct 14 23:41:13 UTC 2015


Thanks, that was helpful reading, although I'm not 100% sure where to start.


I did a bit of testing with the scripts, running IOmeter on a VM residing on a vSphere host connected to my ZFS box over 8Gbit Fibre Channel.


I noticed that txg commits rarely took over 1 second.


root at zfs10:/tmp# dtrace -s duration.d pool01
dtrace: script 'duration.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  7  17407       txg_sync_thread:txg-synced sync took 0.68 seconds
  8  17407       txg_sync_thread:txg-synced sync took 0.71 seconds
 12  17407       txg_sync_thread:txg-synced sync took 0.52 seconds
  7  17407       txg_sync_thread:txg-synced sync took 0.29 seconds
 22  17407       txg_sync_thread:txg-synced sync took 0.64 seconds
  1  17407       txg_sync_thread:txg-synced sync took 0.34 seconds
  5  17407       txg_sync_thread:txg-synced sync took 0.93 seconds
  0  17407       txg_sync_thread:txg-synced sync took 0.46 seconds
  9  17407       txg_sync_thread:txg-synced sync took 2.59 seconds
  0  17407       txg_sync_thread:txg-synced sync took 0.29 seconds
  7  17407       txg_sync_thread:txg-synced sync took 1.31 seconds
  8  17407       txg_sync_thread:txg-synced sync took 0.71 seconds
 10  17407       txg_sync_thread:txg-synced sync took 0.67 seconds
  8  17407       txg_sync_thread:txg-synced sync took 0.29 seconds
 12  17407       txg_sync_thread:txg-synced sync took 0.58 seconds
  1  17407       txg_sync_thread:txg-synced sync took 0.46 seconds
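
For reference, duration.d is essentially the txg sync timer from the write-throttle posts Chip linked below; a minimal sketch matching the txg-synced output above would look something like this (probe names are taken from the output, the exact field access is an assumption):

#!/usr/sbin/dtrace -s
/* duration.d <pool>: time each txg sync for the named pool (sketch) */

txg-syncing
/((dsl_pool_t *)arg0)->dp_spa->spa_name == $$1/
{
        /* a txg for this pool has started syncing */
        start = timestamp;
}

txg-synced
/start && ((dsl_pool_t *)arg0)->dp_spa->spa_name == $$1/
{
        /* the same txg finished; print the wall-clock duration */
        this->d = timestamp - start;
        printf("sync took %d.%02d seconds",
            this->d / 1000000000, (this->d / 10000000) % 100);
        start = 0;
}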

I also noticed that the default 4GB dirty-data limit was never reached; the peak I saw was just over 1GB, and most of the time it was around half that.

 0  17408      txg_sync_thread:txg-syncing 1179MB of 4096MB used
  9  17408      txg_sync_thread:txg-syncing  482MB of 4096MB used
 20  17408      txg_sync_thread:txg-syncing  686MB of 4096MB used
  0  17408      txg_sync_thread:txg-syncing  429MB of 4096MB used
 14  17408      txg_sync_thread:txg-syncing  328MB of 4096MB used
 10  17408      txg_sync_thread:txg-syncing  374MB of 4096MB used
  8  17408      txg_sync_thread:txg-syncing  510MB of 4096MB used
 12  17408      txg_sync_thread:txg-syncing  210MB of 4096MB used
  1  17408      txg_sync_thread:txg-syncing  268MB of 4096MB used
  0  17408      txg_sync_thread:txg-syncing  432MB of 4096MB used
 16  17408      txg_sync_thread:txg-syncing  236MB of 4096MB used
 18  17408      txg_sync_thread:txg-syncing  341MB of 4096MB used
  9  17408      txg_sync_thread:txg-syncing  361MB of 4096MB used
 14  17408      txg_sync_thread:txg-syncing  597MB of 4096MB used
 10  17408      txg_sync_thread:txg-syncing  357MB of 4096MB used
 21  17408      txg_sync_thread:txg-syncing  437MB of 4096MB used
 18  17408      txg_sync_thread:txg-syncing  637MB of 4096MB used
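
The second script simply reports how much dirty data each txg is syncing against the zfs_dirty_data_max limit; a sketch matching the "NMB of 4096MB used" lines above (field names assumed, per the same blog posts):

#!/usr/sbin/dtrace -s
/* dirty.d <pool>: dirty data vs. zfs_dirty_data_max at each txg sync (sketch) */

txg-syncing
/((dsl_pool_t *)arg0)->dp_spa->spa_name == $$1/
{
        printf("%4dMB of %4dMB used",
            ((dsl_pool_t *)arg0)->dp_dirty_total / 1024 / 1024,
            `zfs_dirty_data_max / 1024 / 1024);
}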

I did not see any significant write latency, but I did see high read latency, which is odd since it doesn't match what I experience on the VM or the vSphere host.

root at zfs10:/tmp# dtrace -s rw.d -c 'sleep 60'

  write
           value  ------------- Distribution ------------- count
               8 |                                         0
              16 |                                         27
              32 |@                                        11222
              64 |@@@@@@@                                  106215
             128 |@@@@@@@@@@@@@@@@@@@@@@                   327807
             256 |@@@@@@                                   94605
             512 |@                                        20467
            1024 |                                         6067
            2048 |@                                        7968
            4096 |@                                        10076
            8192 |                                         3380
           16384 |                                         249
           32768 |                                         214
           65536 |                                         219
          131072 |                                         77
          262144 |                                         1
          524288 |                                         0

  read
           value  ------------- Distribution ------------- count
               4 |                                         0
               8 |                                         18
              16 |                                         58
              32 |                                         174
              64 |@@@@@@                                   4322
             128 |@@@@@@@@@                                6278
             256 |@@@                                      2545
             512 |@                                        892
            1024 |@                                        1074
            2048 |@@                                       1171
            4096 |@@@                                      2222
            8192 |@@@@@@                                   4103
           16384 |@@@                                      2400
           32768 |@@                                       1401
           65536 |@@                                       1504
          131072 |@                                        897
          262144 |@                                        427
          524288 |                                         39
         1048576 |                                         1
         2097152 |                                         0

                               avg latency      stddev        iops  throughput
write                                496us      3136us      9809/s   450773k/s
read                               22633us     59917us       492/s    17405k/s
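
rw.d tracks per-I/O latency and prints the distributions plus the summary above. As a rough stand-in (not necessarily the exact script used here), something along these lines with the stock io provider gives a similar read/write breakdown in microseconds:

#!/usr/sbin/dtrace -s
/* rw-latency.d: approximate read/write latency distributions (microseconds) */

io:::start
{
        /* remember when each buffer was issued */
        start[args[0]->b_edev, args[0]->b_blkno] = timestamp;
}

io:::done
/start[args[0]->b_edev, args[0]->b_blkno]/
{
        this->us = (timestamp - start[args[0]->b_edev, args[0]->b_blkno]) / 1000;
        @dist[args[0]->b_flags & B_READ ? "read" : "write"] = quantize(this->us);
        @avglat[args[0]->b_flags & B_READ ? "read" : "write"] = avg(this->us);
        start[args[0]->b_edev, args[0]->b_blkno] = 0;
}

Run the same way with -c 'sleep 60', the aggregations print once the sleep exits.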

I also monitor how busy each disk is, and I don't see any significant load there either... here is an example:

[attached screenshot: pastedImage.png - per-disk busy/utilization graph; https://omniosce.org/ml-archive/attachments/20151014/8fa19e83/attachment-0001.png]

So I'm a bit lost as to what to do next. I don't see any stress on the system in terms of writes, but I still cannot max out the 8Gbit FC... Reads, however, are doing fairly well, getting just over 700MB/sec, which is acceptable over 8Gbit FC. Writes tend to be between 350 and 450MB/sec... they should get up to 700MB/sec as well.

Any ideas where to start?

br,
Rune





________________________________
From: Schweiss, Chip <chip at innovates.com>
Sent: Wednesday, October 14, 2015 2:44 PM
To: Rune Tipsmark
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] ZIL TXG commits happen very frequently - why?

It all has to do with the write throttle and buffers filling. Here are a couple of great blog posts on how it works and how it's tuned:

http://dtrace.org/blogs/ahl/2014/02/10/the-openzfs-write-throttle/

http://dtrace.org/blogs/ahl/2014/08/31/openzfs-tuning/
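
The knobs behind it on illumos are the dirty-data tunables; checking where they sit is just mdb, and the /etc/system sizes below are purely illustrative, not a recommendation:

# How much dirty data is allowed in RAM before the throttle kicks in
# (default is 10% of physical memory, capped at 4GB):
echo "zfs_dirty_data_max/E" | mdb -k

# How much dirty data triggers an early txg sync (default 64MB) - this is
# typically why txgs commit far more often than the 5-second timeout:
echo "zfs_dirty_data_sync/E" | mdb -k

# If you decide to change them, /etc/system entries look like this
# (sizes purely illustrative, read the posts above before touching them):
#   set zfs:zfs_dirty_data_max = 0x400000000
#   set zfs:zfs_dirty_data_sync = 0x10000000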

-Chip


On Wed, Oct 14, 2015 at 12:45 AM, Rune Tipsmark <rt at steait.net> wrote:
Hi all.

I'm wondering if anyone could shed some light on why my ZFS pool performs TXG commits up to 5 times per second. It's set to the default 5-second interval, and occasionally it does wait 5 seconds between commits, but only when nearly idle.

I'm not sure whether this impacts my performance, but I suspect it doesn't improve it. I force sync on all data.
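
(Quick ways to confirm the sync setting and the default txg interval, for completeness; these are the standard property and tunable names:)

# confirm sync is forced (expect sync=always) on the pool's datasets
zfs get -r sync pool01

# confirm the txg commit interval is still the default 5 seconds
echo "zfs_txg_timeout/D" | mdb -k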

I have 11 mirrors (7200rpm SAS disks), two SLOG devices, two L2ARC devices, and a pair of spare disks.

Each log device can hold 150GB of data, so plenty for two TXG commits. The system has 384GB of memory.

Below is a bit of output from zilstat during a near-idle time this morning, so you won't see 4-5 commits per second, but under load later today it will happen.

root at zfs10:/tmp# ./zilstat.ksh -M -t -p pool01 txg
waiting for txg commit...
TIME                        txg       N-MB     N-MB/s N-Max-Rate       B-MB     B-MB/s B-Max-Rate    ops  <=4kB 4-32kB >=32kB
2015 Oct 14 06:21:19   10872771          3          3          0         21         21          2    234     14     19    201
2015 Oct 14 06:21:22   10872772         10          3          3         70         23         24    806      0     84    725
2015 Oct 14 06:21:24   10872773         12          6          5         56         28         26    682     17    107    558
2015 Oct 14 06:21:25   10872774         13         13          2         75         75         14    651      0     10    641
2015 Oct 14 06:21:25   10872775          0          0          0          0          0          0      1      0      0      1
2015 Oct 14 06:21:26   10872776         11         11          6         53         53         29    645      2    136    507
2015 Oct 14 06:21:30   10872777         11          2          4         81         20         32    873     11     60    804
2015 Oct 14 06:21:30   10872778          0          0          0          0          0          0      1      0      1      0
2015 Oct 14 06:21:31   10872779         12         12         11         56         56         52    631      0      8    623
2015 Oct 14 06:21:33   10872780         11          5          4         74         37         27    858      0     44    814
2015 Oct 14 06:21:36   10872781         14          4          6         79         26         30    977     12     82    883
2015 Oct 14 06:21:39   10872782         11          3          4         78         26         25    957     18     55    884
2015 Oct 14 06:21:43   10872783         13          3          4         80         20         24    930      0    135    795
2015 Oct 14 06:21:46   10872784         13          4          4         81         27         29    965     13     95    857
2015 Oct 14 06:21:49   10872785         11          3          6         80         26         41   1077     12    215    850
2015 Oct 14 06:21:53   10872786          9          3          2         67         22         18    870      1     74    796
2015 Oct 14 06:21:56   10872787         12          3          5         72         18         26    909     17    163    729
2015 Oct 14 06:21:58   10872788         12          6          3         53         26         21    530      0     33    497
2015 Oct 14 06:21:59   10872789         26         26         24         72         72         62    882     12     60    810
2015 Oct 14 06:22:02   10872790          9          3          5         57         19         28    777      0     70    708
2015 Oct 14 06:22:07   10872791         11          2          3         96         24         22   1044     12     46    986
2015 Oct 14 06:22:10   10872792         13          3          4         78         19         22    911     12     38    862
2015 Oct 14 06:22:14   10872793         11          2          4         79         19         26    930     10     94    826
2015 Oct 14 06:22:17   10872794         11          3          5         73         24         26   1054     17    151    886
2015 Oct 14 06:22:17   10872795          0          0          0          0          0          0      2      0      0      2
2015 Oct 14 06:22:18   10872796         40         40         38         78         78         60    707      0     28    680
2015 Oct 14 06:22:22   10872797         10          3          3         66         22         21    937     14    164    759
2015 Oct 14 06:22:25   10872798          9          2          2         66         16         21    821     11     92    718
2015 Oct 14 06:22:28   10872799         24         12         14         80         40         43    750      0     23    727
2015 Oct 14 06:22:28   10872800          0          0          0          0          0          0      2      0      0      2
2015 Oct 14 06:22:29   10872801         15          7          9         49         24         24    526     11     25    490
2015 Oct 14 06:22:33   10872802         10          2          3         79         19         24    939      0     63    876
2015 Oct 14 06:22:36   10872803         10          5          3         59         29         18    756     11     65    682
2015 Oct 14 06:22:36   10872804          0          0          0          0          0          0      0      0      0      0
2015 Oct 14 06:22:36   10872805         13         13          2         58         58          9    500      0     29    471

--

root at zfs10:/tmp# zpool status pool01
  pool: pool01
state: ONLINE
  scan: scrub repaired 0 in 7h53m with 0 errors on Sat Oct  3 06:53:43 2015
config:

        NAME                       STATE     READ WRITE CKSUM
        pool01                     ONLINE       0     0     0
          mirror-0                 ONLINE       0     0     0
            c4t5000C50055FC9533d0  ONLINE       0     0     0
            c4t5000C50055FE6A63d0  ONLINE       0     0     0
          mirror-1                 ONLINE       0     0     0
            c4t5000C5005708296Fd0  ONLINE       0     0     0
            c4t5000C5005708351Bd0  ONLINE       0     0     0
          mirror-2                 ONLINE       0     0     0
            c4t5000C500570858EFd0  ONLINE       0     0     0
            c4t5000C50057085A6Bd0  ONLINE       0     0     0
          mirror-3                 ONLINE       0     0     0
            c4t5000C50057086307d0  ONLINE       0     0     0
            c4t5000C50057086B67d0  ONLINE       0     0     0
          mirror-4                 ONLINE       0     0     0
            c4t5000C500570870D3d0  ONLINE       0     0     0
            c4t5000C50057089753d0  ONLINE       0     0     0
          mirror-5                 ONLINE       0     0     0
            c4t5000C500625B7EA7d0  ONLINE       0     0     0
            c4t5000C500625B8137d0  ONLINE       0     0     0
          mirror-6                 ONLINE       0     0     0
            c4t5000C500625B8427d0  ONLINE       0     0     0
            c4t5000C500625B86E3d0  ONLINE       0     0     0
          mirror-7                 ONLINE       0     0     0
            c4t5000C500625B886Fd0  ONLINE       0     0     0
            c4t5000C500625BB773d0  ONLINE       0     0     0
          mirror-8                 ONLINE       0     0     0
            c4t5000C500625BC2C3d0  ONLINE       0     0     0
            c4t5000C500625BD3EBd0  ONLINE       0     0     0
          mirror-9                 ONLINE       0     0     0
            c4t5000C50062878C0Bd0  ONLINE       0     0     0
            c4t5000C50062878C43d0  ONLINE       0     0     0
          mirror-10                ONLINE       0     0     0
            c4t5000C50062879687d0  ONLINE       0     0     0
            c4t5000C50062879707d0  ONLINE       0     0     0
        logs
          c11d0                    ONLINE       0     0     0
          c10d0                    ONLINE       0     0     0
        cache
          c14d0                    ONLINE       0     0     0
          c15d0                    ONLINE       0     0     0
        spares
          c4t5000C50062879723d0    AVAIL
          c4t5000C50062879787d0    AVAIL

errors: No known data errors


Br,
Rune

_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss at lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss

