[OmniOS-discuss] Mildly confusing ZFS iostat output
W Verb
wverb73 at gmail.com
Tue Jan 27 01:16:50 UTC 2015
Hello All,
I am mildly confused by something iostat does when displaying statistics
for a zpool. Before I begin rooting through the iostat source, does anyone
have an idea of why I am seeing high "wait" and "wsvc_t" values for "ppool"
when my devices apparently are not busy? I would have assumed that the
stats for the pool would be the sum of the stats for the vdevs...
                             extended device statistics
    r/s    w/s   kr/s     kw/s  wait actv wsvc_t asvc_t  %w  %b device
   10.0 9183.0   40.5 344942.0   0.0  1.8    0.0    0.2   0 178 c4
    1.0  187.0    4.0  19684.0   0.0  0.1    0.0    0.5   0   8 c4t5000C5006A597B93d0
    2.0  199.0   12.0  20908.0   0.0  0.1    0.0    0.6   0  12 c4t5000C500653DE049d0
    2.0  197.0    8.0  20788.0   0.0  0.2    0.0    0.8   0  15 c4t5000C5003607D87Bd0
    0.0  202.0    0.0  20908.0   0.0  0.1    0.0    0.6   0  11 c4t5000C5006A5903A2d0
    0.0  189.0    0.0  19684.0   0.0  0.1    0.0    0.5   0  10 c4t5000C500653DEE58d0
    5.0  957.0   16.5   1966.5   0.0  0.1    0.0    0.1   0   7 c4t50026B723A07AC78d0
    0.0  201.0    0.0  20787.9   0.0  0.1    0.0    0.7   0  14 c4t5000C5003604ED37d0
    0.0    0.0    0.0      0.0   0.0  0.0    0.0    0.0   0   0 c4t5000C500653E447Ad0
    0.0 3525.0    0.0 110107.7   0.0  0.5    0.0    0.2   0  51 c4t500253887000690Dd0
    0.0 3526.0    0.0 110107.7   0.0  0.5    0.0    0.1   1  50 c4t5002538870006917d0
   10.0 6046.0   40.5 344941.5 837.4  1.9  138.3    0.3  23  67 ppool
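To illustrate what I mean by "sum", here is a rough sketch of a one-liner that
compares the ppool row against the sum of the per-LUN rows. It assumes the same
column layout as above (which looks like "iostat -xn"), with "wait" in field 5,
"wsvc_t" in field 7, and the device name last:

iostat -xn 5 | awk '
    $NF ~ /^c4t/   { wsum += $5; ssum += $7 }                   # per-LUN rows
    $NF == "ppool" { print "ppool wait/wsvc_t:", $5, $7, "  LUN sums:", wsum, ssum
                     wsum = ssum = 0 }'

In the snapshot above, the per-LUN wait and wsvc_t values are all essentially
zero, yet the ppool row shows 837.4 and 138.3.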
For those following the VAAI thread, this is the system I will be using as
my testbed.
Here is the structure of ppool (taken at a different time than above):
root at sanbox:/root# zpool iostat -v ppool
                             capacity     operations    bandwidth
pool                       alloc   free   read  write   read  write
-------------------------  -----  -----  -----  -----  -----  -----
ppool                       191G  7.97T     23    637   140K  15.0M
  mirror                   63.5G  2.66T      7    133  46.3K   840K
    c4t5000C5006A597B93d0      -      -      1     13  24.3K   844K
    c4t5000C500653DEE58d0      -      -      1     13  24.1K   844K
  mirror                   63.6G  2.66T      7    133  46.5K   839K
    c4t5000C5006A5903A2d0      -      -      1     13  24.0K   844K
    c4t5000C500653DE049d0      -      -      1     13  24.6K   844K
  mirror                   63.5G  2.66T      7    133  46.8K   839K
    c4t5000C5003607D87Bd0      -      -      1     13  24.5K   843K
    c4t5000C5003604ED37d0      -      -      1     13  24.4K   843K
logs                           -      -      -      -      -      -
  mirror                    301M   222G      0    236      0  12.5M
    c4t5002538870006917d0      -      -      0    236      5  12.5M
    c4t500253887000690Dd0      -      -      0    236      5  12.5M
cache                          -      -      -      -      -      -
  c4t50026B723A07AC78d0    62.3G  11.4G     19    113  83.0K  1.07M
-------------------------  -----  -----  -----  -----  -----  -----
root at sanbox:/root# zfs get all ppool
NAME PROPERTY VALUE SOURCE
ppool type filesystem -
ppool creation Sat Jan 24 18:37 2015 -
ppool used 5.16T -
ppool available 2.74T -
ppool referenced 96K -
ppool compressratio 1.51x -
ppool mounted yes -
ppool quota none default
ppool reservation none default
ppool recordsize 128K default
ppool mountpoint /ppool default
ppool sharenfs off default
ppool checksum on default
ppool compression lz4 local
ppool atime on default
ppool devices on default
ppool exec on default
ppool setuid on default
ppool readonly off default
ppool zoned off default
ppool snapdir hidden default
ppool aclmode discard default
ppool aclinherit restricted default
ppool canmount on default
ppool xattr on default
ppool copies 1 default
ppool version 5 -
ppool utf8only off -
ppool normalization none -
ppool casesensitivity sensitive -
ppool vscan off default
ppool nbmand off default
ppool sharesmb off default
ppool refquota none default
ppool refreservation none default
ppool primarycache all default
ppool secondarycache all default
ppool usedbysnapshots 0 -
ppool usedbydataset 96K -
ppool usedbychildren 5.16T -
ppool usedbyrefreservation 0 -
ppool logbias latency default
ppool dedup off default
ppool mlslabel none default
ppool sync standard local
ppool refcompressratio 1.00x -
ppool written 96K -
ppool logicalused 445G -
ppool logicalreferenced 9.50K -
ppool filesystem_limit none default
ppool snapshot_limit none default
ppool filesystem_count none default
ppool snapshot_count none default
ppool redundant_metadata all default
Currently, ppool contains a single 5TB zvol that I am hosting as an iSCSI
block device. At the vdev level, I have ensured that the ashift is 12 for
all devices, all physical devices are 4k-native SATA, and the cache/log
SSDs are also set for 4k. The block sizes are manually set in sd.conf, and
confirmed with "echo ::sd_state | mdb -k | egrep '(^un|_blocksize)'". The
zvol blocksize is 4k, and the iSCSI block transfer size is 512B (not that
it matters).
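For reference, the relevant sd.conf entries look something like the sketch
below. The product string for the spinners comes from the format output further
down (vendor field padded to eight characters); the SSD entries are analogous,
with their own inquiry strings:

sd-config-list =
    "ATA     ST3000DM001-1CH1", "physical-block-size:4096";

followed by "update_drv -vf sd" (or a reboot) so sd re-reads the file, and then
the mdb check quoted above. A quick "zfs get -r volblocksize ppool" confirms the
4k volblocksize on the zvol.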
All drives carry an EFI label with a single data slice, and are properly
aligned:
format> verify
Volume name = < >
ascii name = <ATA-ST3000DM001-1CH1-CC27-2.73TB>
bytes/sector = 512
sectors = 5860533167
accessible sectors = 5860533134
Part      Tag    Flag     First Sector          Size          Last Sector
  0        usr    wm                256        2.73TB          5860516750
  1 unassigned    wm                  0             0                   0
  2 unassigned    wm                  0             0                   0
  3 unassigned    wm                  0             0                   0
  4 unassigned    wm                  0             0                   0
  5 unassigned    wm                  0             0                   0
  6 unassigned    wm                  0             0                   0
  8   reserved    wm         5860516751        8.00MB          5860533134
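Slice 0 starting at sector 256 works out to 256 * 512 B = 128 KiB, a multiple
of 4 KiB. A quick sketch to check that on any of the disks (device name taken
from above; the "* 512" assumes the 512 bytes/sector that format reports):

prtvtoc /dev/rdsk/c4t5000C5006A597B93d0s0 | awk '
    $1 == "0" { print "slice 0 byte offset mod 4096:", ($4 * 512) % 4096 }'

A result of 0 means the slice starts on a 4 KiB boundary.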
I scrubbed the pool last night, which completed without error. From "zdb
ppool", I have extracted (with minor formatting):
                                           capacity   operations   bandwidth  ---- errors ----
description                               used avail  read write  read write  read write cksum
ppool                                     339G 7.82T 26.6K     0   175M     0     0     0     5
  mirror                                  113G 2.61T 8.87K     0  58.5M     0     0     0     2
    /dev/dsk/c4t5000C5006A597B93d0s0                 3.15K     0  48.8M     0     0     0     2
    /dev/dsk/c4t5000C500653DEE58d0s0                 3.10K     0  49.0M     0     0     0     2
  mirror                                  113G 2.61T 8.86K     0  58.5M     0     0     0     8
    /dev/dsk/c4t5000C5006A5903A2d0s0                 3.12K     0  48.7M     0     0     0     8
    /dev/dsk/c4t5000C500653DE049d0s0                 3.08K     0  48.9M     0     0     0     8
  mirror                                  113G 2.61T 8.86K     0  58.5M     0     0     0    10
    /dev/dsk/c4t5000C5003607D87Bd0s0                 2.48K     0  48.8M     0     0     0    10
    /dev/dsk/c4t5000C5003604ED37d0s0                 2.47K     0  48.9M     0     0     0    10
  log mirror                             44.0K  222G     0     0     37     0     0     0     0
    /dev/dsk/c4t5002538870006917d0s0                     0     0    290     0     0     0     0
    /dev/dsk/c4t500253887000690Dd0s0                     0     0    290     0     0     0     0
  Cache
    /dev/dsk/c4t50026B723A07AC78d0s0         0 73.8G     0     0     35     0     0     0     0
  Spare
    /dev/dsk/c4t5000C500653E447Ad0s0                     4     0   136K     0     0     0     0
This shows a few checksum errors, which is not consistent with the output of
"zpool status -v", and "iostat -eE" shows no physical error counts. I again see
a discrepancy between the "ppool" value and what I would expect, namely the sum
of the cksum errors for each vdev.
I also observed a ton of leaked space, which I expect from a live pool, as well
as this single error:
db_blkptr_cb: Got error 50 reading <96, 1, 2, 3fc8>
    DVA[0]=<1:1dc4962000:1000> DVA[1]=<2:1dc4654000:1000> [L2 zvol object]
    fletcher4 lz4 LE contiguous unique double size=4000L/a00P
    birth=52386L/52386P fill=4825
    cksum=c70e8a7765:f2adce34f59c:c8a289b51fe11d:7e0af40fe154aab4 -- skipping
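If I ever want to poke at the block that error refers to, I believe zdb -R can
read it straight off the vdev by DVA (vdev:offset:size, taken from the message
above); a sketch:

# dump the 0x1000-byte block at DVA[0] (vdev 1, offset 0x1dc4962000)
zdb -R ppool 1:1dc4962000:1000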
By the way, I also found:
Uberblock:
magic = 000000000*0bab10c*
Wow. Just wow.
-Warren V