[OmniOS-discuss] Slow scrub performance
wuffers
moo at wuffers.net
Tue Jul 29 00:11:32 UTC 2014
Does this look normal?
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0 in 0h3m with 0 errors on Tue Jul 15 09:36:17 2014
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c4t0d0s0  ONLINE       0     0     0
            c4t1d0s0  ONLINE       0     0     0

errors: No known data errors
  pool: tank
 state: ONLINE
  scan: scrub in progress since Mon Jul 14 17:54:42 2014
    6.59T scanned out of 24.2T at 5.71M/s, (scan is slow, no estimated time)
    0 repaired, 27.25% done
config:

        NAME                        STATE     READ WRITE CKSUM
        tank                        ONLINE       0     0     0
          mirror-0                  ONLINE       0     0     0
            c1t5000C50055F9F637d0   ONLINE       0     0     0
            c1t5000C50055F9EF2Fd0   ONLINE       0     0     0
          mirror-1                  ONLINE       0     0     0
            c1t5000C50055F87D97d0   ONLINE       0     0     0
            c1t5000C50055F9D3B3d0   ONLINE       0     0     0
          mirror-2                  ONLINE       0     0     0
            c1t5000C50055E6606Fd0   ONLINE       0     0     0
            c1t5000C50055F9F92Bd0   ONLINE       0     0     0
          mirror-3                  ONLINE       0     0     0
            c1t5000C50055F856CFd0   ONLINE       0     0     0
            c1t5000C50055F9FE87d0   ONLINE       0     0     0
          mirror-4                  ONLINE       0     0     0
            c1t5000C50055F84A97d0   ONLINE       0     0     0
            c1t5000C50055FA0AF7d0   ONLINE       0     0     0
          mirror-5                  ONLINE       0     0     0
            c1t5000C50055F9D3E3d0   ONLINE       0     0     0
            c1t5000C50055F9F0B3d0   ONLINE       0     0     0
          mirror-6                  ONLINE       0     0     0
            c1t5000C50055F8A46Fd0   ONLINE       0     0     0
            c1t5000C50055F9FB8Bd0   ONLINE       0     0     0
          mirror-7                  ONLINE       0     0     0
            c1t5000C50055F8B21Fd0   ONLINE       0     0     0
            c1t5000C50055F9F89Fd0   ONLINE       0     0     0
          mirror-8                  ONLINE       0     0     0
            c1t5000C50055F8BE3Fd0   ONLINE       0     0     0
            c1t5000C50055F9E123d0   ONLINE       0     0     0
          mirror-9                  ONLINE       0     0     0
            c1t5000C50055F9379Bd0   ONLINE       0     0     0
            c1t5000C50055F9E7D7d0   ONLINE       0     0     0
          mirror-10                 ONLINE       0     0     0
            c1t5000C50055E65F0Fd0   ONLINE       0     0     0
            c1t5000C50055F9F80Bd0   ONLINE       0     0     0
          mirror-11                 ONLINE       0     0     0
            c1t5000C50055F8A22Bd0   ONLINE       0     0     0
            c1t5000C50055F8D48Fd0   ONLINE       0     0     0
          mirror-12                 ONLINE       0     0     0
            c1t5000C50055E65807d0   ONLINE       0     0     0
            c1t5000C50055F8BFA3d0   ONLINE       0     0     0
          mirror-13                 ONLINE       0     0     0
            c1t5000C50055E579F7d0   ONLINE       0     0     0
            c1t5000C50055E65877d0   ONLINE       0     0     0
          mirror-14                 ONLINE       0     0     0
            c1t5000C50055F9FA1Fd0   ONLINE       0     0     0
            c1t5000C50055F8CDA7d0   ONLINE       0     0     0
          mirror-15                 ONLINE       0     0     0
            c1t5000C50055F8BF9Bd0   ONLINE       0     0     0
            c1t5000C50055F9A607d0   ONLINE       0     0     0
          mirror-16                 ONLINE       0     0     0
            c1t5000C50055E66503d0   ONLINE       0     0     0
            c1t5000C50055E4FDE7d0   ONLINE       0     0     0
          mirror-17                 ONLINE       0     0     0
            c1t5000C50055F8E017d0   ONLINE       0     0     0
            c1t5000C50055F9F3EBd0   ONLINE       0     0     0
          mirror-18                 ONLINE       0     0     0
            c1t5000C50055F8B80Fd0   ONLINE       0     0     0
            c1t5000C50055F9F63Bd0   ONLINE       0     0     0
          mirror-19                 ONLINE       0     0     0
            c1t5000C50055F84FB7d0   ONLINE       0     0     0
            c1t5000C50055F9FEABd0   ONLINE       0     0     0
          mirror-20                 ONLINE       0     0     0
            c1t5000C50055F8CCAFd0   ONLINE       0     0     0
            c1t5000C50055F9F91Bd0   ONLINE       0     0     0
          mirror-21                 ONLINE       0     0     0
            c1t5000C50055E65ABBd0   ONLINE       0     0     0
            c1t5000C50055F8905Fd0   ONLINE       0     0     0
          mirror-22                 ONLINE       0     0     0
            c1t5000C50055E57A5Fd0   ONLINE       0     0     0
            c1t5000C50055F87E73d0   ONLINE       0     0     0
          mirror-23                 ONLINE       0     0     0
            c1t5000C50055E66053d0   ONLINE       0     0     0
            c1t5000C50055E66B63d0   ONLINE       0     0     0
          mirror-24                 ONLINE       0     0     0
            c1t5000C50055F8723Bd0   ONLINE       0     0     0
            c1t5000C50055F8C3ABd0   ONLINE       0     0     0
        logs
          c2t5000A72A3007811Dd0     ONLINE       0     0     0
        cache
          c2t500117310015D579d0     ONLINE       0     0     0
          c2t50011731001631FDd0     ONLINE       0     0     0
          c12t500117310015D59Ed0    ONLINE       0     0     0
          c12t500117310015D54Ed0    ONLINE       0     0     0
        spares
          c1t5000C50055FA2AEFd0     AVAIL
          c1t5000C50055E595B7d0     AVAIL

errors: No known data errors
---
This is a ~90TB SAN on r151008, built from 25 mirrored pairs of 4TB drives. The
last scrub I ran was about three months ago and took, from my recollection,
roughly 250 hours. I've only run about four scrubs so far on this installation.
The current scrub has been running for two weeks with no end in sight; the last
time I saw an estimate, it showed roughly 650 hours remaining.
This thread from over three years ago,
http://comments.gmane.org/gmane.os.solaris.opensolaris.zfs/46021, mentions
lowering metaslab_min_alloc_size (from 10MB to 4K) as a way to improve this.
Further reading on that tunable led me to this illumos bug,
https://www.illumos.org/issues/54, which states: "Turns out this tunable is
made irrelevant as a result of a change to use the metaslab_df_ops allocator.
We don't need to change it. I'm closing this bug." So that seems like a dead
end to me.
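(For what it's worth, if I wanted to confirm that on this box, I'd expect to be
able to read the live value with mdb. I'm assuming the usual illumos kernel
debugger syntax here and haven't actually run these on this system:

# echo "metaslab_min_alloc_size/J" | mdb -k           <- print the current 64-bit value (hex)
# echo "metaslab_min_alloc_size/Z 0x1000" | mdb -kw   <- the old thread's tweak: set it to 4K

Given the bug report above, though, I wouldn't expect the write to make any
difference.)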
This is the current load with the scrub running (~350 VMs across Hyper-V and
VMware environments):
# iostat -xnze
extended device statistics                                    ---- errors ----
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b s/w h/w trn tot device
0.4 12.5 39.7 78.8 0.1 0.0 5.0 0.1 0 0 0 0 0 0 rpool
0.2 6.9 19.9 39.4 0.0 0.0 0.0 0.1 0 0 0 0 0 0 c4t0d0
0.2 6.8 19.9 39.4 0.0 0.0 0.0 0.1 0 0 0 0 0 0 c4t1d0
4.4 29.3 209.7 962.7 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8723Bd0
4.7 25.1 209.4 962.3 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055E66B63d0
4.7 27.6 208.3 952.7 0.0 0.0 0.0 1.3 0 3 0 0 0 0 c1t5000C50055F87E73d0
4.4 28.6 209.1 974.3 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8BFA3d0
4.4 28.9 208.3 964.5 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9E123d0
4.4 25.7 208.7 955.7 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F0B3d0
4.4 26.5 209.1 960.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9D3B3d0
4.3 25.2 206.6 936.1 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055E4FDE7d0
4.4 26.9 208.1 982.6 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9A607d0
4.4 24.5 208.7 955.4 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055F8CDA7d0
4.3 26.5 207.8 943.8 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055E65877d0
4.4 27.7 208.0 961.1 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9E7D7d0
4.3 26.0 208.0 953.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055FA0AF7d0
4.3 26.1 208.0 966.2 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9FE87d0
4.4 28.5 208.6 965.3 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F91Bd0
4.3 26.7 207.2 945.0 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9FEABd0
4.4 26.5 209.3 980.1 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F63Bd0
4.3 26.1 207.6 944.3 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055F9F3EBd0
4.3 26.5 208.1 954.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F80Bd0
32.5 14.7 1005.6 751.2 0.0 0.0 0.0 0.3 0 1 0 0 0 0 c2t500117310015D579d0
32.5 14.7 1004.1 751.2 0.0 0.0 0.0 0.3 0 1 0 0 0 0 c2t50011731001631FDd0
0.0 180.8 0.0 16434.5 0.0 0.3 0.0 1.6 0 4 0 0 0 0 c2t5000A72A3007811Dd0
4.4 25.3 208.7 966.7 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9FB8Bd0
4.4 26.3 208.5 949.1 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F92Bd0
4.4 29.7 208.6 975.1 0.0 0.0 0.0 1.3 0 3 0 0 0 0 c1t5000C50055F8905Fd0
4.4 25.7 207.9 954.1 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8D48Fd0
4.4 26.8 208.4 967.4 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F89Fd0
4.4 28.5 208.1 964.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9EF2Fd0
4.4 29.4 209.5 962.7 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8C3ABd0
4.7 25.0 208.9 962.3 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055E66053d0
4.3 25.1 207.5 936.1 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055E66503d0
4.4 25.6 209.1 955.7 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9D3E3d0
4.3 26.6 207.4 945.0 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F84FB7d0
4.3 26.0 207.5 944.3 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055F8E017d0
4.3 26.4 207.1 943.8 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055E579F7d0
4.4 28.5 208.8 974.3 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055E65807d0
4.4 25.9 208.5 953.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F84A97d0
4.4 26.4 209.2 960.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F87D97d0
4.4 28.5 208.8 964.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F637d0
4.4 29.6 208.9 975.1 0.0 0.0 0.0 1.3 0 3 0 0 0 0 c1t5000C50055E65ABBd0
4.4 26.7 208.5 982.6 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8BF9Bd0
4.3 25.6 207.6 954.1 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055F8A22Bd0
4.4 27.6 208.2 961.1 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9379Bd0
4.7 27.6 208.3 952.8 0.0 0.0 0.0 1.3 0 3 0 0 0 0 c1t5000C50055E57A5Fd0
4.4 28.4 208.4 965.3 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8CCAFd0
4.4 26.4 208.9 980.1 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055F8B80Fd0
4.4 24.4 208.9 955.4 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055F9FA1Fd0
4.3 26.4 207.6 954.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055E65F0Fd0
4.4 28.8 208.3 964.5 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8BE3Fd0
4.3 26.7 207.4 967.4 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8B21Fd0
4.4 25.1 208.9 966.7 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8A46Fd0
4.4 26.0 209.7 966.2 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F856CFd0
4.4 26.2 209.0 949.1 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055E6606Fd0
32.5 14.7 1004.3 750.9 0.0 0.0 0.0 0.3 0 1 0 0 0 0 c12t500117310015D59Ed0
32.5 14.7 1004.4 751.3 0.0 0.0 0.0 0.3 0 1 0 0 0 0 c12t500117310015D54Ed0
349.1 646.9 14437.7 67437.3 52.7 2.6 52.9 2.6 12 37 0 0 0 0 tank
What should I be checking for? Is a scrub supposed to take this long (I already
thought the 10+ days for the previous one was long)? There don't appear to be
any hardware errors. Is the load too high (12% wait and 37% busy on the pool,
with an asvc_t of 2.6ms)?
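Unless someone has a better idea, my next step would be to look at the scrub
throttle tunables that come up in other threads: zfs_scrub_delay, zfs_scan_idle
and zfs_top_maxinflight. Assuming they still exist under those names on r151008
(I haven't verified that on this system), something like this should show the
current values, and temporarily dropping the delay should let the scrub compete
harder with the VM traffic, at the cost of guest latency:

# echo "zfs_scrub_delay/D" | mdb -k
# echo "zfs_scan_idle/D" | mdb -k
# echo "zfs_top_maxinflight/D" | mdb -k
# echo "zfs_scrub_delay/W 0" | mdb -kw   <- experiment only; note the original value first so it can be put back

I'd rather hear from people who have tuned this on similarly loaded pools
before poking at the kernel, though.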