[OmniOS-discuss] write amplification zvol

Stephan Budach stephan.budach at jvm.de
Thu Sep 28 08:33:17 UTC 2017


----- Original Message ----- 

> From: "anthony omnios" <icoomnios at gmail.com>
> To: "Richard Elling" <richard.elling at richardelling.com>
> CC: omnios-discuss at lists.omniti.com
> Sent: Thursday, 28 September 2017 09:56:42
> Subject: Re: [OmniOS-discuss] write amplification zvol

> Thanks Richard for your help.

> My problem is that I have iSCSI network traffic of 2 MB/s, so every 5
> seconds I only need to write about 10 MB of network traffic to disk, but
> on pool filervm2 I am writing much more than that, approximately 60 MB
> every 5 seconds. Each SSD in filervm2 is writing 15 MB every 5
> seconds. When I check with smartmontools, every SSD is writing
> approximately 250 GB of data each day.

> How can I reduce the amount of data written to each SSD? I have tried
> reducing the block size of the zvol, but it changed nothing.

> Anthony
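
A good first step for pinning down where the extra bytes go is to watch the
pool-level and per-vdev write bandwidth over the same interval as the iSCSI
traffic. A minimal sketch, assuming the pool name filervm2 from above:

  # per-vdev write bandwidth, sampled every 5 seconds
  zpool iostat -v filervm2 5

  # the same picture from the OS side, skipping idle devices
  iostat -xnz 5

Comparing the zvol-level traffic with the slog and data-vdev bandwidth in the
same 5-second window shows how much of the difference is ZIL traffic versus
regular pool writes.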

> 2017-09-28 1:29 GMT+02:00 Richard Elling <richard.elling at richardelling.com>:

> > Comment below...
> 

> > > On Sep 27, 2017, at 12:57 AM, anthony omnios <icoomnios at gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > I have a problem: I use many iSCSI zvols (one per VM). Network
> > > traffic between the KVM host and the filer is 2 MB/s, but I write
> > > much more than that to the disks. I use a pool with a separate
> > > mirrored ZIL (Intel S3710) and 8 Samsung 850 EVO 1 TB SSDs.
> 
> > > zpool status
> > >   pool: filervm2
> > >  state: ONLINE
> > >   scan: resilvered 406G in 0h22m with 0 errors on Wed Sep 20 15:45:48 2017
> > > config:
> > >
> > >         NAME                       STATE     READ WRITE CKSUM
> > >         filervm2                   ONLINE       0     0     0
> > >           mirror-0                 ONLINE       0     0     0
> > >             c7t5002538D41657AAFd0  ONLINE       0     0     0
> > >             c7t5002538D41F85C0Dd0  ONLINE       0     0     0
> > >           mirror-2                 ONLINE       0     0     0
> > >             c7t5002538D41CC7105d0  ONLINE       0     0     0
> > >             c7t5002538D41CC7127d0  ONLINE       0     0     0
> > >           mirror-3                 ONLINE       0     0     0
> > >             c7t5002538D41CD7F7Ed0  ONLINE       0     0     0
> > >             c7t5002538D41CD83FDd0  ONLINE       0     0     0
> > >           mirror-4                 ONLINE       0     0     0
> > >             c7t5002538D41CD7F7Ad0  ONLINE       0     0     0
> > >             c7t5002538D41CD7F7Dd0  ONLINE       0     0     0
> > >         logs
> > >           mirror-1                 ONLINE       0     0     0
> > >             c4t2d0                 ONLINE       0     0     0
> > >             c4t4d0                 ONLINE       0     0     0
> > >
> 
> > > I used the correct ashift of 13 for the Samsung 850 EVO.
> 
> > > zdb | grep ashift:
> > >
> > > ashift: 13
> > > ashift: 13
> > > ashift: 13
> > > ashift: 13
> > > ashift: 13
> > >
> 
> > > But I write a lot to the SSDs every 5 seconds (much more than the
> > > network traffic of 2 MB/s).
> 
> > >
> > > iostat -xn -d 1:
> > >
> > > r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
> > > 11.0 3067.5 288.3 153457.4 6.8 0.5 2.2 0.2 5 14 filervm2

> > filervm2 is seeing 3067 writes per second. This is the interface to
> > the upper layers.
> 
> > These writes are small.
> 

> > > 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 rpool
> > > 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t0d0
> > > 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t1d0
> > > 0.0 552.6 0.0 17284.0 0.0 0.1 0.0 0.2 0 8 c4t2d0
> > > 0.0 552.6 0.0 17284.0 0.0 0.1 0.0 0.2 0 8 c4t4d0
> 

> > The log devices are seeing 552 writes per second, and since
> > sync=standard that means that the upper layers are requesting syncs.
> 
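
If you want to see how much of this is synchronous (ZIL) traffic, one rough
way is to count zil_commit() calls while the workload runs. This is only a
sketch and assumes the fbt provider can probe zil_commit on this kernel:

  # count ZIL commits in 5-second buckets (run as root)
  dtrace -n 'fbt::zil_commit:entry { @c = count(); } tick-5sec { printa(@c); clear(@c); }'

Together with the kw/s on c4t2d0/c4t4d0 above, that gives a feel for how much
of the write stream is being committed synchronously by the initiator.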

> > > 1.0 233.3 48.1 10051.6 0.0 0.0 0.0 0.1 0 3 c7t5002538D41657AAFd0
> > > 5.0 250.3 144.2 13207.3 0.0 0.0 0.0 0.1 0 3 c7t5002538D41CC7127d0
> > > 2.0 254.3 24.0 13207.3 0.0 0.0 0.0 0.1 0 4 c7t5002538D41CC7105d0
> > > 3.0 235.3 72.1 10051.6 0.0 0.0 0.0 0.1 0 3 c7t5002538D41F85C0Dd0
> > > 0.0 228.3 0.0 16178.7 0.0 0.0 0.0 0.2 0 4 c7t5002538D41CD83FDd0
> > > 0.0 225.3 0.0 16210.7 0.0 0.0 0.0 0.2 0 4 c7t5002538D41CD7F7Ed0
> > > 0.0 282.3 0.0 19991.1 0.0 0.0 0.0 0.2 0 5 c7t5002538D41CD7F7Dd0
> > > 0.0 280.3 0.0 19871.0 0.0 0.0 0.0 0.2 0 5 c7t5002538D41CD7F7Ad0
> 

> > The pool disks see 1989 writes per second total, or 994 writes per
> > second logically.
> >
> > It seems to me that reducing 3067 requested writes to 994 logical
> > writes is the opposite of amplification. What do you expect?
> 
> > -- richard
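
For what it is worth, the kw/s column in the same sample speaks to the
byte-level question rather than the IOPS one. Treat this as rough arithmetic
on a single 1-second sample:

  slog writes:       2 x 17284 KB/s  ~ 34.6 MB/s to the log mirror
                                       (~ 17.3 MB/s of unique log data)
  data vdev writes:  ~ 118.8 MB/s summed over the eight SSDs
                                       (~ 59.4 MB/s per mirror side)
  iSCSI payload:     ~ 2 MB/s

Part of that gap is expected: with a slog, synchronous writes are typically
written twice (once to the ZIL, again at transaction group commit), and with
volblocksize=64K a small guest write can dirty a whole 64K block plus
metadata. That ratio is presumably the amplification being asked about.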
> 

> > >
> 
> > > I used a zvol volblocksize of 64K; I tried 8K and the problem is the same.
> 
> > >
> > > zfs get all filervm2/hdd-110022a:
> > >
> > > NAME                  PROPERTY              VALUE                  SOURCE
> > > filervm2/hdd-110022a  type                  volume                 -
> > > filervm2/hdd-110022a  creation              Tue May 16 10:24 2017  -
> > > filervm2/hdd-110022a  used                  5.26G                  -
> > > filervm2/hdd-110022a  available             2.90T                  -
> > > filervm2/hdd-110022a  referenced            5.24G                  -
> > > filervm2/hdd-110022a  compressratio         3.99x                  -
> > > filervm2/hdd-110022a  reservation           none                   default
> > > filervm2/hdd-110022a  volsize               25G                    local
> > > filervm2/hdd-110022a  volblocksize          64K                    -
> > > filervm2/hdd-110022a  checksum              on                     default
> > > filervm2/hdd-110022a  compression           lz4                    local
> > > filervm2/hdd-110022a  readonly              off                    default
> > > filervm2/hdd-110022a  copies                1                      default
> > > filervm2/hdd-110022a  refreservation        none                   default
> > > filervm2/hdd-110022a  primarycache          all                    default
> > > filervm2/hdd-110022a  secondarycache        all                    default
> > > filervm2/hdd-110022a  usedbysnapshots       15.4M                  -
> > > filervm2/hdd-110022a  usedbydataset         5.24G                  -
> > > filervm2/hdd-110022a  usedbychildren        0                      -
> > > filervm2/hdd-110022a  usedbyrefreservation  0                      -
> > > filervm2/hdd-110022a  logbias               latency                default
> > > filervm2/hdd-110022a  dedup                 off                    default
> > > filervm2/hdd-110022a  mlslabel              none                   default
> > > filervm2/hdd-110022a  sync                  standard               local
> > > filervm2/hdd-110022a  refcompressratio      3.99x                  -
> > > filervm2/hdd-110022a  written               216K                   -
> > > filervm2/hdd-110022a  logicalused           20.9G                  -
> > > filervm2/hdd-110022a  logicalreferenced     20.9G                  -
> > > filervm2/hdd-110022a  snapshot_limit        none                   default
> > > filervm2/hdd-110022a  snapshot_count        none                   default
> > > filervm2/hdd-110022a  redundant_metadata    all                    default
> 
> > >
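
One detail worth noting here: the output above still shows volblocksize=64K,
and volblocksize can only be set when a zvol is created, so "trying 8K" only
takes effect on a freshly created volume. A rough sketch (the target name is
made up for illustration, and 8K only makes sense if it roughly matches the
block size the guest/LUN actually uses):

  # hypothetical: create a new 25G zvol with an 8K volblocksize
  zfs create -V 25G -o volblocksize=8k filervm2/hdd-110022a-8k

  # then migrate the LUN contents onto the new zvol and re-export it
  # over iSCSI before retiring the old volume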
> 
> > > Sorry for my bad English.
> > >
> > > What could be the problem? Thanks.
> 
> > >
> 
> > > Best regards,
> 
> > >
> 
> > > Anthony
> 

How did you set up your LUNs? In particular, what is the block size of those LUNs? Could it be that you went with the default of 512-byte blocks while the drives have 4k or even 8k blocks?
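
If the LUs were created through COMSTAR, the configured block size should be
visible with stmfadm. A sketch from memory, so double-check the property name
against stmfadm(1M) on your build:

  # show LU properties, including the block size the LU was created with
  stmfadm list-lu -v

  # when (re)creating an LU, the block size can be set explicitly, e.g.
  stmfadm create-lu -p blk=4096 /dev/zvol/rdsk/filervm2/hdd-110022a

A 512-byte LU block size on top of a 64K-volblocksize zvol can lead to exactly
this kind of read-modify-write inflation.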

Cheers,
Stephan

