[OmniOS-discuss] Status of TRIM support?

Dan Swartzendruber dswartz at druber.com
Wed May 28 01:11:07 UTC 2014


So I've been running with sync=disabled on my vsphere NFS datastore.  I've
been willing to do so because I have a big-ass UPS, and do hourly backups.
But I'm thinking of going to an active/passive connection to my JBOD,
following Saso's ZFS blog post at zfs-create.blogspot.com.  Here's why I
think I can't keep using sync=disabled (I would love to have my logic
sanity-checked).  If you switch manually from host A to B, all is well, since
before host A exports the pool, any pending writes will be completed (so
even though we lied to vsphere, it's okay.)  On the other hand, if host A
crashes/hangs and host B takes over, forcibly importing the pool, you
could end up with the following scenario: vsphere has writes to issue for
blocks A, B, C, D and E.  A and B have been written to disk.  C and D
were sent to host A and ACKed, so vsphere thinks all is well, but host A
goes down while blocks C and D are still only in its RAM (which is what
sync=disabled allows), not committed to disk.  Host B imports the pool,
assumes the virtual IP for the NFS share, and vsphere reconnects to the
datastore.  Since vsphere thinks it has written blocks A-D, it then
issues a write for block E.  Host B commits that to disk.  vsphere thinks
blocks A-E were written to disk, when in fact blocks C and D were not.
Silent data corruption, and
as far as I can tell, no way to know this happened, so if I ever did have
a forced failover, I would have to roll back every single VM to the last
known-good snapshot.

Anyway, I decided to see what would happen write-wise with an SLOG SSD.  I
took a Samsung 840 PRO that had been used for L2ARC and made that a log
device.  I ran CrystalDiskMark before and after.  Prior to
the SLOG, I was getting about 90MB/sec (gigabit enet), which is pretty
good.  Afterward, it went down to 8MB/sec!  I pulled the SSD and plugged
it into my Windows 7 workstation, formatted it and deleted the partition,
which should have TRIM'ed it.  I reinserted it as SLOG and re-ran the
test: 50MB/sec.  Still not great, but this is after all an MLC device,
not SLC, and that's probably 'good enough'.

Looking at open-zfs.org, it looks like out of illumos, FreeBSD and ZoL,
only FreeBSD has TRIM now.  I
don't want to have to re-TRIM the thing every few weeks (or however long
it takes for write performance to degrade again).  Does over-provisioning
help?
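
The forced-failover scenario described above can be sketched as a toy
Python model.  This is only an illustration under simplifying assumptions,
not ZFS or NFS code: the names Pool, Host and lost_acked_writes are made
up for the example, a "flush" stands in for a transaction group commit,
and the sync path is modelled as writing straight to stable storage where
a real pool would log to the ZIL/SLOG and replay it on import.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """Shared JBOD: only blocks recorded here survive a host crash."""
    on_disk: set = field(default_factory=set)


@dataclass
class Host:
    """Hypothetical storage head serving the pool over NFS."""
    pool: Pool
    sync_disabled: bool
    in_ram: list = field(default_factory=list)

    def write(self, block):
        if self.sync_disabled:
            # ACK is returned while the data is still only in RAM.
            self.in_ram.append(block)
        else:
            # Simplification: stands in for a commit to the ZIL/SLOG.
            self.pool.on_disk.add(block)
        return "ACK"

    def flush(self):
        # Stands in for a transaction group commit or a clean export.
        self.pool.on_disk.update(self.in_ram)
        self.in_ram.clear()


def lost_acked_writes(sync_disabled):
    """Return the blocks the client believes are written but are not."""
    pool = Pool()
    acked = set()

    host_a = Host(pool, sync_disabled)
    for block in ("A", "B"):
        host_a.write(block)
        acked.add(block)
    host_a.flush()                 # A and B reach disk

    for block in ("C", "D"):
        host_a.write(block)        # ACKed by host A
        acked.add(block)
    # Host A crashes here: no flush, its RAM contents are gone.

    host_b = Host(pool, sync_disabled)   # forced import on host B
    host_b.write("E")
    acked.add("E")
    host_b.flush()

    return acked - pool.on_disk


if __name__ == "__main__":
    print(sorted(lost_acked_writes(sync_disabled=True)))   # ['C', 'D']
    print(sorted(lost_acked_writes(sync_disabled=False)))  # []
```

With sync_disabled=True the client holds ACKs for C and D that never
reached the pool, and nothing on host B records that they are missing.
With sync honored, every ACKed block is on stable storage before the
failover, which is the property the SLOG is meant to provide cheaply.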


