[OmniOS-discuss] Status of TRIM support?
Saso Kiselkov
skiselkov.ml at gmail.com
Wed May 28 09:36:23 UTC 2014
On 5/28/14, 3:11 AM, Dan Swartzendruber wrote:
>
> So I've been running with sync=disabled on my vSphere NFS datastore. I've
> been willing to do so because I have a big-ass UPS and do hourly backups.
> But I'm thinking of going to an active/passive connection to my JBOD,
> using Saso's blog post on zfs-create.blogspot.com. Here's why I think
> I can't keep using sync=disabled (I would love to have my logic sanity
> checked).
>
> If you switch manually from host A to B, all is well, since
> before host A exports the pool, any pending writes will be completed (so
> even though we lied to vSphere, it's okay). On the other hand, if host A
> crashes/hangs and host B takes over, forcibly importing the pool, you
> could end up with the following scenario: vSphere issues writes for blocks
> A, B, C, D and E. A and B have been written. C and D were sent to host
> A and ACKed, so vSphere thinks all is well. Host A has not yet committed
> blocks C and D to disk. Host B imports the pool, assumes the virtual IP
> for the NFS share, and vSphere reconnects to the datastore. Since it
> thinks it has written blocks A-D, it then issues a write for block E.
> Host B commits that to disk. vSphere thinks blocks A-E were written to
> disk when in fact blocks C and D were not. Silent data corruption, and
> as far as I can tell, no way to know this happened, so if I ever did have
> a forced failover, I would have to roll back every single VM to the last
> known-good snapshot.
>
> Anyway, I decided to see what would happen write-wise with an SLOG SSD.
> I took a Samsung 840 Pro used for L2ARC and made that a log device. I ran
> CrystalDiskMark before and after. Prior to the SLOG, I was getting about
> 90MB/sec (gigabit Ethernet), which is pretty good. Afterward, it went
> down to 8MB/sec! I pulled the SSD and plugged it into my Windows 7
> workstation, formatted it and deleted the partition, which should have
> TRIM'ed it. I reinserted it as SLOG and re-ran the test. 50MB/sec. Still
> not great, but this is after all an MLC device, not SLC, and that's
> probably 'good enough'.
>
> Looking at open-zfs.org, it looks like out of illumos, FreeBSD and ZoL,
> only FreeBSD has TRIM now. I don't want to have to re-TRIM the thing
> every few weeks (or however long it takes). Does over-provisioning help?
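
The forced-failover scenario described in the quoted message can be sketched as a toy model. This is only an illustration of the reasoning, not real ZFS or NFS behaviour: the `Host` class, `forced_failover` function and the set standing in for the shared pool are all hypothetical names invented for this sketch.

```python
# Toy model of the forced-failover scenario. Hypothetical names throughout;
# this does not call or imitate any real ZFS/NFS API.

class Host:
    def __init__(self, disk, sync_disabled):
        self.disk = disk                  # shared pool on the JBOD, survives a host crash
        self.pending = []                 # writes held only in this host's RAM
        self.sync_disabled = sync_disabled

    def write(self, block):
        if self.sync_disabled:
            self.pending.append(block)    # ACK before the data is on stable storage
        else:
            self.disk.add(block)          # commit (e.g. to a log device) before the ACK
        return "ACK"

    def flush(self):
        # Stands in for a periodic commit or a clean pool export.
        self.disk.update(self.pending)
        self.pending.clear()


def forced_failover(sync_disabled):
    """Return the blocks the client believes are written but are not on disk."""
    disk = set()
    acked = []

    host_a = Host(disk, sync_disabled)
    for block in ["A", "B"]:
        host_a.write(block)
        acked.append(block)
    host_a.flush()                        # A and B reach the disk

    for block in ["C", "D"]:
        host_a.write(block)               # ACKed, possibly still only in RAM
        acked.append(block)
    # Host A crashes here: anything left in host_a.pending is lost.

    host_b = Host(disk, sync_disabled)    # host B force-imports the same pool
    host_b.write("E")
    acked.append("E")
    host_b.flush()

    return sorted(set(acked) - disk)
```

In this model, `forced_failover(True)` reports blocks C and D as acknowledged but missing, while `forced_failover(False)` reports nothing missing, which matches the concern raised above.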
Hi Dan,

First off, the Samsung 840 Pro apparently doesn't have power loss
protection, so DON'T use it for slog (ZIL). Use an enterprise-class
SSD that has proper protection of its DRAM contents. Even better, if you
have the cash to spend, get a ZeusRAM - these are true NVRAM devices
with extremely low latency.

If you use an SSD for slog, do a secure erase on it and then partition
it so that you leave something like 1/3 of it unused and untouched by
the OS. Evidence suggests that this might dramatically improve write
IOPS consistency:

http://www.anandtech.com/show/6489/playing-with-op
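
As a rough sketch of the arithmetic behind "leave something like 1/3 unused", the helper below computes a partition size for a given device capacity. The function name, the 1 MiB alignment and the 256 GB example capacity are assumptions made for illustration; none of them come from the thread.

```python
MIB = 1024 * 1024

def slog_partition_bytes(device_bytes, unused_num=1, unused_den=3, align=MIB):
    """Size of the partition to create, leaving unused_num/unused_den of the
    device unpartitioned, rounded down to a multiple of `align` bytes.

    Hypothetical helper: the 1/3 default follows the advice above, the
    alignment value is an assumption.
    """
    usable = device_bytes * (unused_den - unused_num) // unused_den
    return usable - (usable % align)


# Example with an assumed 256 GB (decimal) SSD.
size = slog_partition_bytes(256 * 10**9)
print(size // MIB, "MiB to partition; the rest stays untouched")
```

The partition itself would still have to be created with whatever partitioning tool the OS provides, after the secure erase, so that the unpartitioned area has never been written.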
Cheers,
--
Saso