[OmniOS-discuss] Pliant/Sandisk SSD ZIL

Tue Feb 18 07:05:23 UTC 2014

On Feb 17, 2014, at 5:48 PM, Derek Yarnell <derek at umiacs.umd.edu> wrote:

> On 2/17/14, 7:31 PM, Richard Elling wrote:
>> On Feb 17, 2014, at 2:48 PM, Derek Yarnell <derek at umiacs.umd.edu> wrote:
>> 
>>> Hi,
>>> 
>>> So we bought a new Dell R720xd with 2 Dell SLC SSDs which were shipped
>>> as a Pliant-LB206S-D323-186.31GB via format.
>> 
>> Pliant SSDs (note: Pliant was purchased by Sandisk in 2011) are optimized for
>> lots of concurrent I/O operations. This is not the kind of workload generated by
>> the ZIL, which is more contiguous, single thread-like.
> 
> I realize that they may not be as good as Stec/HGST ZeusRAM drives for
> the slog.  

They are not. By your measurements, the ZeusRAM is 2 orders of magnitude faster.

> I still can't wrap my head around that it is as bad as having
> no discrete slog.
> 
>> 
>> Never ever use zpool iostat to measure application performance. zpool iostat
>> measures workload to the vdevs, showing back-end operations to disk. As such
>> there is no correlation to client-side operations of any sort, especially writes and
>> metadata updates. You'll need to go up the stack and see what it is doing. For NFS
>> I highly recommend nfssvrtop. To see the response time of the Pliant, use "iostat -x"
>> or, as I prefer, "iostat -zxCn"
> 
> Yes I realize that iostat will show me this information and the svc_t
> for the Pliant ssd(s) is anywhere from 2-7ms.  But zpool iostat will
> show you your ZIL writes accurately no?

zpool iostat does not show latency (svc_t), except on a very few distros.
For NFS, latency matters.
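
As a rough illustration of watching service time rather than bandwidth, here is a
minimal Python sketch that picks out devices whose service time exceeds a threshold.
It assumes text shaped like illumos "iostat -xn" output, with a header row naming the
columns. The sample text, device names, and the function name slow_devices are
made up for illustration and are not measurements from this thread.

```python
# Sketch: flag devices whose service time exceeds a threshold.
# SAMPLE is made-up text shaped like illumos "iostat -xn" output; it is not
# data from this thread, and the device names are hypothetical.
SAMPLE = """\
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  310.0    0.0 9920.0  0.0  1.2    0.0    4.1   0  60 c1t0d0
    0.0  305.0    0.0 9760.0  0.0  0.0    0.0    0.1   0   3 c1t1d0
"""


def slow_devices(text, column="asvc_t", threshold_ms=1.0):
    """Return {device: value} for rows whose `column` exceeds threshold_ms."""
    header = None
    slow = {}
    for line in text.splitlines():
        fields = line.split()
        if "device" in fields and column in fields:
            header = fields      # column names for the rows that follow
            continue
        if header is None or len(fields) != len(header):
            continue             # banner line or unrelated output
        row = dict(zip(header, fields))
        try:
            value = float(row[column])
        except ValueError:
            continue             # non-numeric cell, skip the row
        if value > threshold_ms:
            slow[row["device"]] = value
    return slow


print(slow_devices(SAMPLE))
```

With plain "iostat -x" the column is named svc_t, so pass column="svc_t" instead.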

>  I realize that it will then
> coalesce these into its transactions and write it out at the 5sec interval.
> 
>> 
>> Note: response time (measured by iostat -x as a variant of "svc_t") is the critical 
>> measurement for NFS workloads. Bandwidth is almost meaningless in analyzing
>> NFS.
> 
> Yes and I have done this too.  The average RTT on untarring is 43.000 ms.
> I guess we will just be getting another set of ZeusRAM drives.

Be careful with these measurements. Untar is the pathological worst-case workload
for NFS because of sync-on-close for clients. This workload is the poster child for Amdahl’s
Law. Also, the bulk of the latency could in fact be due to transfer from the client to the server
waiting to occur. In other words, the client can significantly impact these measurements. 
From a server’s perspective, nfssvrtop is the only tool I’m aware of that measures NFS 
operation latency from IP down to disk and back (the reason I wrote it :-). Of course, dtrace 
allows you to measure almost everything in the system, so you can add more measurements.
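
A rough sketch of the arithmetic behind the Amdahl's Law remark, in Python. The
43 ms figure is the average RTT reported above. The 60% share of that time spent
waiting on the slog is purely hypothetical, chosen only to show how the rest of the
path limits the gain; the function names are likewise illustrative.

```python
def max_serial_ops_per_sec(latency_ms):
    """Upper bound on ops/s when each op must finish before the next starts."""
    return 1000.0 / latency_ms


def amdahl_speedup(fraction_improved, factor):
    """Amdahl's Law: overall speedup when only `fraction_improved` of the
    elapsed time is accelerated by `factor`."""
    return 1.0 / ((1.0 - fraction_improved) + fraction_improved / factor)


# 43 ms is the average RTT reported in this thread.
print(round(max_serial_ops_per_sec(43.0), 1))   # 23.3 ops/s for one serial client

# Hypothetical split: suppose 60% of the RTT is slog wait and a faster
# slog device shortens that part by 100x (two orders of magnitude).
print(round(amdahl_speedup(0.60, 100.0), 2))    # 2.46x overall
```

Under those assumed numbers, a 100x faster log device gives a single-threaded untar
only about a 2.5x improvement, because the client and network portions are unchanged.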

Fortunately, most real workloads are not tar -x.
 — richard

> 
> Thanks,
> derek
> 
> -- 
> Derek T. Yarnell
> University of Maryland
> Institute for Advanced Computer Studies

-- 

ZFS storage and performance consulting at http://www.RichardElling.com