[OmniOS-discuss] NVMe Performance
Dan McDonald
danmcd at omniti.com
Fri Apr 8 05:16:55 UTC 2016
Thanks for these measurements and observations. I'd suggest forwarding this mail to the Illumos developer list for a wider audience.
Thanks!
Dan
Sent from my iPhone (typos, autocorrect, and all)
> On Apr 7, 2016, at 11:35 PM, Josh Coombs <jcoombs at staff.gwi.net> wrote:
>
> Hi all,
>
> I just recently kitbashed together a backup storage box based on OmniOS, a couple of retired servers, and a few new bits to improve its performance.
>
> - HP DL360 G6 with dual Xeon 5540s, 80GB RAM
> - The onboard HP SAS controller hosts the root pool on its RAID 5 of SAS disks; not ideal, but the card doesn't have an IT mode.
> - LSI 9208e for main storage
> - - Chenbro SAS expander in an Addonics 20 bay SATA shelf
> - - 5 x WD 6TB Red drives in a RAIDz3 pool
> - 4 x onboard GigE, LAG'd together (see the aggregation sketch after this list)
> - dual-port QLogic 4Gb FC card for later use with my VMware farm
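>
> (For reference, the aggregation was set up roughly along these lines; the interface names here are placeholders, substitute whatever the onboard NICs actually enumerate as:)
>
> # dladm create-aggr -L active -l e1000g0 -l e1000g1 -l e1000g2 -l e1000g3 aggr0
> # dladm show-aggr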
>
> As I noted, I was able to get a few new bits for the box: three Intel DC3600 400GB PCIe NVMe SSDs. I figured I could use two as a mirrored log device and one as cache, and have a lovely setup.
>
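> (Roughly what I have in mind once the cards check out; the pool and device names below are placeholders, not the real paths:)
>
> # zpool add tank log mirror c3t0d0 c3t1d0
> # zpool add tank cache c3t2d0
>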
> Initial testing with dd, before adding the NVMe drives, shows I can slam a sustained 300 MB/s into the SATA pool when dumping to a file in one of my ZFS filesystems:
>
> # dd if=/dev/zero of=bigfile bs=1M count=10240
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB) copied, 37.3299 s, 288 MB/s
>
> # dd if=/dev/zero of=bigfile bs=1M count=1024
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes (1.1 GB) copied, 0.593213 s, 1.8 GB/s
>
> For short bursts, the box caches nicely with the RAM I was able to scrounge up. I forgot I had one ZFS filesystem set to gzip-9 compression and was very impressed to see it sustain 1.7 GB/s over the course of dumping 107 GB of zeros, nice!
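>
> (Worth noting if anyone repeats this: dd from /dev/zero into a gzip-9 dataset mostly measures the compressor, so I'd check and clear compression before reading too much into the number; the dataset name below is a placeholder:)
>
> # zfs get compression tank/backups
> # zfs set compression=off tank/backups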
>
> Setting up a simple single-device pool on one NVMe card and repeating the test on it, I get:
>
> # dd if=/dev/zero of=bigfile bs=1M count=10240
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB) copied, 16.1904 s, 663 MB/s
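>
> (That test pool was nothing fancier than a single-device pool plus a filesystem, something like the following; the pool and device names are placeholders:)
>
> # zpool create nvmetest c3t1d0
> # zfs create nvmetest/scratch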
>
> Not quite the blazing performance I was expecting; I was hoping they could sustain at least twice that. If I make a zpool out of all three and bump my test to 53 GB, it sustains 1.1 GB/s, so there is some scaling from spreading the work across all three cards, but again nowhere near what I was anticipating.
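>
> (The three-card pool is the same idea striped across all three devices, with the dd count bumped for the ~53 GB run; the names and exact count here are placeholders:)
>
> # zpool create nvmetest c3t1d0 c3t2d0 c3t3d0
> # dd if=/dev/zero of=bigfile bs=1M count=51200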
>
> I also did a raw write to one of the units after making sure it wasn't in use by any pools:
> # dd if=/dev/zero of=/dev/rdsk/c3t1d0 bs=1M count=10240
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB) copied, 22.2461 s, 483 MB/s
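>
> (If anyone wants to double-check that a device isn't part of an imported pool before a raw write like this, something along these lines does it; the s0 label path may vary depending on how the disk was labeled:)
>
> # zpool status | grep c3t1d0
> # zdb -l /dev/rdsk/c3t1d0s0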
>
> If I get the system loaded enough, doing that results in transport errors being logged for the unit while I'm hammering it with dd.
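>
> (For anyone following along, the transport errors should be visible via the per-device error counters and the FMA error log, e.g.:)
>
> # iostat -En c3t1d0
> # fmdump -eV | less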
>
> Digging around, I found some FreeBSD discussions where they observed their NVMe driver falling back to legacy interrupt mode on systems with lots of CPU cores as they ran out of MSI-X vectors. Given that I've got 8 physical cores and a lot of devices on the PCIe bus, I don't know whether that's a possibility here or not. I haven't poked at the driver source yet, as to be honest I wouldn't know what I was looking for.
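>
> (I haven't verified which mode the nvme instances actually got here, but the interrupt assignments are visible from mdb and intrstat if anyone wants to check for an MSI-X vs. fixed/legacy fallback; this is just how I'd go about looking:)
>
> # echo "::interrupts -d" | mdb -k | grep -i nvme
> # intrstat 5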
>
> I also understand that the driver is pretty new to Illumos, so I shouldn't expect it to be a rocket yet. I just figured I'd share what I've observed so far to see whether it matches what I should expect, or whether there is additional testing I can do to help improve the driver's performance down the road.
>
> Josh Coombs
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss