[OmniOS-discuss] Comstar Disconnects under high load.

Narayan Desai narayan.desai at gmail.com
Mon May 12 23:32:40 UTC 2014


Are you perchance using iscsi/iSER? We've seen similar timeouts that don't
seem to correspond to hardware issues. From what we can tell, something
causes iscsi heartbeats not to be processed, so the client eventually times
out the block device and tries to reinitialize it.
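
The failure mode described above can be sketched as a toy model. This is an illustration only, not COMSTAR or initiator code: the function name, the 5-second ping interval and timeout, and the reply delays are all assumptions chosen for the example. The initiator sends a heartbeat ping at a fixed interval and declares the connection dead if a reply does not arrive within the timeout.

```python
def first_timeout(reply_delays, interval=5.0, timeout=5.0):
    """Return the time (in seconds) at which the initiator gives up, or None.

    reply_delays[i] is how long the target takes to answer the i-th
    heartbeat ping; None means that ping is never answered.
    All values here are hypothetical and exist only for illustration.
    """
    for i, delay in enumerate(reply_delays):
        send_time = (i + 1) * interval  # pings go out at a fixed interval
        if delay is None or delay > timeout:
            # No reply inside the timeout window: the initiator declares
            # the connection dead and the block device starts failing I/O.
            return send_time + timeout
    return None  # every ping was answered in time


# Target answers two pings quickly, then stops processing heartbeats.
print(first_timeout([0.1, 0.2, None]))
```

In this model a target that is merely slow (a reply delay longer than the timeout) looks the same to the client as one that is down, which matches timeouts that do not correspond to any hardware fault.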

In our case, we're running VMs using KVM on linux hosts. The guest detects
block device death, and won't recover without a reboot.

FWIW, switching to iscsi directly over IPoIB works great for identical
workloads. We've seen this with 151006 and I think 151008. We've not yet
tried it with 151010. This smells like some problem in comstar's iscsi/iser
driver.
 -nld


On Mon, May 12, 2014 at 5:13 PM, David Bomba <turbo124 at gmail.com> wrote:

> Hi guys,
>
> We have ~ 10 OmniOS powered ZFS storage arrays used to drive Virtual
> Machines under XenServer + VMWare using Infiniband interconnect.
>
> Our usual recipe is to use either LSI HBAs or Areca cards in pass-through
> mode with internal SAS drives.
>
> This has worked flawlessly with OmniOS 151006/151008.
>
> Recently we deployed a slightly different configuration:
>
> HP DL380 G6
> 64GB ram
> X5650 proc
> LSI 9208-e card
> HP MDS 600 / SSA 70 external enclosure
> 30 TOSHIBA-MK2001TRKB-1001-1.82TB SAS2 drives in mirrored configuration.
>
> Despite the following message in dmesg, the array appeared to be working as
> expected:
>
> scsi: [ID 365881 kern.info] /pci@0,0/pci8086,340f@8/pci1000,30b0@0(mpt_sas1):
> May 13 04:01:07 s6      Log info 0x31140000 received for target 11.
>
> Despite this message we pushed into production. Whilst the performance
> of the array has generally been good, as soon as we apply a heavy write
> load, throughput drops from 22k IOPS to 100 IOPS. This causes the target
> to disconnect from the hypervisors, and general mayhem ensues for the VMs.
>
> During this period where performance degrades, there are no other messages
> coming into dmesg.
>
> Where should we begin to debug this? Could this be a symptom of not enough
> RAM? We have flashed the LSI cards to the latest firmware with no change in
> performance.
>
> Thanks in advance!