[OmniOS-discuss] QLE2652 I/O Disconnect. Heat Sinks?

Nate Smith nsmith at careyweb.com
Thu Mar 5 17:06:06 UTC 2015


The way I have it set up, the Hyper-V hypervisor picks up the COMSTAR targets and mounts them as NTFS storage to host the HVDs for Cluster File System. In the cluster, either hypervisor can drop and the cluster stays up. I'm getting this behavior on 2008 R2 and 2012 R2 (I have both hypervisors connecting to different LUNs at the same time, so it's hard to say which is causing it to fail).

As for which PCI device I'm on, interrupts, etc., I could never find a rhyme or reason to it, but I didn't do an exacting test. It's hard to reproduce the problem to test for it. I know my HBAs were always on separate PCI buses and running at 8x on both systems I used.

-Nate



-----Original Message-----
From: Johan Kragsterman [mailto:johan.kragsterman at capvert.se] 
Sent: Thursday, March 05, 2015 12:00 PM
To: Rune Tipsmark
Cc: 'Nate Smith'; omnios-discuss at lists.omniti.com
Subject: Ang: Re: [OmniOS-discuss] QLE2652 I/O Disconnect. Heat Sinks?

Hi!

-----"OmniOS-discuss" <omnios-discuss-bounces at lists.omniti.com> wrote: -----
To: "'Nate Smith'" <nsmith at careyweb.com>, "omnios-discuss at lists.omniti.com" <omnios-discuss at lists.omniti.com>
From: Rune Tipsmark 
Sent by: "OmniOS-discuss" 
Date: 2015-03-05 17:15
Subject: Re: [OmniOS-discuss] QLE2652 I/O Disconnect. Heat Sinks?

Haven’t tried iSCSI but had similar issues with InfiniBand… more frequent due to higher I/O load, but no console error messages.

 

This only happened on my Supermicro server and never on my HP server… what brand are you running?

 




This is interesting, only on Supermicro, and never on HP? I'd like to know some more details here...

First, when you say "server", do you mean the SAN head? Not the hosts?

Second: Can you specify the exact model of the Supermicro and the HP?

Third: Did you pay attention to the BIOS settings on the two different servers? Like C-states and other settings... how about IRQ settings? And how about the physical PCIe buses the HBAs are sitting on? This often causes problems if you don't know the layout of the PCIe buses.

Fourth: When you say you can cause it with Windows as initiator, do you mean Windows on hardware, and not Windows as a VM? And when you say you can NOT cause it on VMware, do you mean you can run a Windows VM on VMware with direct LUN access without problems? And is this true for both hardware platforms, HP and Supermicro?

Since it appears on one hardware platform and not another, it is difficult to blame any specific software, but we just had a discussion here about iSCSI/COMSTAR, where Garrett suspected COMSTAR of handling certain things badly. I don't know whether that has anything to do with this.

Rgrds Johan





Br,

Rune

 

 

From: Nate Smith [mailto:nsmith at careyweb.com] 
Sent: Thursday, March 05, 2015 8:10 AM
To: Rune Tipsmark; omnios-discuss at lists.omniti.com
Subject: RE: [OmniOS-discuss] QLE2652 I/O Disconnect. Heat Sinks?

 

Do you see the same problem with Windows and iSCSI as an initiator? I wish there was a way to turn up debugging to figure this out.

 

From: Rune Tipsmark [mailto:rt at steait.net] 
Sent: Thursday, March 05, 2015 11:08 AM
To: 'Nate Smith'; omnios-discuss at lists.omniti.com
Subject: RE: [OmniOS-discuss] QLE2652 I/O Disconnect. Heat Sinks?

 

Same problem here… have noticed I can cause this easily by using Windows as initiator… I cannot cause this using VMware as initiator…

No idea how to fix, but a big problem.

Br,

Rune

 

 

From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Nate Smith
Sent: Thursday, March 05, 2015 6:01 AM
To: omnios-discuss at lists.omniti.com
Subject: [OmniOS-discuss] QLE2652 I/O Disconnect. Heat Sinks?

 

I’ve had this problem for a while, and I have no way to diagnose what is going on, but occasionally when system I/O gets high (I’ve seen it happen especially during backups), I lose connectivity with my Fibre Channel cards, which serve up Fibre Channel LUNs to a VM cluster. All hell breaks loose, and then connectivity gets restored. I don’t get an error that it’s dropped, at least not on the OmniOS system, but I get a notice when it’s restored (which makes no sense). I’m wondering if the cards are just overheating, and if heat sinks with a fan on the I/O chip would help.

 

Mar  5 01:55:01 newstorm fct: [ID 132490 kern.notice] NOTICE: qlt2,0 LINK UP, portid 20000, topology Fabric Pt-to-Pt,speed 8G

Mar  5 01:56:26 newstorm fct: [ID 132490 kern.notice] NOTICE: qlt0,0 LINK UP, portid 20100, topology Fabric Pt-to-Pt,speed 8G

Mar  5 02:00:13 newstorm last message repeated 1 time

Mar  5 02:00:15 newstorm fct: [ID 132490 kern.notice] NOTICE: qlt3,0 LINK UP, portid 10000, topology Fabric Pt-to-Pt,speed 8G

Mar  5 02:00:15 newstorm fct: [ID 132490 kern.notice] NOTICE: qlt2,0 LINK UP, portid 20000, topology Fabric Pt-to-Pt,speed 8G

Mar  5 02:00:18 newstorm fct: [ID 132490 kern.notice] NOTICE: qlt1,0 LINK UP, portid 10100, topology Fabric Pt-to-Pt,speed 8G
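
One way to see which ports are bouncing, and how often, is to tally the LINK UP notices per port. Below is a minimal sketch in Python, assuming the messages keep the format quoted above. The names count_link_up and SAMPLE are illustrative only, and the script assumes that a "last message repeated N time(s)" line repeats the LINK UP notice immediately before it.

```python
import re
from collections import defaultdict

# Matches the fct LINK UP notices in the format quoted above.
LINK_UP = re.compile(
    r"^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+fct:.*"
    r"NOTICE:\s+(?P<port>qlt\d+),\d+\s+LINK UP,"
)
# Matches syslog's repeat marker, e.g. "last message repeated 1 time".
REPEAT = re.compile(r"last message repeated (\d+) times?")


def count_link_up(lines):
    """Return {port: number of LINK UP notices}, counting syslog repeats."""
    counts = defaultdict(int)
    last_port = None  # port of the previous line, if it was a LINK UP notice
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines do not break the "previous message" chain
        match = LINK_UP.match(line)
        if match:
            last_port = match.group("port")
            counts[last_port] += 1
            continue
        repeat = REPEAT.search(line)
        if repeat and last_port is not None:
            # Assumption: the repeat refers to the preceding LINK UP notice.
            counts[last_port] += int(repeat.group(1))
        else:
            last_port = None  # some other message intervened
    return dict(counts)


# The log excerpt from this message, used as sample input.
SAMPLE = """\
Mar  5 01:55:01 newstorm fct: [ID 132490 kern.notice] NOTICE: qlt2,0 LINK UP, portid 20000, topology Fabric Pt-to-Pt,speed 8G
Mar  5 01:56:26 newstorm fct: [ID 132490 kern.notice] NOTICE: qlt0,0 LINK UP, portid 20100, topology Fabric Pt-to-Pt,speed 8G
Mar  5 02:00:13 newstorm last message repeated 1 time
Mar  5 02:00:15 newstorm fct: [ID 132490 kern.notice] NOTICE: qlt3,0 LINK UP, portid 10000, topology Fabric Pt-to-Pt,speed 8G
Mar  5 02:00:15 newstorm fct: [ID 132490 kern.notice] NOTICE: qlt2,0 LINK UP, portid 20000, topology Fabric Pt-to-Pt,speed 8G
Mar  5 02:00:18 newstorm fct: [ID 132490 kern.notice] NOTICE: qlt1,0 LINK UP, portid 10100, topology Fabric Pt-to-Pt,speed 8G
"""

if __name__ == "__main__":
    for port, n in sorted(count_link_up(SAMPLE.splitlines()).items()):
        print(port, n)
```

Running the same function over the full system log (for example, the lines of /var/adm/messages) and comparing the counts with backup windows might show whether all four ports drop together or only some of them.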

_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss at lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss






