[OmniOS-discuss] Ang: Re: Ang: Re: LUN (in)visibility
Johan Kragsterman
johan.kragsterman at capvert.se
Fri Dec 11 07:45:37 UTC 2015
Hi!
-----Tom Robinson <tom.robinson at motec.com.au> wrote: -----
To: Johan Kragsterman <johan.kragsterman at capvert.se>
From: Tom Robinson <tom.robinson at motec.com.au>
Date: 2015-12-10 23:10
Cc: Dan McDonald <danmcd at omniti.com>, omnios-discuss <omnios-discuss at lists.omniti.com>
Subject: Re: Ang: Re: [OmniOS-discuss] LUN (in)visibility
On 10/12/15 18:55, Johan Kragsterman wrote:
> You say "infiniband". Do you mean SRP? Where do you have your subnet manager? In the IB switch? If so, did you check the switch SM logs?
>
> I suppose you checked the data links? dladm show-link? What exactly did you check?
>
> How about multipath? How many paths did/do you have to each LUN? I know there was a discussion about too many paths to a LUN earlier on this list. That was Fibre Channel, though.
>
> I can't really comment on iSCSI since I never use it...
Hi Johan,
Yes, SRP. As I said, we had everything working fine before, which means we also have a subnet
manager. The SM actually runs on its own little box.
----------      -----------      -----
| storage|======|IB Switch|======|KVM|
----------      -----------      -----
                  |     |
                ----   ------
                |SM|   |ESXi|
                ----   ------
Normally there are only three paths; one iSCSI and two SRP.
I spent a lot of time hunting around on the KVM system looking for clues, as at that time I didn't
see any issues elsewhere in the setup.
On OmniOS, in /var/adm/messages I had this:
Oct 26 07:28:43 monza.motec.com.au genunix: [ID 408789 kern.warning] WARNING: hermon0: fault detected external to device; service unavailable
Oct 26 07:28:43 monza.motec.com.au genunix: [ID 451854 kern.warning] WARNING: hermon0: port 2 down
Oct 26 07:28:47 monza.motec.com.au genunix: [ID 408822 kern.info] NOTICE: hermon0: fault detected external to device; service still unavailable
Oct 26 07:28:47 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 down
Oct 26 07:30:12 monza.motec.com.au genunix: [ID 408789 kern.notice] NOTICE: hermon0: fault cleared external to device; service available
Oct 26 07:30:12 monza.motec.com.au genunix: [ID 451854 kern.notice] NOTICE: hermon0: port 2 up
Oct 26 07:30:12 monza.motec.com.au genunix: [ID 408822 kern.info] NOTICE: hermon0: no fault external to device; service available
Oct 26 07:30:12 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 up
Oct 26 07:31:31 monza.motec.com.au genunix: [ID 408789 kern.warning] WARNING: hermon0: fault detected external to device; service unavailable
Oct 26 07:31:31 monza.motec.com.au genunix: [ID 451854 kern.warning] WARNING: hermon0: port 2 down
Oct 26 07:31:38 monza.motec.com.au genunix: [ID 408822 kern.info] NOTICE: hermon0: fault detected external to device; service still unavailable
Oct 26 07:31:38 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 down
Oct 26 07:32:12 monza.motec.com.au genunix: [ID 408789 kern.notice] NOTICE: hermon0: fault cleared external to device; service available
Oct 26 07:32:12 monza.motec.com.au genunix: [ID 451854 kern.notice] NOTICE: hermon0: port 2 up
Oct 26 07:32:12 monza.motec.com.au genunix: [ID 408822 kern.info] NOTICE: hermon0: no fault external to device; service available
Oct 26 07:32:12 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 up
Isn't the hermon0 driver for the Mellanox cards?
Kind regards,
Tom
Yeah, that's the driver, and this seems to me like a data link problem. A data link problem can have one or more of many possible causes, which is why I suggested before that you check the SM, if you have any logs there.
The message "fault detected external to device" is of course the key here, but I can't decipher it, unfortunately...
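For what it's worth, the port flapping in the hermon0 messages can be quantified with a short script, which makes it easier to line the outages up against whatever the SM or switch logged at the same moments. This is only a sketch: the function name port_outages, the SAMPLE list, and the year 2015 (syslog lines carry no year) are my own assumptions, the regular expression is fitted only to the message format quoted in this thread, and the month parsing assumes an English/C locale.

```python
import re
from datetime import datetime

# Matches only the "hermon0: port N up/down" lines as quoted in this thread;
# the pattern is an assumption fitted to those lines, not a driver specification.
PORT_EVENT = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+.*?"
    r"(?P<dev>hermon\d+): port (?P<port>\d+) (?P<state>up|down)\s*$"
)


def port_outages(lines, year=2015):
    """Return (device, port, down_time, seconds_down) for each down/up pair.

    The year is an assumption, because syslog timestamps do not include one.
    """
    down_since = {}  # (device, port) -> time the port went down
    outages = []
    for line in lines:
        m = PORT_EVENT.match(line)
        if not m:
            continue  # fault/service lines and anything else are ignored
        when = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
        key = (m["dev"], int(m["port"]))
        if m["state"] == "down":
            # Keep the first "down" if several arrive without an "up" between.
            down_since.setdefault(key, when)
        elif key in down_since:
            start = down_since.pop(key)
            outages.append((key[0], key[1], start, (when - start).total_seconds()))
    return outages


# Port events copied from the /var/adm/messages excerpt quoted above.
SAMPLE = [
    "Oct 26 07:28:43 monza.motec.com.au genunix: [ID 451854 kern.warning] WARNING: hermon0: port 2 down",
    "Oct 26 07:28:47 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 down",
    "Oct 26 07:30:12 monza.motec.com.au genunix: [ID 451854 kern.notice] NOTICE: hermon0: port 2 up",
    "Oct 26 07:30:12 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 up",
    "Oct 26 07:31:31 monza.motec.com.au genunix: [ID 451854 kern.warning] WARNING: hermon0: port 2 down",
    "Oct 26 07:31:38 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 down",
    "Oct 26 07:32:12 monza.motec.com.au genunix: [ID 451854 kern.notice] NOTICE: hermon0: port 2 up",
    "Oct 26 07:32:12 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 up",
]

if __name__ == "__main__":
    for dev, port, start, seconds in port_outages(SAMPLE):
        print(f"{dev} port {port}: down at {start:%H:%M:%S} for {seconds:.0f} s")
```

On the quoted excerpt this reports two outages per port within about four minutes, with both ports going down a few seconds apart and coming back in the same second. That pattern would be consistent with something upstream of the HCA rather than a single bad port or cable, but that reading is a guess and not something the log itself proves.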
Do you run the iSCSI service over the same IB infrastructure?
Rgrds Johan
[attachment "signature.asc" removed by Johan Kragsterman/Capvert]