[OmniOS-discuss] [discuss] COMSTAR hanging

Brian Hechinger wonko at 4amlunch.net
Wed Jan 13 04:21:03 UTC 2016


In my case the SATA disks aren’t on the 1068E.

-brian

> On Jan 12, 2016, at 11:19 PM, John Barfield <john.barfield at bissinc.com> wrote:
> 
> BTW I left off that it has the same LSI controller chipset
> 
> _____________________________
> From: John Barfield <john.barfield at bissinc.com>
> Sent: Tuesday, January 12, 2016 10:17 PM
> Subject: Re: [OmniOS-discuss] [discuss] COMSTAR hanging
> To: <discuss at lists.illumos.org>, omnios-discuss <omnios-discuss at lists.omniti.com>
> 
> 
> My input may or may not be valid, but I'm going to throw it out there anyway :)
> 
> Do you have any mpt disconnect errors in /var/adm/messages?
> 
> Also do you have smartmontools installed? 
> 
> I ran into similar issues recently just booting a Sun Fire X4540 off the OmniOS live image: I/O would hang while probing device nodes.
> 
> I found the drive that was acting up and pulled it. 
> 
> All of a sudden everything miraculously worked.
> 
> I compiled smartmontools after I got it to boot and found 10 drives out of 48 with bad sectors in prefail state.
> 
> I don't know if this happens with SAS drives or not, but I'm using SATA and saw this was a common issue in old OpenSolaris threads.
> 
> -barfield 
> 
> 
> 
> 
> On Tue, Jan 12, 2016 at 8:08 PM -0800, "Brian Hechinger" <wonko at 4amlunch.net> wrote:
> 
> In the meantime I’ve removed the SLOG and L2ARC just in case. I don’t think that’s it, though. At least I'll have some sort of data point to work with here. :)
> 
> -brian 
> 
> > On Jan 12, 2016, at 10:55 PM, Brian Hechinger <wonko at 4amlunch.net> wrote: 
> > 
> > Ok, it has happened. 
> > 
> > Checking this here, the pool seems to be fine. I can read and write files. 
> > 
> > Except ‘zpool status’ is now hanging. I can still read/write from the pool, however.
> > 
> > I can telnet to port 3260, but restarting target services has hung. 
> > 
> > root at basket1:/tank/Share# svcs -a | grep stmf 
> > online         Jan_05   svc:/system/stmf:default 
> > root at basket1:/tank/Share# svcs -a | grep target 
> > disabled       Jan_05   svc:/system/fcoe_target:default 
> > online         Jan_05   svc:/network/iscsi/target:default 
> > online         Jan_05   svc:/system/ibsrp/target:default 
> > root at basket1:/tank/Share# svcadm restart /system/ibsrp/target 
> > root at basket1:/tank/Share# svcadm restart /network/iscsi/target 
> > root at basket1:/tank/Share# svcadm restart /system/stmf 
> > root at basket1:/tank/Share# svcs -a | grep target 
> > disabled       Jan_05   svc:/system/fcoe_target:default 
> > online*        22:43:03 svc:/system/ibsrp/target:default 
> > online*        22:43:13 svc:/network/iscsi/target:default 
> > root at basket1:/tank/Share# svcs -a | grep stmf 
> > online*        22:43:18 svc:/system/stmf:default 
> > root at basket1:/tank/Share# 
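
In `svcs` output, an asterisk appended to the state (as in `online*` above) marks an instance that is in transition, which here means the restarts never completed. As a minimal sketch, assuming the three-column STATE/STIME/FMRI layout pasted above, such services could be picked out of captured `svcs -a` output like this. The function name `services_in_transition` and the `SAMPLE` text are illustrative only, not part of any tool.

```python
def services_in_transition(svcs_output):
    """Return FMRIs whose state carries a trailing '*' (instance in transition)."""
    in_transition = []
    for line in svcs_output.splitlines():
        fields = line.split()
        # Expect STATE, STIME, FMRI; skip prompts, headers, and blank lines.
        if len(fields) != 3 or not fields[2].startswith("svc:/"):
            continue
        state, _stime, fmri = fields
        if state.endswith("*"):
            in_transition.append(fmri)
    return in_transition


# Sample text copied from the session above (assumed layout).
SAMPLE = """\
disabled       Jan_05   svc:/system/fcoe_target:default
online*        22:43:03 svc:/system/ibsrp/target:default
online*        22:43:13 svc:/network/iscsi/target:default
online*        22:43:18 svc:/system/stmf:default
"""

for fmri in services_in_transition(SAMPLE):
    print("still in transition:", fmri)
```

Running this against output captured a few minutes apart would show whether a service ever leaves the transitional state.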
> > 
> > I’m doing a crash dump reboot. I’ll post the output somewhere. 
> > 
> > The output of echo '$<threadlist' | mdb -k is attached. 
> > 
> > <threadlist.out> 
> > 
> >> On Jan 8, 2016, at 3:11 PM, Matej Zerovnik <matej at zunaj.si> wrote: 
> >> 
> >> Is the pool usable during comstar hang? 
> >> Can you write and read from the pool? (Test both; in my case, when the pool froze, I wasn’t able to write to the pool, but I could read.)
> >> 
> >> Again, this might not be connected with COMSTAR, but in my case the COMSTAR and pool hangs alternated.
> >> 
> >> Matej 
> >> 
> >>> On 08 Jan 2016, at 20:11, Brian Hechinger <wonko at 4amlunch.net> wrote: 
> >>> 
> >>> Yeah, I’m using the 1068E to boot from (this has been supported since before Illumos) but that doesn’t have anything accessed by COMSTAR. 
> >>> 
> >>> It’s the ICH10R SATA that hosts the disks that COMSTAR shares out space from. 
> >>> 
> >>> -brian 
> >>> 
> >>>> On Jan 8, 2016, at 1:31 PM, Richard Jahnel <rjahnel at ellipseinc.com> wrote: 
> >>>> 
> >>>> First off, love SuperMicro; good choice IMHO.
> >>>> 
> >>>> This board has two on board controllers. 
> >>>> 
> >>>> LSI SAS1068E (not 100% sure there are working illumos drivers for this one) 
> >>>> 
> >>>> And 
> >>>> 
> >>>> Intel ICH10R SATA (so I'm guessing you're using this one).
> >>>> 
> >>>> -----Original Message----- 
> >>>> From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Brian Hechinger
> >>>> Sent: Friday, January 08, 2016 12:16 PM 
> >>>> To: Matej Zerovnik <matej at zunaj.si> 
> >>>> Cc: omnios-discuss <omnios-discuss at lists.omniti.com> 
> >>>> Subject: Re: [OmniOS-discuss] [discuss] COMSTAR hanging 
> >>>> 
> >>>> 
> >>>>> Which controller exactly do you have? 
> >>>> 
> >>>> Whatever AHCI stuff is built into the motherboard. Motherboard is X8DTL-3F.
> >>>> 
> >>>>> Do you know firmware version? 
> >>>> 
> >>>> I’m assuming this is linked to the BIOS version? 
> >>>> 
> >>>>> Which hard drives? 
> >>>> 
> >>>> Hitachi-HUA723030ALA640-MKAOAA50-2.73TB 
> >>>> 
> >>>>> It might not tell much, but it’s good to have as much information as possible. 
> >>>>> 
> >>>>> When comstar hangs, can you telnet to the iSCSI port? 
> >>>>> What does svcs say, is the service running?
> >>>>> What happens if you try to restart it?
> >>>>> How do you restart it? 
> >>>> 
> >>>> I’ll try all these things next time. 
> >>>> 
> >>>>> In my case, svcs reported the service as running, but when I tried to telnet there was no connection, and no listening port was open when checking with 'netstat -an'. I tried to restart the target and stmf services, but the stmf service got stuck in the online* state and would not start. A reboot was the only solution in my case, but as I said, the latest 014 release is working OK (but then again, the load got reduced).
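
The telnet check described here can be scripted so that it runs unattended during a hang. A rough sketch, assuming the target listens on TCP port 3260 (the port mentioned elsewhere in this thread); the helper name `port_accepts_connections` and the host name in the usage line are made up for illustration.

```python
import socket


def port_accepts_connections(host, port=3260, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or name failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "basket1" is a placeholder host name taken from the shell prompt above.
    print("iSCSI port reachable:", port_accepts_connections("basket1"))
```

Note that a successful connect only shows that something is listening; as reported elsewhere in this thread, the port can still accept connections while the target services are hung.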
> >>>> 
> >>>> All good info. Thanks! 
> >>>> 
> >>>> -brian 
> >>>> 
> >>>>> 
> >>>>> Matej 
> >>>>> 
> >>>>>> On 08 Jan 2016, at 17:50, Dave Pooser <dave-oo at pooserville.com> wrote: 
> >>>>>> 
> >>>>>>>> On Jan 8, 2016, at 11:22 AM, Brian Hechinger <wonko at 4amlunch.net> wrote: 
> >>>>>>>> 
> >>>>>>>> No, ZFS raid10 
> >>>>>>> 
> >>>>>>> Saw the HW-RAID term, and got concerned.  That's what, raidz2 in ZFS-ese? 
> >>>>>> 
> >>>>>> It's a zpool with multiple mirror vdevs. 
> >>>>>> 
> >>>>>> -- 
> >>>>>> Dave Pooser 
> >>>>>> Cat-Herder-in-Chief, Pooserville.com 
> >>>>>> 
> >>>>>> 
> >>>>>> _______________________________________________ 
> >>>>>> OmniOS-discuss mailing list 
> >>>>>> OmniOS-discuss at lists.omniti.com 
> >>>>>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
> >>>>> 
> >>>> 
> >>> 
> >> 
> > 
> 
> 
> 
