From fcliang at baolict.com  Tue Dec  1 17:03:46 2015
From: fcliang at baolict.com (Fucai.Liang)
Date: Wed, 2 Dec 2015 01:03:46 +0800
Subject: [OmniOS-discuss] qemu-system-x86_64 can not locked enough memory
Message-ID: <D186A49E-FE0D-43D5-833A-15F75A1D5088@baolict.com>


Hello, guys,

I have a server running OmniOS v11 r151016. The server has 32G of memory.
I started two KVM virtual machines by running the following commands:

qemu-system-x86_64 -enable-kvm -vnc 0.0.0.0:12 -cpu host -smp 4 -m 8192 -no-hpe


qemu-system-x86_64 -enable-kvm -vnc 0.0.0.0:11 -cpu host -smp 2 -m 4096 -no-hpe

One uses 8G of memory and the other uses 4G.

The system's memory usage is now as follows:

root at BLCC01:/root# prtconf | grep Memory
Memory size: 32760 Megabytes
root at BLCC01:/root#  echo "::memstat" | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     549618              2146    7%
ZFS File Data              668992              2613    8%
Anon                      3198732             12495   38%
Exec and libs                1411                 5    0%
Page cache                   4402                17    0%
Free (cachelist)            10578                41    0%
Free (freelist)           3950545             15431   47%

Total                     8384278             32751
Physical                  8384277             32751
root at BLCC01:/root# swap -sh
total: 12G allocated + 35M reserved = 12G used, 6.8G available
root at BLCC01:/root# swap -l
swapfile             dev    swaplo   blocks     free
/dev/zvol/dsk/rpool/swap 263,2         8  8388600  8388600
root at BLCC01:/root#


root at BLCC01:/root# prctl $$

project.max-locked-memory
       usage           12.0GB
       system          16.0EB    max   deny                                 -
project.max-port-ids
       privileged      8.19K       -   deny                                 -
       system          65.5K     max   deny                                 -
project.max-shm-memory
       privileged      8.00GB      -   deny                                 -
       system          16.0EB    max   deny                                 -






#prstat -J

PROJID    NPROC  SWAP   RSS MEMORY      TIME  CPU PROJECT
    1        5   12G   12G    38%   1:07:23 5.6% user.root
    0       43   72M   76M   0.2%   0:00:59 0.0% system
    3        5 4392K   14M   0.0%   0:00:00 0.0% default



Then I started a third VM (4G memory) and got the following error:


qemu-system-x86_64 -enable-kvm -vnc 0.0.0.0:2 -cpu host -smp 2 -m 4096 -no-hpet

qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...


I have 15G of free memory in the system, so why can't qemu-system-x86_64 lock enough memory?

Thanks for your help!

Sorry for my poor English!




-----------------------------------
fcliang




From danmcd at omniti.com  Tue Dec  1 17:11:43 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 1 Dec 2015 12:11:43 -0500
Subject: [OmniOS-discuss] qemu-system-x86_64 can not locked enough memory
In-Reply-To: <D186A49E-FE0D-43D5-833A-15F75A1D5088@baolict.com>
References: <D186A49E-FE0D-43D5-833A-15F75A1D5088@baolict.com>
Message-ID: <D6885F04-FF25-4260-94E6-11EC65C1439B@omniti.com>


> On Dec 1, 2015, at 12:03 PM, Fucai.Liang <fcliang at baolict.com> wrote:
> 
> then I start the third vm (4G memory), it got the following error :
> 
> 
> qemu-system-x86_64 -enable-kvm -vnc 0.0.0.0:2 -cpu host -smp 2 -m 4096 -no-hpet
> 
> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
> 
> 
> I got 15G free memory in the system, why qemu-system-x86_64 can not locked enough memory ?

What does "vmstat 1 5" say prior to your launch of the third VM?

Dan


From fcliang at baolict.com  Tue Dec  1 17:24:34 2015
From: fcliang at baolict.com (Fucai.Liang)
Date: Wed, 2 Dec 2015 01:24:34 +0800
Subject: [OmniOS-discuss] qemu-system-x86_64 can not locked enough memory
In-Reply-To: <D6885F04-FF25-4260-94E6-11EC65C1439B@omniti.com>
References: <D186A49E-FE0D-43D5-833A-15F75A1D5088@baolict.com>
	<D6885F04-FF25-4260-94E6-11EC65C1439B@omniti.com>
Message-ID: <13D19B56-7F73-40EB-88B7-0551A2D76316@baolict.com>


root at BLCC01:/root# vmstat 15
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr ro s0 s1 s2   in   sy   cs us sy id
 0 0 0 8327684 16628108 21 334 0 0 0  0 39  1  1 28 28 3971 22009 9713 1  6 93
 0 0 0 7135732 15842152 1 3  0  0  0  0  0  0  0 19 19 3895 22852 9855 1  5 93
 1 0 0 7135692 15842176 0 0  0  0  0  0  0  0  0 26 26 4053 22766 9903 1  5 94
 0 0 0 7135656 15842140 0 0  0  0  0  0  0  0  0 20 20 4001 22727 9858 1  5 94
  --------------> launch third VM
 1 0 0 1966932 14275356 13 98 0 0  0  0  0  0  0 20 20 4103 22893 10480 1 8 91
 0 0 0 1037608 13954964 2 30 0  0  0  0  0  0  0 19 19 4195 23059 10683 1 6 93
 0 0 0 1037280 13954636 0 0  0  0  0  0  0  0  0 24 24 4312 22948 10636 1 5 94
 0 0 0 1037112 13954468 0 0  0  0  0  0  0  0  0 21 21 4362 22927 10678 1 5 93
 0 0 0 1037288 13954644 0 0  0  0  0  0  0  0  0 19 19 4256 22897 10551 1 5 94
 0 0 0 1037412 13954768 0 0  0  0  0  0  0  0  0 19 19 4384 23172 10638 1 6 93

Thanks, Dan.


-----------------------------------
fcliang



> On Dec 2, 2015, at 1:11 AM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> 
>> On Dec 1, 2015, at 12:03 PM, Fucai.Liang <fcliang at baolict.com> wrote:
>> 
>> then I start the third vm (4G memory), it got the following error :
>> 
>> 
>> qemu-system-x86_64 -enable-kvm -vnc 0.0.0.0:2 -cpu host -smp 2 -m 4096 -no-hpet
>> 
>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>> 
>> 
>> I got 15G free memory in the system, why qemu-system-x86_64 can not locked enough memory ?
> 
> What does "vmstat 1 5" say prior to your launch of the third VM?
> 
> Dan
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20151202/bd67657e/attachment-0001.html>

From josh at sysmgr.org  Tue Dec  1 17:37:12 2015
From: josh at sysmgr.org (Joshua M. Clulow)
Date: Tue, 1 Dec 2015 09:37:12 -0800
Subject: [OmniOS-discuss] qemu-system-x86_64 can not locked enough memory
In-Reply-To: <D6885F04-FF25-4260-94E6-11EC65C1439B@omniti.com>
References: <D186A49E-FE0D-43D5-833A-15F75A1D5088@baolict.com>
	<D6885F04-FF25-4260-94E6-11EC65C1439B@omniti.com>
Message-ID: <CAEwA5n+RRjwOeDiOdAavG40u-P=yYgNznn5mAACsUKir3Si16w@mail.gmail.com>

On 1 December 2015 at 09:11, Dan McDonald <danmcd at omniti.com> wrote:
>> On Dec 1, 2015, at 12:03 PM, Fucai.Liang <fcliang at baolict.com> wrote:
>> then I start the third vm (4G memory), it got the following error :
>> qemu-system-x86_64 -enable-kvm -vnc 0.0.0.0:2 -cpu host -smp 2 -m 4096 -no-hpet
>>
>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>>
>> I got 15G free memory in the system, why qemu-system-x86_64 can not locked enough memory ?
> What does "vmstat 1 5" say prior to your launch of the third VM?

I suspect it will show you have free memory available, but that what
is really happening is getting here:

  https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/vm/seg_vn.c#L7989-L8002

This is likely failing in page_pp_lock() because "availrmem" has
fallen below "pages_pp_maximum":

  https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/vm/vm_page.c#L3817-L3818

We set this value here, though it can be overridden in "/etc/system":

  https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/vm/vm_page.c#L423-L436

You can look at the current values with mdb:

  mdb -ke 'availrmem/D ; pages_pp_maximum/D'

Increasing this value doesn't seem to be without risk: I believe that
it can lead to memory exhaustion deadlocks, amongst other things.  I
don't know if it's expected to be tuneable without a reboot.
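
As a rough sketch of the check described above (a simplification for
illustration, not the kernel's actual logic): the function names are
hypothetical, 4 KiB pages are assumed, and the numbers in the usage note
are placeholders to be replaced with the values printed by the mdb
command.

```shell
#!/bin/sh
# Hypothetical helper: pages that could still be locked under the
# simplified model "locking fails once availrmem would fall to
# pages_pp_maximum".
#   $1 = availrmem (pages), $2 = pages_pp_maximum (pages)
lockable_pages() {
    avail=$1
    pp_max=$2
    if [ "$avail" -gt "$pp_max" ]; then
        echo $((avail - pp_max))
    else
        echo 0
    fi
}

# Hypothetical helper: succeed if a request of $3 bytes fits in that
# headroom, assuming 4 KiB pages (rounds the request up to whole pages).
can_lock() {
    need=$(( ($3 + 4095) / 4096 ))
    [ "$need" -le "$(lockable_pages "$1" "$2")" ]
}
```

For example, `can_lock 800000 325044 4294967296` fails under this model,
because a 4 GiB guest needs 1048576 pages and only 474956 are lockable.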


Cheers.

-- 
Joshua M. Clulow
UNIX Admin/Developer
http://blog.sysmgr.org

From fcliang at baolict.com  Wed Dec  2 03:38:35 2015
From: fcliang at baolict.com (Fucai Liang （BLCT）)
Date: Wed, 2 Dec 2015 11:38:35 +0800
Subject: [OmniOS-discuss] qemu-system-x86_64 can not locked enough memory
In-Reply-To: <CAEwA5n+RRjwOeDiOdAavG40u-P=yYgNznn5mAACsUKir3Si16w@mail.gmail.com>
References: <D186A49E-FE0D-43D5-833A-15F75A1D5088@baolict.com>
	<D6885F04-FF25-4260-94E6-11EC65C1439B@omniti.com>
	<CAEwA5n+RRjwOeDiOdAavG40u-P=yYgNznn5mAACsUKir3Si16w@mail.gmail.com>
Message-ID: <26523C65-285A-41C3-8C5C-CD509D20F965@baolict.com>


Thanks for your help!

When the server boots up, availrmem is 7989066 pages. After I launch one VM (8G memory), availrmem decreases to 4756624.



7989066 - 4756624 = 3232442 pages

3232442 / 256 = 12626.7 MB (4K pages, so 256 pages per MB)
12626.7 / 1024 = 12.3G
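
The same arithmetic as a small shell sketch, assuming 4 KiB pages (which
matches the ::memstat output earlier in the thread, where 8384278 pages
correspond to 32751 MB). The function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert a page count to GiB with one decimal.
#   $1 = number of pages, $2 = page size in bytes (defaults to 4096)
pages_to_gib() {
    awk -v pages="$1" -v page_size="${2:-4096}" \
        'BEGIN { printf "%.1f\n", pages * page_size / (1024 * 1024 * 1024) }'
}

# Drop in availrmem after launching the 8G guest, in pages.
delta=$((7989066 - 4756624))
pages_to_gib "$delta"
```

The guest's own 8192 MB would be 2097152 pages, so the observed drop is
about 1.5 times the memory given to the VM.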



root at BLCC01:/root# mdb -ke 'availrmem/D ; pages_pp_maximum/D'
availrmem:
availrmem:      7989066
pages_pp_maximum:
pages_pp_maximum:               325044

root at BLCC01:/root# qemu-system-x86_64 -enable-kvm -vnc 0.0.0.0:12 -cpu host -smp 4 -m 8192 -no-hpe



root at BLCC01:/root# mdb -ke 'availrmem/D ; pages_pp_maximum/D'
availrmem:
availrmem:      4756624
pages_pp_maximum:
pages_pp_maximum:               325044
root at BLCC01:/root#


That means the VM uses 12.3G of availrmem. How does that happen?

Thanks!




------------------------------
fcliang




On Dec 2, 2015, at 1:37, Joshua M. Clulow <josh at sysmgr.org> wrote:

> On 1 December 2015 at 09:11, Dan McDonald <danmcd at omniti.com> wrote:
>>> On Dec 1, 2015, at 12:03 PM, Fucai.Liang <fcliang at baolict.com> wrote:
>>> then I start the third vm (4G memory), it got the following error :
>>> qemu-system-x86_64 -enable-kvm -vnc 0.0.0.0:2 -cpu host -smp 2 -m 4096 -no-hpet
>>> 
>>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>>> qemu_mlock: have only locked 1940582400 of 4294967296 bytes; still trying...
>>> 
>>> I got 15G free memory in the system, why qemu-system-x86_64 can not locked enough memory ?
>> What does "vmstat 1 5" say prior to your launch of the third VM?
> 
> I suspect it will show you have free memory available, but that what
> is really happening is getting here:
> 
>  https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/vm/seg_vn.c#L7989-L8002
> 
> This is likely failing in page_pp_lock() because "availrmem" has
> fallen below "pages_pp_maximum":
> 
>  https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/vm/vm_page.c#L3817-L3818
> 
> We set this value here, though it can be overridden in "/etc/system":
> 
>  https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/vm/vm_page.c#L423-L436
> 
> You can look at the current values with mdb:
> 
>  mdb -ke 'availrmem/D ; pages_pp_maximum/D'
> 
> Increasing this value doesn't seem to be without risk: I believe that
> it can lead to memory exhaustion deadlocks, amongst other things.  I
> don't know if it's expected to be tuneable without a reboot.
> 
> 
> Cheers.
> 
> -- 
> Joshua M. Clulow
> UNIX Admin/Developer
> http://blog.sysmgr.org


From omnios at citrus-it.net  Wed Dec  2 15:01:21 2015
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Wed, 2 Dec 2015 15:01:21 +0000 (UTC)
Subject: [OmniOS-discuss] PCRE version
Message-ID: <alpine.GSO.2.00.1512021456290.9809@erncre.pvgehf-vg.arg>


Whilst playing around with the latest version of ClamAV I notice that it
now prints this warning:

configure: WARNING: The installed pcre version may contain a security bug.
Please upgrade to 8.38 or later: http://www.pcre.org.

There is some information on the security fixes in 8.38 at
https://blog.fuzzing-project.org/29-Heap-Overflow-in-PCRE.html but I
couldn't find anything specific on the Exim mailing list.

May be worth bumping the PCRE version.

Andy
-- 
Citrus IT Limited | +44 (0)870 199 8000 | enquiries at citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123


From danmcd at omniti.com  Wed Dec  2 15:09:15 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 2 Dec 2015 10:09:15 -0500
Subject: [OmniOS-discuss] PCRE version
In-Reply-To: <alpine.GSO.2.00.1512021456290.9809@erncre.pvgehf-vg.arg>
References: <alpine.GSO.2.00.1512021456290.9809@erncre.pvgehf-vg.arg>
Message-ID: <8AA43FEA-E2C9-4169-9A74-3A12A639846A@omniti.com>


> On Dec 2, 2015, at 10:01 AM, Andy Fiddaman <omnios at citrus-it.net> wrote:
> 
> May be worth bumping the PCRE version.

Sure is.  I followed this, but wasn't sure how deeply it might affect the user base.  Someone's asking, so I'll take care of it.  I'll have to bump it on all the releases.

Watch for it later today.

Dan


From danmcd at omniti.com  Wed Dec  2 16:21:10 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 2 Dec 2015 11:21:10 -0500
Subject: [OmniOS-discuss] PCRE now updated for OmniOS
Message-ID: <EF744BD0-2A5F-4B6B-AF46-FAEC0A8A90C0@omniti.com>

Several CVEs have been filed against PCRE (Perl Compatible Regular Expressions).  All supported versions of OmniOS (r151006, r151014, and r151016) have updates of PCRE to version 8.38.

This is a non-reboot-needed update, but you may need to restart certain services, especially those not provided by the system.

Thanks,
Dan


From cks at cs.toronto.edu  Wed Dec  2 19:16:59 2015
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Wed, 02 Dec 2015 14:16:59 -0500
Subject: [OmniOS-discuss] What's the best way to detect OmniOS version,
	specifically r151014?
Message-ID: <20151202191659.3635A7A0875@apps0.cs.toronto.edu>

 We have at least one shell script that needs to know if it's running
on a host with OmniOS r151014 versus a host with an earlier OmniOS
version (due to the change in ZFS pool reservations from 1/64th of the
pool to 1/32nd of the pool that we picked up with r151014). Is there
any particular good way for a shell script to determine this, ideally
in a lightweight way and without requiring root permissions?

 Thanks in advance.

	- cks

From ikaufman at eng.ucsd.edu  Wed Dec  2 19:19:53 2015
From: ikaufman at eng.ucsd.edu (Ian Kaufman)
Date: Wed, 2 Dec 2015 11:19:53 -0800
Subject: [OmniOS-discuss] What's the best way to detect OmniOS version,
 specifically r151014?
In-Reply-To: <20151202191659.3635A7A0875@apps0.cs.toronto.edu>
References: <20151202191659.3635A7A0875@apps0.cs.toronto.edu>
Message-ID: <CAPJtH1iHEutXunQ=OLsb4MuStHOWO6hu2vxT+=kGxLdkuGb=UA@mail.gmail.com>

Examine /etc/release?

Ian

On Wed, Dec 2, 2015 at 11:16 AM, Chris Siebenmann <cks at cs.toronto.edu>
wrote:

>  We have at least one shell script that needs to know if it's running
> on a host with OmniOS r151014 versus a host with an earlier OmniOS
> version (due to the change in ZFS pool reservations from 1/64th of the
> pool to 1/32nd of the pool that we picked up with r151014). Is there
> any particular good way for a shell script to determine this, ideally
> in a lightweight way and without requiring root permissions?
>
>  Thanks in advance.
>
>         - cks
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>



-- 
Ian Kaufman
Research Systems Administrator
UC San Diego, Jacobs School of Engineering ikaufman AT ucsd DOT edu

From danmcd at omniti.com  Wed Dec  2 19:22:37 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 2 Dec 2015 14:22:37 -0500
Subject: [OmniOS-discuss] What's the best way to detect OmniOS version,
	specifically r151014?
In-Reply-To: <20151202191659.3635A7A0875@apps0.cs.toronto.edu>
References: <20151202191659.3635A7A0875@apps0.cs.toronto.edu>
Message-ID: <A226A760-3413-409F-8192-0C3349DFD2BD@omniti.com>


> On Dec 2, 2015, at 2:16 PM, Chris Siebenmann <cks at cs.toronto.edu> wrote:
> 
> We have at least one shell script that needs to know if it's running
> on a host with OmniOS r151014 versus a host with an earlier OmniOS
> version (due to the change in ZFS pool reservations from 1/64th of the
> pool to 1/32nd of the pool that we picked up with r151014). Is there
> any particular good way for a shell script to determine this, ideally
> in a lightweight way and without requiring root permissions?
> 
> Thanks in advance.

/etc/release is the stable interface.  We use it ourselves in the omniti-ms gate:

	# Determine what release we're running as that affects some versions of things
	RELEASE=$(head -1 /etc/release | awk '{ print $3 }') 
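
A minimal sketch of how a script might build on this, assuming the first
line of /etc/release always carries the release as its third field in the
form rNNNNNN (for example "OmniOS v11 r151016"). The function names are
hypothetical and not part of the omniti-ms gate.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric part of the rNNNNNN token
# found in the third field of the given line, e.g. 151014.
release_number() {
    echo "$1" | awk '{ sub(/^r/, "", $3); print $3 }'
}

# Hypothetical helper: succeed if the release on the given line is
# r151014 or later. Fails if the third field is not numeric after
# stripping the leading "r".
at_least_r151014() {
    [ "$(release_number "$1")" -ge 151014 ]
}

# Usage on a live system (needs no root permissions):
#   if at_least_r151014 "$(head -1 /etc/release)"; then
#       echo "r151014 or later"
#   fi
```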

Hope this helps,
Dan


From cks at cs.toronto.edu  Wed Dec  2 19:23:09 2015
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Wed, 02 Dec 2015 14:23:09 -0500
Subject: [OmniOS-discuss] What's the best way to detect OmniOS version,
	specifically r151014?
In-Reply-To: ikaufman's message of Wed, 02 Dec 2015 11:19:53 -0800.
	<CAPJtH1iHEutXunQ=OLsb4MuStHOWO6hu2vxT+=kGxLdkuGb=UA@mail.gmail.com>
Message-ID: <20151202192309.BCCCE7A0875@apps0.cs.toronto.edu>

> On Wed, Dec 2, 2015 at 11:16 AM, Chris Siebenmann <cks at cs.toronto.edu>
> wrote:
> >  We have at least one shell script that needs to know if it's
> > running on a host with OmniOS r151014 versus a host with an earlier
> > OmniOS version (due to the change in ZFS pool reservations from
> > 1/64th of the pool to 1/32nd of the pool that we picked up with
> > r151014). Is there any particular good way for a shell script to
> > determine this, ideally in a lightweight way and without requiring
> > root permissions?
>
> Examine /etc/release?

 Somehow I missed that file. This looks like exactly what I want;
I can easily match against the first line. Thank you.

	- cks

From doug at will.to  Wed Dec  2 19:34:28 2015
From: doug at will.to (Doug Hughes)
Date: Wed, 2 Dec 2015 14:34:28 -0500
Subject: [OmniOS-discuss] What's the best way to detect OmniOS version,
 specifically r151014?
In-Reply-To: <20151202191659.3635A7A0875@apps0.cs.toronto.edu>
References: <20151202191659.3635A7A0875@apps0.cs.toronto.edu>
Message-ID: <6519793c-24ba-4446-b8da-576b9e3697e8.maildroid@localhost>

You can look at /etc/release. It's on the 1st line.


Sent from my android device.

-----Original Message-----
From: Chris Siebenmann <cks at cs.toronto.edu>
To: omnios-discuss at lists.omniti.com
Sent: Wed, 02 Dec 2015 14:30
Subject: [OmniOS-discuss] What's the best way to detect OmniOS version, specifically r151014?

 We have at least one shell script that needs to know if it's running
on a host with OmniOS r151014 versus a host with an earlier OmniOS
version (due to the change in ZFS pool reservations from 1/64th of the
pool to 1/32nd of the pool that we picked up with r151014). Is there
any particular good way for a shell script to determine this, ideally
in a lightweight way and without requiring root permissions?

 Thanks in advance.

	- cks

From peter.tribble at gmail.com  Thu Dec  3 10:35:58 2015
From: peter.tribble at gmail.com (Peter Tribble)
Date: Thu, 3 Dec 2015 10:35:58 +0000
Subject: [OmniOS-discuss] PCRE now updated for OmniOS
In-Reply-To: <EF744BD0-2A5F-4B6B-AF46-FAEC0A8A90C0@omniti.com>
References: <EF744BD0-2A5F-4B6B-AF46-FAEC0A8A90C0@omniti.com>
Message-ID: <CAEgYsbHJfVTWpasn5CQLT=6UgLM+wTt9gQVTxR6JeZ5hGpyE1w@mail.gmail.com>

On Wed, Dec 2, 2015 at 4:21 PM, Dan McDonald <danmcd at omniti.com> wrote:

> Several CVEs have been filed against PCRE (Perl Compatible Regular
> Expressions).  All supported versions of OmniOS (r151006, r151014, and
> r151016) have updates of PCRE to version 8.38.
>
> This is a non-reboot-needed update, but you may need to restart certain
> services, especially those not provided by the system.
>

That's not entirely true, unfortunately. The versions appear to be
overconstrained, so you need to update entire and omnios-userland
in order to see the updated pcre packages (which will require a reboot
if you aren't current). All my machines simply tell me that no updates
are available.

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/

From danmcd at omniti.com  Thu Dec  3 14:05:08 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 3 Dec 2015 09:05:08 -0500
Subject: [OmniOS-discuss] PCRE now updated for OmniOS
In-Reply-To: <CAEgYsbHJfVTWpasn5CQLT=6UgLM+wTt9gQVTxR6JeZ5hGpyE1w@mail.gmail.com>
References: <EF744BD0-2A5F-4B6B-AF46-FAEC0A8A90C0@omniti.com>
	<CAEgYsbHJfVTWpasn5CQLT=6UgLM+wTt9gQVTxR6JeZ5hGpyE1w@mail.gmail.com>
Message-ID: <912F6657-4DDD-4DE5-BD64-C45D6E467733@omniti.com>




Sent from my iPhone (typos, autocorrect, and all)
> On Dec 3, 2015, at 5:35 AM, Peter Tribble <peter.tribble at gmail.com> wrote:
> 
> (which will require a reboot
> if you aren't current)

Much of the entire and OmniOS-userland constraints could be loosened, and at least in a few cases, they have.

As for not being current, I assume most folks stay up to date.  The most recent reboot-required update included potential security fixes, so I (perhaps incorrectly) assume people take their machines to updates when I release them.  The one mentioned here:

http://lists.omniti.com/pipermail/omnios-discuss/2015-November/005950.html

Included a KVM driver update that closed a hole, eg.

Dan



From danmcd at omniti.com  Thu Dec  3 21:18:18 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 3 Dec 2015 16:18:18 -0500
Subject: [OmniOS-discuss] OpenSSL updates for OmniOS
Message-ID: <936B7914-AD43-4B35-AA23-2102C105F126@omniti.com>

OpenSSL 1.0.2e is now available for LTS (r151014), Stable (r151016), and will also be ready for the next large update of bloody.

OpenSSL 1.0.1q is now available for old-LTS (r151006).

Additionally, LTS receives a bump to wget, to work better with modern HTTPS servers, and old-LTS gets a bump in "entire", due to a previous packaging error.

These are SECURITY FIXES and you should "pkg update" as soon as possible.

Dan


From omnios at citrus-it.net  Thu Dec  3 23:03:11 2015
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Thu, 3 Dec 2015 23:03:11 +0000 (UTC)
Subject: [OmniOS-discuss] PCRE now updated for OmniOS
In-Reply-To: <EF744BD0-2A5F-4B6B-AF46-FAEC0A8A90C0@omniti.com>
References: <EF744BD0-2A5F-4B6B-AF46-FAEC0A8A90C0@omniti.com>
Message-ID: <alpine.GSO.2.00.1512032302550.22669@erncre.pvgehf-vg.arg>


On Wed, 2 Dec 2015, Dan McDonald wrote:

; Several CVEs have been filed against PCRE (Perl Compatible Regular Expressions).  All supported versions of OmniOS (r151006, r151014, and r151016) have updates of PCRE to version 8.38.
;
; This is a non-reboot-needed update, but you may need to restart certain services, especially those not provided by the system.

Thanks Dan. Quick as ever!

Andy
-- 
Citrus IT Limited | +44 (0)870 199 8000 | enquiries at citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123


From paladinemishakal at gmail.com  Fri Dec  4 09:38:52 2015
From: paladinemishakal at gmail.com (Lawrence Giam)
Date: Fri, 4 Dec 2015 17:38:52 +0800
Subject: [OmniOS-discuss] core dump while trying to import pool
Message-ID: <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com>

Hi All,

I have a problem: I am upgrading my server OS from OpenIndiana 151a7 to
OmniOS R151014. While working on the upgrade, I detached the SAS expander
from the main chassis, and the installation proceeded fine.

When the upgrade was done, I reconnected the SAS expander. I have 2 pools:
one is on the main chassis and the other is on the SAS expander.
When I tried to import the pool on the main chassis, the system core
dumped and rebooted.

I removed the SAS expander and attempted to boot with only the main chassis.
The system displayed this on the screen:
svc.startd[10]: svc:/system/boot-archive:default: Method or service exit
timed out. Killing contract 15.
svc.startd[10]: svc:/system/boot-archive:default: Method
"/lib/svc/method/boot-archive" failed due to signal KILL.

console login: Reading ZFS config: done.
Mounting ZFS filesystems: (43/1018)

After a while, the system core dumped again:
ffffff003ee4a170 unix:die+df ()
ffffff003ee4a280 unix:trap+db3 ()
ffffff003ee4a290 unix:cmntrap+e6 ()
ffffff003ee4a3d0 zfs:zap_leaf_lookup_closest+45 ()
ffffff003ee4a470 zfs:fzap_cursor_retrieve+bb ()
ffffff003ee4a510 zfs:zap_cursor_retrieve+11e ()
ffffff003ee4a700 zfs:zfs_purgedir+67 ()
ffffff003ee4a750 zfs:zfs_rmnode+202 ()
ffffff003ee4a790 zfs:zfs_zinactive+e8 ()
ffffff003ee4a7f0 zfs:zfs_inactive+75 ()
ffffff003ee4a850 genunix:fop_inactive+76 ()
ffffff003ee4a880 genunix:vn_rele+82 ()
ffffff003ee4aa70 zfs:zfs_unlinked_drain+aa ()
ffffff003ee4aab0 zfs:zfsvfs_setup+e8 ()
ffffff003ee4ab10 zfs:zfs_domount+131 ()
ffffff003ee4ac40 zfs:zfs_mount+24f ()
ffffff003ee4ac70 genunix:fsop_mount+1e ()
ffffff003ee4ad70 genunix:domount+86b ()
ffffff003ee4ae80 genunix:mount+167 ()
ffffff003ee4aec0 genunix:syscall_ap+94 ()
ffffff003ee4af10 unix:brand_sys_sysenter+1c9 ()

syncing file systems.... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel

For now, I have booted into single-user mode.

I need help urgently, what should I do next?

Thanks & Regards,
Lawrence.

From paladinemishakal at gmail.com  Fri Dec  4 10:40:18 2015
From: paladinemishakal at gmail.com (Lawrence Giam)
Date: Fri, 4 Dec 2015 18:40:18 +0800
Subject: [OmniOS-discuss] core dump while trying to import pool
In-Reply-To: <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com>
References: <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com>
Message-ID: <CAGueQCdBHv8-oOnfPrLDCRE40xVUL_S52JRf8gfJgJ=FOrOe5g@mail.gmail.com>

Not sure if this is linked to what I am facing:

lawrence at sgsan7r:/export/home/lawrence$ fmdump -Vp -u 036b26cc-a99a-c9a5-9a1e-df89eef1be5d
TIME                           UUID                                 SUNW-MSG-ID
Dec 04 2015 17:50:58.724277000 036b26cc-a99a-c9a5-9a1e-df89eef1be5d SUNOS-8000-KL

  TIME                 CLASS                                 ENA
  Dec 04 17:50:58.7180 ireport.os.sunos.panic.dump_available 0x0000000000000000
  Dec 04 17:50:53.9974 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000

nvlist version: 0
        version = 0x0
        class = list.suspect
        uuid = 036b26cc-a99a-c9a5-9a1e-df89eef1be5d
        code = SUNOS-8000-KL
        diag-time = 1449222658 719027
        de = fmd:///module/software-diagnosis
        fault-list-sz = 0x1
        fault-list = (array of embedded nvlists)
        (start fault-list[0])
        nvlist version: 0
                version = 0x0
                class = defect.sunos.kernel.panic
                certainty = 0x64
                asru =
sw:///:path=/var/crash/unknown/.036b26cc-a99a-c9a5-9a1e-df89eef1be5d
                resource =
sw:///:path=/var/crash/unknown/.036b26cc-a99a-c9a5-9a1e-df89eef1be5d
                savecore-succcess = 1
                dump-dir = /var/crash/unknown
                dump-files = vmdump.0
                os-instance-uuid = 036b26cc-a99a-c9a5-9a1e-df89eef1be5d
                panicstr = BAD TRAP: type=e (#pf Page fault)
rp=ffffff003ee4a290 addr=20 occurred in module "zfs" due to a NULL pointer
dereference
                panicstack = unix:die+df () | unix:trap+db3 () |
unix:cmntrap+e6 () | zfs:zap_leaf_lookup_closest+45 () |
zfs:fzap_cursor_retrieve+bb () | zfs:zap_cursor_retrieve+11e () |
zfs:zfs_purgedir+67 () | zfs:zfs_rmnode+202 () | zfs:zfs_zinactive+e8 () |
zfs:zfs_inactive+75 () | genunix:fop_inactive+76 () | genunix:vn_rele+82 ()
| zfs:zfs_unlinked_drain+aa () | zfs:zfsvfs_setup+e8 () |
zfs:zfs_domount+131 () | zfs:zfs_mount+24f () | genunix:fsop_mount+1e () |
genunix:domount+86b () | genunix:mount+167 () | genunix:syscall_ap+94 () |
unix:brand_sys_sysenter+1c9 () |
                crashtime = 1449220361
                panic-time = Fri Dec  4 17:12:41 2015 SGT
        (end fault-list[0])

        fault-status = 0x1
        severity = Major
        __ttl = 0x1
        __tod = 0x56616202 0x2b2b9708

On Fri, Dec 4, 2015 at 5:38 PM, Lawrence Giam <paladinemishakal at gmail.com>
wrote:

> Hi All,
>
> I have a problem here is that I am upgrading my server OS from OpenIndiana
> 151a7 to OmniOS R151014. While working on the upgrade, I have detached the
> sas expander from the main chassis and the installation was proceeding fine.
>
> When the upgrade is done, I connect back the SAS expander. I have 2 pool
> which one is on the main chassis and the other one is on the SAS expander.
> When I was trying to import the pool on the main chassis, the system core
> dump and rebooted.
>
> I removed the SAS expander and attempt to boot with the main chassis. The
> system displayed this on the screen:
> svc.startd[10]: svc:/system/boot-archive:default: Method or service exit
> timed out. Killing contract 15.
> svc.startd[10]: svc:/system/boot-archive:default: Method
> "/lib/svc/method/boot-archive" failed due to signal KILL.
>
> console login: Reading ZFS config: done.
> Mounting ZFS filesystems: (43/1018)
>
> After a while, the system core dump again
> ffffff003ee4a170 unix:die+df ()
> ffffff003ee4a280 unix:trap+db3 ()
> ffffff003ee4a290 unix:cmntrap+e6 ()
> ffffff003ee4a3d0 zfs:zap_leaf_lookup_closest+45 ()
> ffffff003ee4a470 zfs:fzap_cursor_retrieve+bb ()
> ffffff003ee4a510 zfs:zap_cursor_retrieve+11e ()
> ffffff003ee4a700 zfs:zfs_purgedir+67 ()
> ffffff003ee4a750 zfs:zfs_rmnode+202 ()
> ffffff003ee4a790 zfs:zfs_zinactive+e8 ()
> ffffff003ee4a7f0 zfs:zfs_inactive+75 ()
> ffffff003ee4a850 genunix:fop_inactive+76 ()
> ffffff003ee4a880 genunix:vn_rele+82 ()
> ffffff003ee4aa70 zfs:zfs_unlinked_drain+aa ()
> ffffff003ee4aab0 zfs:zfsvfs_setup+e8 ()
> ffffff003ee4ab10 zfs:zfs_domount+131 ()
> ffffff003ee4ac40 zfs:zfs_mount+24f ()
> ffffff003ee4ac70 genunix:fsop_mount+1e ()
> ffffff003ee4ad70 genunix:domount+86b ()
> ffffff003ee4ae80 genunix:mount+167 ()
> ffffff003ee4aec0 genunix:syscall_ap+94 ()
> ffffff003ee4af10 unix:brand_sys_sysenter+1c9 ()
>
> syncing file systems.... done
> dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
>
> Now I have booted into single-user mode.
>
> I need help urgently. What should I do next?
>
> Thanks & Regards,
> Lawrence.
>

From jdg117 at elvis.arl.psu.edu  Fri Dec  4 12:54:31 2015
From: jdg117 at elvis.arl.psu.edu (John D Groenveld)
Date: Fri, 04 Dec 2015 07:54:31 -0500
Subject: [OmniOS-discuss] core dump while trying to import pool
In-Reply-To: Your message of "Fri, 04 Dec 2015 17:38:52 +0800."
	<CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com> 
References: <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com>
Message-ID: <201512041254.tB4CsVBJ005325@elvis.arl.psu.edu>

In message <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A at mail.gmail.com>
, Lawrence Giam writes:
>151a7 to OmniOS R151014. During the upgrade I detached the
>SAS expander from the main chassis, and the installation proceeded fine.

If you boot 151016 installation media to single-user, can you
import your pools?

John
groenveld at acm.org


From danmcd at omniti.com  Fri Dec  4 15:42:34 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Fri, 4 Dec 2015 10:42:34 -0500
Subject: [OmniOS-discuss] core dump while trying to import pool
In-Reply-To: <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com>
References: <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com>
Message-ID: <71C5258A-C99E-44DF-BFE1-A1D5EE0CE686@omniti.com>


> On Dec 4, 2015, at 4:38 AM, Lawrence Giam <paladinemishakal at gmail.com> wrote:
> 
> Hi All,
> 
> I have a problem: I am upgrading my server OS from OpenIndiana 151a7 to OmniOS R151014. During the upgrade I detached the SAS expander from the main chassis, and the installation proceeded fine.
> 
> When the upgrade was done, I reconnected the SAS expander. I have two pools: one is on the main chassis and the other is on the SAS expander. When I tried to import the pool on the main chassis, the system dumped core and rebooted.

I've seen this stack before:

> After a while, the system dumped core again:
> ffffff003ee4a170 unix:die+df ()
> ffffff003ee4a280 unix:trap+db3 ()
> ffffff003ee4a290 unix:cmntrap+e6 ()
> ffffff003ee4a3d0 zfs:zap_leaf_lookup_closest+45 ()
> ffffff003ee4a470 zfs:fzap_cursor_retrieve+bb ()
> ffffff003ee4a510 zfs:zap_cursor_retrieve+11e ()
> ffffff003ee4a700 zfs:zfs_purgedir+67 ()
> ffffff003ee4a750 zfs:zfs_rmnode+202 ()
> ffffff003ee4a790 zfs:zfs_zinactive+e8 ()
> ffffff003ee4a7f0 zfs:zfs_inactive+75 ()
> ffffff003ee4a850 genunix:fop_inactive+76 ()
> ffffff003ee4a880 genunix:vn_rele+82 ()
> ffffff003ee4aa70 zfs:zfs_unlinked_drain+aa ()
> ffffff003ee4aab0 zfs:zfsvfs_setup+e8 ()
> ffffff003ee4ab10 zfs:zfs_domount+131 ()
> ffffff003ee4ac40 zfs:zfs_mount+24f ()
> ffffff003ee4ac70 genunix:fsop_mount+1e ()
> ffffff003ee4ad70 genunix:domount+86b ()
> ffffff003ee4ae80 genunix:mount+167 ()
> ffffff003ee4aec0 genunix:syscall_ap+94 ()
> ffffff003ee4af10 unix:brand_sys_sysenter+1c9 ()

Tell me, do you have an L2ARC on this pool?

And John's suggestion is a very good one:  Boot the 016 ISO, and see if a vanilla "zpool import <big-pool>" causes problems.

Dan



From paladinemishakal at gmail.com  Fri Dec  4 15:53:06 2015
From: paladinemishakal at gmail.com (Lawrence Giam)
Date: Fri, 4 Dec 2015 23:53:06 +0800
Subject: [OmniOS-discuss] core dump while trying to import pool
In-Reply-To: <71C5258A-C99E-44DF-BFE1-A1D5EE0CE686@omniti.com>
References: <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com>
	<71C5258A-C99E-44DF-BFE1-A1D5EE0CE686@omniti.com>
Message-ID: <CAGueQCducyk7VNDkVGoJSj6WNfbtng4+E5126zFoXp0nOwRwBQ@mail.gmail.com>

Hi Dan,

No, I do not have an L2ARC on this server. This server is used to receive
"zfs send" streams, so I have a very large ZFS filesystem.

Should I cancel the scrub and try the method that John suggested?

Regards.

On Fri, Dec 4, 2015 at 11:42 PM, Dan McDonald <danmcd at omniti.com> wrote:

>
> > On Dec 4, 2015, at 4:38 AM, Lawrence Giam <paladinemishakal at gmail.com>
> wrote:
> >
> > Hi All,
> >
> > I have a problem: I am upgrading my server OS from
> OpenIndiana 151a7 to OmniOS R151014. During the upgrade I
> detached the SAS expander from the main chassis, and the installation
> proceeded fine.
> >
> > When the upgrade was done, I reconnected the SAS expander. I have two pools:
> one is on the main chassis and the other is on the SAS expander.
> When I tried to import the pool on the main chassis, the system dumped
> core and rebooted.
>
> I've seen this stack before:
>
> > After a while, the system dumped core again:
> > ffffff003ee4a170 unix:die+df ()
> > ffffff003ee4a280 unix:trap+db3 ()
> > ffffff003ee4a290 unix:cmntrap+e6 ()
> > ffffff003ee4a3d0 zfs:zap_leaf_lookup_closest+45 ()
> > ffffff003ee4a470 zfs:fzap_cursor_retrieve+bb ()
> > ffffff003ee4a510 zfs:zap_cursor_retrieve+11e ()
> > ffffff003ee4a700 zfs:zfs_purgedir+67 ()
> > ffffff003ee4a750 zfs:zfs_rmnode+202 ()
> > ffffff003ee4a790 zfs:zfs_zinactive+e8 ()
> > ffffff003ee4a7f0 zfs:zfs_inactive+75 ()
> > ffffff003ee4a850 genunix:fop_inactive+76 ()
> > ffffff003ee4a880 genunix:vn_rele+82 ()
> > ffffff003ee4aa70 zfs:zfs_unlinked_drain+aa ()
> > ffffff003ee4aab0 zfs:zfsvfs_setup+e8 ()
> > ffffff003ee4ab10 zfs:zfs_domount+131 ()
> > ffffff003ee4ac40 zfs:zfs_mount+24f ()
> > ffffff003ee4ac70 genunix:fsop_mount+1e ()
> > ffffff003ee4ad70 genunix:domount+86b ()
> > ffffff003ee4ae80 genunix:mount+167 ()
> > ffffff003ee4aec0 genunix:syscall_ap+94 ()
> > ffffff003ee4af10 unix:brand_sys_sysenter+1c9 ()
>
> Tell me, do you have an L2ARC on this pool?
>
> And John's suggestion is a very good one:  Boot the 016 ISO, and see if a
> vanilla "zpool import <big-pool>" causes problems.
>
> Dan
>
>
>

From danmcd at omniti.com  Fri Dec  4 15:56:20 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Fri, 4 Dec 2015 10:56:20 -0500
Subject: [OmniOS-discuss] core dump while trying to import pool
In-Reply-To: <CAGueQCducyk7VNDkVGoJSj6WNfbtng4+E5126zFoXp0nOwRwBQ@mail.gmail.com>
References: <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com>
	<71C5258A-C99E-44DF-BFE1-A1D5EE0CE686@omniti.com>
	<CAGueQCducyk7VNDkVGoJSj6WNfbtng4+E5126zFoXp0nOwRwBQ@mail.gmail.com>
Message-ID: <589C8043-C3E2-4249-99E8-AA5A35E17892@omniti.com>


> On Dec 4, 2015, at 10:53 AM, Lawrence Giam <paladinemishakal at gmail.com> wrote:
> 
> Should I cancel the scrub and try the method that John suggested?
> 

I'd let the scrub run to be sure.  If it's the class of bug I'm thinking, though, scrub won't catch it.  :(

And if you can provide one of those r151014 core dumps, that'd be great.  If this pool has confidential data, though, I can understand why not.

Dan


From mtalbott at lji.org  Fri Dec  4 19:33:09 2015
From: mtalbott at lji.org (Michael Talbott)
Date: Fri, 4 Dec 2015 11:33:09 -0800
Subject: [OmniOS-discuss] core dump while trying to import pool
In-Reply-To: <589C8043-C3E2-4249-99E8-AA5A35E17892@omniti.com>
References: <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com>
	<71C5258A-C99E-44DF-BFE1-A1D5EE0CE686@omniti.com>
	<CAGueQCducyk7VNDkVGoJSj6WNfbtng4+E5126zFoXp0nOwRwBQ@mail.gmail.com>
	<589C8043-C3E2-4249-99E8-AA5A35E17892@omniti.com>
Message-ID: <CAGP7N4PCESWoS-Et+-N7mpHCa_WwPyZ6=CkPD7Jp+-8YaMfOVQ@mail.gmail.com>

I also came upon this same issue after rebooting one of my OmniOS machines.
I did have l2arc devices on my pool until the announcement of the bug
found. At that point I immediately removed my l2arc devices and didn't
reboot the machine until a convenient time where if something bad were to
happen I could manage it. Well, it was good I planned for that reboot ;)

I was able to boot in single-user mode, delete the pool cache file, reboot,
import without mounting (zpool import -N <pool>), and then scrub. The scrub
fixed 16 KB of data in my 254 TB pool. I then exported the pool and imported
it read-write, only to discover that the scrub had not fixed the problem at
all. Importing as read-only allows proper mounting to pull data off.

The problem for me stemmed from mounting 1 of my 52 filesystems as rw. I
was able to mount the filesystems one by one after a zpool import -N to
discover which filesystem was causing the issue.
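
A minimal sketch of the one-by-one mounting step described above, not the
exact commands that were run. It assumes the pool has already been imported
with "zpool import -N <pool>" and that dataset names are fed in on stdin (for
example from "zfs list -H -o name -r <pool>"). The function name
mount_one_by_one and the RUN dry-run variable are made up for illustration;
with RUN left at its default the script only prints the commands.

```shell
#!/bin/sh
# RUN is a hypothetical dry-run hook: the default "echo" only prints each
# command; set RUN="" in the environment to really execute them.
RUN=${RUN-echo}

# mount_one_by_one: read dataset names on stdin and mount them one at a time.
mount_one_by_one() {
    while read -r fs; do
        [ -n "$fs" ] || continue
        # Announce the dataset *before* mounting it, so that if the system
        # panics the last name logged is the suspect filesystem.
        echo "about to mount: $fs" >&2
        $RUN zfs mount "$fs" || echo "mount failed: $fs" >&2
    done
}
```

Typical use would be something like
"zfs list -H -o name -r <pool> | mount_one_by_one", with stderr sent to the
console or to a file on a pool that is known to be healthy.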

I'm still rsync'ing the problem filesystem out since, as luck would have it,
it was the only one that I wasn't replicating out (probably a good thing,
considering), since I used it as a scratch drive. My plan is to destroy
and then recreate the problem fs after the sync finishes and rsync it back,
and cross my fingers that the problem doesn't come back or get worse.

The problem I'm seeing that causes this is:
BAD TRAP: type=e (#pf Page fault) rp=ffffff00f5cee290 addr=20 occurred in
module "zfs" due to a NULL pointer dereference



Here are the details of my crash, which appears to be the same as yours:



root at store2:/var/crash/unknown# mdb unix.2 vmcore.2
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc apix
scsi_vhci zfs mr_sas sd ip hook neti sockfs arp usba stmf stmf_sbd random
md lofs idm sata cpc crypto kvm mpt_sas ufs logindmux nsmb ptm smbsrv nfs
ipc ]
> $c
zap_leaf_lookup_closest+0x45(ffffff223e7bd290, 0, 0, ffffff00f5cee3f0)
fzap_cursor_retrieve+0xbb(ffffff223e7bd290, ffffff00f5cee650,
ffffff00f5cee530)
zap_cursor_retrieve+0x11e(ffffff00f5cee650, ffffff00f5cee530)
zfs_purgedir+0x67(ffffff2232f41bc0)
zfs_rmnode+0x202(ffffff2232f41bc0)
zfs_zinactive+0xe8(ffffff2232f41bc0)
zfs_inactive+0x75(ffffff2232f44640, ffffff221918b468, 0)
fop_inactive+0x76(ffffff2232f44640, ffffff221918b468, 0)
vn_rele+0x82(ffffff2232f44640)
zfs_unlinked_drain+0xaa(ffffff21f254d000)
zfsvfs_setup+0xe8(ffffff21f254d000, 1)
zfs_domount+0x131(ffffff223d709368, ffffff222916fd80)
zfs_mount+0x24f(ffffff223d709368, ffffff21f2645400, ffffff00f5ceee00,
ffffff221918b468)
fsop_mount+0x1e(ffffff223d709368, ffffff21f2645400, ffffff00f5ceee00,
ffffff221918b468)
domount+0x86b(0, ffffff00f5ceee00, ffffff21f2645400, ffffff221918b468,
ffffff00f5ceee40)
mount+0x167(ffffff2228e61c38, ffffff00f5ceee90)
syscall_ap+0x94()
_sys_sysenter_post_swapgs+0x149()
> ::status
debugging crash dump vmcore.2 (64-bit) from store2
operating system: 5.11 omnios-8322307 (i86pc)
image uuid: 69a1d6dd-f13a-627d-c2a0-b00c9e50a800
panic message:
BAD TRAP: type=e (#pf Page fault) rp=ffffff00f5cee290 addr=20 occurred in
module "zfs" due to a NULL pointer dereference
dump content: kernel pages only
> ::stack
zap_leaf_lookup_closest+0x45(ffffff223e7bd290, 0, 0, ffffff00f5cee3f0)
fzap_cursor_retrieve+0xbb(ffffff223e7bd290, ffffff00f5cee650,
ffffff00f5cee530)
zap_cursor_retrieve+0x11e(ffffff00f5cee650, ffffff00f5cee530)
zfs_purgedir+0x67(ffffff2232f41bc0)
zfs_rmnode+0x202(ffffff2232f41bc0)
zfs_zinactive+0xe8(ffffff2232f41bc0)
zfs_inactive+0x75(ffffff2232f44640, ffffff221918b468, 0)
fop_inactive+0x76(ffffff2232f44640, ffffff221918b468, 0)
vn_rele+0x82(ffffff2232f44640)
zfs_unlinked_drain+0xaa(ffffff21f254d000)
zfsvfs_setup+0xe8(ffffff21f254d000, 1)
zfs_domount+0x131(ffffff223d709368, ffffff222916fd80)
zfs_mount+0x24f(ffffff223d709368, ffffff21f2645400, ffffff00f5ceee00,
ffffff221918b468)
fsop_mount+0x1e(ffffff223d709368, ffffff21f2645400, ffffff00f5ceee00,
ffffff221918b468)
domount+0x86b(0, ffffff00f5ceee00, ffffff21f2645400, ffffff221918b468,
ffffff00f5ceee40)
mount+0x167(ffffff2228e61c38, ffffff00f5ceee90)
syscall_ap+0x94()
_sys_sysenter_post_swapgs+0x149()
> ::panicinfo
             cpu                3
          thread ffffff21f2968440
         message
BAD TRAP: type=e (#pf Page fault) rp=ffffff00f5cee290 addr=20 occurred in
module "zfs" due to a NULL pointer dereference
             rdi ffffff223e7bd290
             rsi                0
             rdx                8
             rcx         4170d6eb
              r8 ffffff00f5cee3f0
              r9 ffffff00f5cee1c8
             rax         4170d6f0
             rbx ffffff00f5cee650
             rbp ffffff00f5cee3d0
             r10 fffffffffb854358
             r11                0
             r12              800
             r13                0
             r14 ffffff00f5cee3f0
             r15 ffffff00f5cee530
          fsbase                0
          gsbase ffffff21f169c000
              ds               4b
              es               4b
              fs                0
              gs              1c3
          trapno                e
             err                0
             rip fffffffff7a11e95
              cs               30
          rflags            10206
             rsp ffffff00f5cee380
              ss               38
          gdt_hi                0
          gdt_lo         700001ef
          idt_hi                0
          idt_lo         40000fff
             ldt                0
            task               70
             cr0         8005003b
             cr2               20
             cr3       206fe00000
             cr4            426f8
>




________________________
Michael Talbott
Systems Administrator
La Jolla Institute

> On Dec 4, 2015, at 7:56 AM, Dan McDonald <danmcd at omniti.com> wrote:
>
>> On Dec 4, 2015, at 10:53 AM, Lawrence Giam <paladinemishakal at gmail.com>
>> wrote:
>>
>> Should I cancel the scrub and try the method that John suggested?
>
> I'd let the scrub run to be sure.  If it's the class of bug I'm thinking,
> though, scrub won't catch it.  :(
>
> And if you can provide one of those r151014 core dumps, that'd be great.
> If this pool has confidential data, though, I can understand why not.
>
> Dan


From waldenvik at gmx.com  Fri Dec  4 20:43:54 2015
From: waldenvik at gmx.com (Martin Waldenvik)
Date: Fri, 4 Dec 2015 21:43:54 +0100
Subject: [OmniOS-discuss] security updates and zones
Message-ID: <etPan.5661fb0a.5b4293c7.96e@pentos.kyriou.net>

Hi

Just a quick question. There was a security update the other day for omnios. What is the correct way to update the zones? Via pkg update in the zone or via zoneadm -z zone attach -u.

Best regards

-- 
Martin
Sent with Airmail

From danmcd at omniti.com  Fri Dec  4 20:56:32 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Fri, 4 Dec 2015 15:56:32 -0500
Subject: [OmniOS-discuss] security updates and zones
In-Reply-To: <etPan.5661fb0a.5b4293c7.96e@pentos.kyriou.net>
References: <etPan.5661fb0a.5b4293c7.96e@pentos.kyriou.net>
Message-ID: <2793292F-933F-431D-AA4C-09D619BF432F@omniti.com>


> On Dec 4, 2015, at 3:43 PM, Martin Waldenvik <waldenvik at gmx.com> wrote:
> 
> Hi
> 
> Just a quick question. There was a security update the other day for omnios. What is the correct way to update the zones? Via pkg update in the zone or via zoneadm -z zone attach -u.

If it's just openssl, you can do pkg update in each zone.

Use "pkg update -nv" to confirm things.  And use "pkg update --no-backup-be" to prevent backup-BEs from being created.

You do know that the "lipkg" zones are linked to global, and update when the global does, right? :)
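
A minimal sketch of the per-zone update described above, assuming non-linked
zones whose names are read from stdin (for example from "zoneadm list", which
also prints "global"). The update_zones function and the RUN dry-run variable
are made up for illustration; with RUN at its default the script only prints
the zlogin commands it would run.

```shell
#!/bin/sh
# RUN is a hypothetical dry-run hook: the default "echo" only prints each
# command; set RUN="" in the environment to really execute them.
RUN=${RUN-echo}

# update_zones: read zone names on stdin and update each non-global zone.
update_zones() {
    while read -r zone; do
        [ -n "$zone" ] || continue
        # The global zone is updated directly with pkg, not through zlogin.
        [ "$zone" = "global" ] && continue
        # Preview first, then update without creating a backup BE.
        # stdin is redirected so zlogin cannot swallow the zone list.
        $RUN zlogin "$zone" pkg update -nv </dev/null
        $RUN zlogin "$zone" pkg update --no-backup-be </dev/null
    done
}
```

Typical use would be "zoneadm list | update_zones"; reviewing the -nv output
for each zone before letting the real update run is the safer order.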

Dan


From henson at acm.org  Sun Dec  6 02:17:39 2015
From: henson at acm.org (Paul B. Henson)
Date: Sat, 05 Dec 2015 18:17:39 -0800
Subject: [OmniOS-discuss] core dump while trying to import pool
In-Reply-To: <CAGP7N4PCESWoS-Et+-N7mpHCa_WwPyZ6=CkPD7Jp+-8YaMfOVQ@mail.gmail.com>
References: <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com>
	<71C5258A-C99E-44DF-BFE1-A1D5EE0CE686@omniti.com>
	<CAGueQCducyk7VNDkVGoJSj6WNfbtng4+E5126zFoXp0nOwRwBQ@mail.gmail.com>
	<589C8043-C3E2-4249-99E8-AA5A35E17892@omniti.com>
	<CAGP7N4PCESWoS-Et+-N7mpHCa_WwPyZ6=CkPD7Jp+-8YaMfOVQ@mail.gmail.com>
Message-ID: <20151206021738.GT3405@bender.unx.cpp.edu>

On Fri, Dec 04, 2015 at 11:33:09AM -0800, Michael Talbott wrote:
> I also came upon this same issue after rebooting one of my OmniOS machines.
> I did have l2arc devices on my pool until the announcement of the bug
> found. At that point I immediately removed my l2arc devices and didn't
> reboot the machine until a convenient time where if something bad were to
> happen I could manage it. Well, it was good I planned for that reboot ;)

Hmm, out of curiosity, did you run a scrub and a zdb analysis of your
pool before you rebooted? I'm in a similar boat, I have a pool which had
L2ARC devices and might have been impacted by the bug. I removed the
devices, ran a scrub and zdb, with no complaints from either, which left
me reasonably hopeful the pool wasn't corrupted 8-/. I still haven't
rebooted it though, there's really no good time for a pool to go belly
up and potentially be unrecoverable :(. I was planning to do it over
Christmas break, but if you scrubbed and zdb'd your pool successfully
before rebooting and it still died that's gonna make me (extra) nervous
<sigh>.

Thanks...

From mtalbott at lji.org  Sun Dec  6 06:49:59 2015
From: mtalbott at lji.org (Michael Talbott)
Date: Sat, 5 Dec 2015 22:49:59 -0800
Subject: [OmniOS-discuss] core dump while trying to import pool
In-Reply-To: <20151206021738.GT3405@bender.unx.cpp.edu>
References: <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com>
	<71C5258A-C99E-44DF-BFE1-A1D5EE0CE686@omniti.com>
	<CAGueQCducyk7VNDkVGoJSj6WNfbtng4+E5126zFoXp0nOwRwBQ@mail.gmail.com>
	<589C8043-C3E2-4249-99E8-AA5A35E17892@omniti.com>
	<CAGP7N4PCESWoS-Et+-N7mpHCa_WwPyZ6=CkPD7Jp+-8YaMfOVQ@mail.gmail.com>
	<20151206021738.GT3405@bender.unx.cpp.edu>
Message-ID: <317E3C4D-1AD5-4A57-95BD-B12624049595@lji.org>

I did not run a zdb check, since this pool was over 200TB and I figured it'd take weeks to finish. Maybe more, maybe not? I just planned for worst-case scenarios before the reboot and am sure glad I did.

The pool was scrubbed several times between the time the l2arc devices were removed and the reboot, and every scrub reported no errors. The problem surfaces (at least in my case) when a particular volume tries to mount as rw. After lots of googling I found a few other reports with the same backtrace whose authors say they were able to work around a similar issue by mounting the volumes read-only first and then, once they were mounted, updating the mount to rw. I didn't try that, but maybe that would have worked? If so, it sounded like that'd only have been a temporary fix until the next reboot...

At any rate, a clean scrub alone is not an indicator of pool health regarding this bug. No clue if a zdb analysis would be a more determining factor.  My personal advice is to plan for the worst and hope for the best, with backups on hand. Better to plan for it than to let a fluke bug or power incident reveal it unexpectedly.

Since I didn't zdb it first, maybe your nerves can be at more ease? Good luck and let me know how things turn out.

Michael
Sent from my iPhone

> On Dec 5, 2015, at 6:17 PM, Paul B. Henson <henson at acm.org> wrote:
> 
>> On Fri, Dec 04, 2015 at 11:33:09AM -0800, Michael Talbott wrote:
>> I also came upon this same issue after rebooting one of my OmniOS machines.
>> I did have l2arc devices on my pool until the announcement of the bug
>> found. At that point I immediately removed my l2arc devices and didn't
>> reboot the machine until a convenient time where if something bad were to
>> happen I could manage it. Well, it was good I planned for that reboot ;)
> 
> Hmm, out of curiosity, did you run a scrub and a zdb analysis of your
> pool before you rebooted? I'm in a similar boat, I have a pool which had
> L2ARC devices and might have been impacted by the bug. I removed the
> devices, ran a scrub and zdb, with no complaints from either, which left
> me reasonably hopeful the pool wasn't corrupted 8-/. I still haven't
> rebooted it though, there's really no good time for a pool to go belly
> up and potentially be unrecoverable :(. I was planning to do it over
> Christmas break, but if you scrubbed and zdb'd your pool successfully
> before rebooting and it still died that's gonna make me (extra) nervous
> <sigh>.
> 
> Thanks...

From jeffpc at josefsipek.net  Sun Dec  6 14:45:14 2015
From: jeffpc at josefsipek.net (Josef 'Jeff' Sipek)
Date: Sun, 6 Dec 2015 09:45:14 -0500
Subject: [OmniOS-discuss] PowerDNS recursor SIGSEGV
Message-ID: <20151206144514.GA1425@meili.valhalla.31bits.net>

I compiled powerdns recursor [1] on 016, but I'm running into an occasional
SIGSEGV.  The SIGSEGV is because of an insufficiently aligned memory operand to an
instruction.  (See the powerdns bug I filed for this [2].) The SIGSEGV actually
happens in the deque code which comes from boost (1.58.0 in this case).

Now, the weird thing... I compiled the same powerdns source with the same
version of boost on OI Hipster and OmniOS 016.  Hipster uses gcc 4.9.3,
OmniOS 016 uses 5.1.  The function that causes the SEGV on 016 looks totally
different between the two distros, so I haven't seen it die on my laptop.

Has anyone seen any strange SIGSEGVs in boost using software?  I hope it isn't
some sort of gcc bug.

Thanks,

Jeff.

P.S. PowerDNS uses {get,set,swap}context, so I haven't ruled out a stack
     alignment bug on their end.

[1] https://www.powerdns.com/
[2] https://github.com/PowerDNS/pdns/issues/3002


OmniOS 016:

_ZNKSt15_Deque_iteratorIcRcPcEmiEi:     pushl  %ebp
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+1:   movl   %esp,%ebp
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+3:   pushl  %ebx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+4:   subl   $0x1c,%esp
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+7:   movl   0xc(%ebp),%eax
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0xa: movl   0x8(%ebp),%ebx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0xd: movdqu (%eax),%xmm0
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x11:movl   0x10(%ebp),%eax
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x14:movaps %xmm0,-0x18(%ebp)
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x18:negl   %eax
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1a:pushl  %eax
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1b:leal   -0x18(%ebp),%eax
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1e:pushl  %eax
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1f:call   -0x94    <_ZNSt15_Deque_iteratorIcRcPcEpLEi>
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x24:movl   (%eax),%edx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x26:addl   $0x10,%esp
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x29:movl   %edx,(%ebx)
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x2b:movl   0x4(%eax),%edx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x2e:movl   %edx,0x4(%ebx)
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x31:movl   0x8(%eax),%edx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x34:movl   0xc(%eax),%eax
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x37:movl   %edx,0x8(%ebx)
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3a:movl   %eax,0xc(%ebx)
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3d:movl   %ebx,%eax
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3f:movl   -0x4(%ebp),%ebx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x42:leave  
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x43:ret    $0x4
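
For reference, the symbol demangles to
std::_Deque_iterator<char, char&, char*>::operator-(int) const, and in the 016
listing the only instruction that requires an aligned memory operand is the
movaps at +0x14 (movdqu accepts any address). movaps needs its memory operand
on a 16-byte boundary, so the store to -0x18(%ebp) is only safe when %ebp is
8 mod 16. A minimal sketch of that arithmetic, assuming the movaps is indeed
the faulting instruction (the post does not say which one it is); the function
names and the sample %ebp values are made up for illustration:

```shell
#!/bin/sh
# is_aligned ADDR [ALIGN]: succeed if ADDR is a multiple of ALIGN (default 16).
is_aligned() {
    [ $(( $1 % ${2:-16} )) -eq 0 ]
}

# movaps_target_ok EBP: succeed if the operand of
# "movaps %xmm0,-0x18(%ebp)" would sit on a 16-byte boundary.
movaps_target_ok() {
    is_aligned $(( $1 - 0x18 ))
}

# Hypothetical frame pointers, not values taken from a real core file:
#   movaps_target_ok 0x8047a38   succeeds (0x8047a20 is 16-byte aligned)
#   movaps_target_ok 0x8047a30   fails    (0x8047a18 is only 8-byte aligned)
```

Feeding the %ebp value from the core file into movaps_target_ok would show
whether the frame was misaligned on entry, which is what a stack set up
through {get,set,swap}context with the wrong alignment would produce.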


OI Hipster:

_ZNKSt15_Deque_iteratorIcRcPcEmiEi:     pushl  %ebp
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+1:   pushl  %edi
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+2:   pushl  %esi
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+3:   pushl  %ebx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+4:   subl   $0x14,%esp
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+7:   movl   0x2c(%esp),%edx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0xb: movl   0x30(%esp),%ebx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0xf: movl   0x28(%esp),%eax
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x13:movl   (%edx),%esi
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x15:movl   0x4(%edx),%ecx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x18:movl   0x8(%edx),%edi
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1b:movl   0xc(%edx),%ebp
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1e:movl   %esi,%edx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x20:subl   %ebx,%esi
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x22:subl   %ecx,%edx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x24:subl   %ebx,%edx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x26:cmpl   $0x1ff,%edx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x2c:movl   %esi,(%esp)
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x2f:jbe    +0x21    <_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x52>
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x31:movl   %edx,%ebx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x33:sarl   $0x9,%ebx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x36:testl  %edx,%edx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x38:jle    +0x56    <_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x90>
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3a:leal   0x0(%ebp,%ebx,4),%ebp
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3e:movl   0x0(%ebp),%ecx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x41:shll   $0x9,%ebx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x44:subl   %ebx,%edx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x46:leal   (%ecx,%edx),%esi
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x49:leal   0x200(%ecx),%edi
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x4f:movl   %esi,(%esp)
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x52:movl   %edi,0x4(%esp)
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x56:movd   (%esp),%xmm0
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x5b:movl   %ecx,(%esp)
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x5e:movd   0x4(%esp),%xmm1
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x64:movl   %ebp,0x4(%esp)
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x68:movd   (%esp),%xmm3
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x6d:punpckldq %xmm3,%xmm0
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x71:movd   0x4(%esp),%xmm2
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x77:punpckldq %xmm2,%xmm1
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x7b:punpcklqdq %xmm1,%xmm0
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x7f:movdqu %xmm0,(%eax)
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x83:addl   $0x14,%esp
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x86:popl   %ebx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x87:popl   %esi
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x88:popl   %edi
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x89:popl   %ebp
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x8a:ret    $0x4
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x8d:leal   0x0(%esi),%esi
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x90:movl   %edx,%ebx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x92:shrl   $0x9,%ebx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x95:orl    $0xff800000,%ebx
_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x9b:jmp    -0x63    <_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3a>

-- 
I'm somewhere between geek and normal.
		- Linus Torvalds

From danmcd at omniti.com  Sun Dec  6 15:26:00 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Sun, 6 Dec 2015 10:26:00 -0500
Subject: [OmniOS-discuss] PowerDNS recursor SIGSEGV
In-Reply-To: <20151206144514.GA1425@meili.valhalla.31bits.net>
References: <20151206144514.GA1425@meili.valhalla.31bits.net>
Message-ID: <B01D935B-39D6-4D60-BA4B-2A34994815C4@omniti.com>

One other weird thing to try -- build powerdns with the Illumos gcc4.  If the gcc5 bug affects powerdns, that'd isolate it.  If gcc5 affects some non Illumos library, gcc4 won't help and you'll still segv.

If gcc4 Illumos can't build it, you could try 014 and its gcc481.

Dan

Sent from my iPhone (typos, autocorrect, and all)

> On Dec 6, 2015, at 9:45 AM, Josef 'Jeff' Sipek <jeffpc at josefsipek.net> wrote:
> 
> I compiled powerdns recursor [1] on 016, but I'm running into an occasional
> SIGSEGV.  The SIGSEGV is because of an insufficiently aligned memory operand to an
> instruction.  (See the powerdns bug I filed for this [2].) The SIGSEGV actually
> happens in the deque code which comes from boost (1.58.0 in this case).
> 
> Now, the weird thing... I compiled the same powerdns source with the same
> version of boost on OI Hipster and OmniOS 016.  Hipster uses gcc 4.9.3,
> OmniOS 016 uses 5.1.  The function that causes the SEGV on 016 looks totally
> different between the two distros, so I haven't seen it die on my laptop.
> 
> Has anyone seen any strange SIGSEGVs in boost using software?  I hope it isn't
> some sort of gcc bug.
> 
> Thanks,
> 
> Jeff.
> 
> P.S. PowerDNS uses {get,set,swap}context, so I haven't ruled out a stack
>     alignment bug on their end.
> 
> [1] https://www.powerdns.com/
> [2] https://github.com/PowerDNS/pdns/issues/3002
> 
> 
> OmniOS 016:
> 
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi:     pushl  %ebp
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+1:   movl   %esp,%ebp
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+3:   pushl  %ebx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+4:   subl   $0x1c,%esp
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+7:   movl   0xc(%ebp),%eax
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0xa: movl   0x8(%ebp),%ebx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0xd: movdqu (%eax),%xmm0
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x11:movl   0x10(%ebp),%eax
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x14:movaps %xmm0,-0x18(%ebp)
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x18:negl   %eax
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1a:pushl  %eax
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1b:leal   -0x18(%ebp),%eax
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1e:pushl  %eax
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1f:call   -0x94    <_ZNSt15_Deque_iteratorIcRcPcEpLEi>
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x24:movl   (%eax),%edx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x26:addl   $0x10,%esp
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x29:movl   %edx,(%ebx)
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x2b:movl   0x4(%eax),%edx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x2e:movl   %edx,0x4(%ebx)
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x31:movl   0x8(%eax),%edx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x34:movl   0xc(%eax),%eax
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x37:movl   %edx,0x8(%ebx)
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3a:movl   %eax,0xc(%ebx)
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3d:movl   %ebx,%eax
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3f:movl   -0x4(%ebp),%ebx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x42:leave  
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x43:ret    $0x4
> 
> 
> OI Hipster:
> 
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi:     pushl  %ebp
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+1:   pushl  %edi
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+2:   pushl  %esi
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+3:   pushl  %ebx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+4:   subl   $0x14,%esp
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+7:   movl   0x2c(%esp),%edx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0xb: movl   0x30(%esp),%ebx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0xf: movl   0x28(%esp),%eax
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x13:movl   (%edx),%esi
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x15:movl   0x4(%edx),%ecx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x18:movl   0x8(%edx),%edi
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1b:movl   0xc(%edx),%ebp
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1e:movl   %esi,%edx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x20:subl   %ebx,%esi
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x22:subl   %ecx,%edx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x24:subl   %ebx,%edx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x26:cmpl   $0x1ff,%edx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x2c:movl   %esi,(%esp)
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x2f:jbe    +0x21    <_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x52>
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x31:movl   %edx,%ebx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x33:sarl   $0x9,%ebx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x36:testl  %edx,%edx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x38:jle    +0x56    <_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x90>
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3a:leal   0x0(%ebp,%ebx,4),%ebp
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3e:movl   0x0(%ebp),%ecx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x41:shll   $0x9,%ebx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x44:subl   %ebx,%edx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x46:leal   (%ecx,%edx),%esi
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x49:leal   0x200(%ecx),%edi
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x4f:movl   %esi,(%esp)
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x52:movl   %edi,0x4(%esp)
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x56:movd   (%esp),%xmm0
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x5b:movl   %ecx,(%esp)
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x5e:movd   0x4(%esp),%xmm1
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x64:movl   %ebp,0x4(%esp)
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x68:movd   (%esp),%xmm3
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x6d:punpckldq %xmm3,%xmm0
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x71:movd   0x4(%esp),%xmm2
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x77:punpckldq %xmm2,%xmm1
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x7b:punpcklqdq %xmm1,%xmm0
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x7f:movdqu %xmm0,(%eax)
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x83:addl   $0x14,%esp
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x86:popl   %ebx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x87:popl   %esi
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x88:popl   %edi
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x89:popl   %ebp
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x8a:ret    $0x4
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x8d:leal   0x0(%esi),%esi
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x90:movl   %edx,%ebx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x92:shrl   $0x9,%ebx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x95:orl    $0xff800000,%ebx
> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x9b:jmp    -0x63    <_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3a>
> 
> -- 
> I'm somewhere between geek and normal.
>        - Linus Torvalds
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

From jeffpc at josefsipek.net  Sun Dec  6 20:40:30 2015
From: jeffpc at josefsipek.net (Josef 'Jeff' Sipek)
Date: Sun, 6 Dec 2015 15:40:30 -0500
Subject: [OmniOS-discuss] PowerDNS recursor SIGSEGV
In-Reply-To: <B01D935B-39D6-4D60-BA4B-2A34994815C4@omniti.com>
References: <20151206144514.GA1425@meili.valhalla.31bits.net>
	<B01D935B-39D6-4D60-BA4B-2A34994815C4@omniti.com>
Message-ID: <20151206204030.GA1360@meili.valhalla.31bits.net>

On Sun, Dec 06, 2015 at 10:26:00AM -0500, Dan McDonald wrote:
> One other weird thing to try -- build powerdns with the Illumos gcc4.  If
> the gcc5 bug affects powerdns, that'd isolate it.  If gcc5 affects some
> non Illumos library, gcc4 won't help and you'll still segv.
> 
> If gcc4 Illumos can't build it,

The powerdns devs use a lot of c++11 which makes 4.4.4 *waaay* too old.
Apparently, 4.8 should be good enough.

> you could try 014 and its gcc481.

Yeah, I'll try that.

Thanks,

Jeff.


-- 
The box said "Windows XP or better required". So I installed Linux.

From jeffpc at josefsipek.net  Sun Dec  6 22:54:26 2015
From: jeffpc at josefsipek.net (Josef 'Jeff' Sipek)
Date: Sun, 6 Dec 2015 17:54:26 -0500
Subject: [OmniOS-discuss] PowerDNS recursor SIGSEGV
In-Reply-To: <20151206204030.GA1360@meili.valhalla.31bits.net>
References: <20151206144514.GA1425@meili.valhalla.31bits.net>
	<B01D935B-39D6-4D60-BA4B-2A34994815C4@omniti.com>
	<20151206204030.GA1360@meili.valhalla.31bits.net>
Message-ID: <20151206225426.GB1360@meili.valhalla.31bits.net>

On Sun, Dec 06, 2015 at 03:40:30PM -0500, Josef 'Jeff' Sipek wrote:
> On Sun, Dec 06, 2015 at 10:26:00AM -0500, Dan McDonald wrote:
> > One other weird thing to try -- build powerdns with the Illumos gcc4.  If
> > the gcc5 bug affects powerdns, that'd isolate it.  If gcc5 affects some
> > non Illumos library, gcc4 won't help and you'll still segv.
> > 
> > If gcc4 Illumos can't build it,
> 
> The powerdns devs use a lot of c++11 which makes 4.4.4 *waaay* too old.
> Apparently, 4.8 should be good enough.
> 
> > you could try 014 and its gcc481.
> 
> Yeah, I'll try that.

Ok.  014 produces the same exact instructions as OI Hipster.  I wonder if
gcc 5 changed some processor default.
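
For reference, the alignment arithmetic behind the movaps at +0x14 in the
016 listing can be sketched as below.  This is only an illustration, and it
rests on an assumption not confirmed in this thread: that the compiler
expects %esp to be 16-byte aligned at the call instruction (GCC's usual
preferred stack boundary).  The names is_aligned16 and movaps_slot and the
example addresses are made up for the sketch.

```cpp
#include <cstdint>

// True when addr is a multiple of 16, which movaps requires of its
// memory operand (movdqu has no such requirement).
constexpr bool is_aligned16(std::uint32_t addr) {
    return (addr & 0xFu) == 0;
}

// Address of the -0x18(%ebp) slot from the OmniOS 016 listing, given the
// value of %esp just before the caller executes `call`.
// Hypothetical helper, for illustration only.
constexpr std::uint32_t movaps_slot(std::uint32_t esp_before_call) {
    return esp_before_call
           - 4      // call pushes the return address
           - 4      // pushl %ebp; movl %esp,%ebp leaves %ebp here
           - 0x18;  // operand of movaps %xmm0,-0x18(%ebp)
}

// Caller stack 16-byte aligned at the call: the slot is aligned.
static_assert(is_aligned16(movaps_slot(0x08047000u)), "aligned caller");
// Caller stack only 4-byte aligned: the slot is misaligned, movaps faults.
static_assert(!is_aligned16(movaps_slot(0x08047004u)), "misaligned caller");
```

Under that assumption, any caller whose stack is only 4-byte aligned at the
call (for example a context stack that was not set up on a 16-byte boundary)
would make the movaps fault, while the movd/movdqu sequence in the Hipster
listing would not care.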

Jeff.

-- 
If I have trouble installing Linux, something is wrong. Very wrong.
		- Linus Torvalds

From danmcd at omniti.com  Sun Dec  6 23:42:46 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Sun, 6 Dec 2015 18:42:46 -0500
Subject: [OmniOS-discuss] PowerDNS recursor SIGSEGV
In-Reply-To: <20151206225426.GB1360@meili.valhalla.31bits.net>
References: <20151206144514.GA1425@meili.valhalla.31bits.net>
	<B01D935B-39D6-4D60-BA4B-2A34994815C4@omniti.com>
	<20151206204030.GA1360@meili.valhalla.31bits.net>
	<20151206225426.GB1360@meili.valhalla.31bits.net>
Message-ID: <0112A91C-03C8-4007-A85F-893E6DBE93EE@omniti.com>

I wonder how the 014-compiled binary performs on 016?  More accurately, I wonder if any gcc-51 compiled libs are off?
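
One quick way to see which library the faulting function belongs to is to demangle the symbol from the listings.  A minimal sketch, assuming a GCC or Clang toolchain that provides abi::__cxa_demangle in <cxxabi.h>; the helper name demangle is made up for the example:

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium C++ ABI
// symbol name, or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    // __cxa_demangle allocates the result with malloc; the caller frees it.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        std::free(out);
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

Calling demangle("_ZNKSt15_Deque_iteratorIcRcPcEmiEi") gives std::_Deque_iterator<char, char&, char*>::operator-(int) const, i.e. a name in namespace std, so the function in the listings is instantiated from the compiler's own C++ library headers.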

Dan

Sent from my iPhone (typos, autocorrect, and all)

> On Dec 6, 2015, at 5:54 PM, Josef 'Jeff' Sipek <jeffpc at josefsipek.net> wrote:
> 
>> On Sun, Dec 06, 2015 at 03:40:30PM -0500, Josef 'Jeff' Sipek wrote:
>>> On Sun, Dec 06, 2015 at 10:26:00AM -0500, Dan McDonald wrote:
>>> One other weird thing to try -- build powerdns with the Illumos gcc4.  If
>>> the gcc5 bug affects powerdns, that'd isolate it.  If gcc5 affects some
>>> non Illumos library, gcc4 won't help and you'll still segv.
>>> 
>>> If gcc4 Illumos can't build it,
>> 
>> The powerdns devs use a lot of c++11 which makes 4.4.4 *waaay* too old.
>> Apparently, 4.8 should be good enough.
>> 
>>> you could try 014 and its gcc481.
>> 
>> Yeah, I'll try that.
> 
> Ok.  014 produces the same exact instructions as OI Hipster.  I wonder if
> gcc 5 changed some processor default.
> 
> Jeff.

From bfriesen at simple.dallas.tx.us  Mon Dec  7 00:19:03 2015
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Sun, 6 Dec 2015 18:19:03 -0600 (CST)
Subject: [OmniOS-discuss] OmniOS r151016 zone has difficulties shutting down
Message-ID: <alpine.GSO.2.01.1512061727590.1673@freddy.simplesystems.org>

On a freshly installed zone with no additional packages installed (but 
with one lofs mount to a filesystem), I am seeing a glitch with 
'zoneadm -z name shutdown', 'zoneadm -z name reboot' or 'reboot' 
within the zone.  These messages appear on the console and in the 
/var/adm/messages file of the global zone:

Dec  6 17:17:22 scrappy zoneadmd[17388]: [ID 702911 daemon.error] 
[zone 'pkgbuild'] failed to open console master: Device busy
Dec  6 17:17:22 scrappy zoneadmd[17388]: [ID 702911 daemon.error] 
[zone 'pkgbuild'] WARNING: could not open master side of zone console 
for pkgbuild to release slave handle: Device busy
Dec  6 17:17:22 scrappy zoneadmd[17388]: [ID 702911 daemon.error] 
[zone 'pkgbuild'] WARNING: console /devices//pseudo/zconsnex@1/zcons@1 
found, but it could not be removed.: I/O error

and the shutdown hangs.  If I then do a zlogin to the console (or have 
already done so) the shutdown immediately completes:

    scrappy:~% pfexec zlogin -C pkgbuild
    [Connected to zone 'pkgbuild' console]

    [NOTICE: Zone halted]

If I attempt to zlogin into the zone while it is being shut down I get 
this message:

zlogin: login allowed only to running zones (pkgbuild is 
'shutting_down').

If I do 'zoneadm -z name reboot', it works fine, although this is 
documented to be the same as 'shutdown' followed by 'boot'.

If I do the reboot on the zone console then the reboot works fine.

This is the second zone that I have installed and the first zone also 
encountered this issue.  The problem went away with the other zone but 
still persists for this new zone.

Have others encountered this issue?  What can be done to fix it?

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From hasslerd at gmx.li  Mon Dec  7 11:20:34 2015
From: hasslerd at gmx.li (Dominik Hassler)
Date: Mon, 7 Dec 2015 12:20:34 +0100
Subject: [OmniOS-discuss] OmniOS r151016 zone has difficulties shutting
 down
In-Reply-To: <alpine.GSO.2.01.1512061727590.1673@freddy.simplesystems.org>
References: <alpine.GSO.2.01.1512061727590.1673@freddy.simplesystems.org>
Message-ID: <trinity-cd05390f-6256-472f-af6f-f0e13f00c473-1449487234057@3capp-gmx-bs30>

Bob,

I can confirm that this happens occasionally on my systems too (all r16 with the latest patches applied). Since it does not happen on every shutdown, and it hits a different zone every time, I have not found a pattern yet. Usually I just halt the zone if the clean shutdown fails. I don't recall when this first occurred, but it might have been after upgrading to r16.
 


From danmcd at omniti.com  Mon Dec  7 12:49:47 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 7 Dec 2015 07:49:47 -0500
Subject: [OmniOS-discuss] OmniOS r151016 zone has difficulties shutting
	down
In-Reply-To: <alpine.GSO.2.01.1512061727590.1673@freddy.simplesystems.org>
References: <alpine.GSO.2.01.1512061727590.1673@freddy.simplesystems.org>
Message-ID: <536501D2-EA96-4F6B-8CB2-39A0F9698267@omniti.com>


> On Dec 6, 2015, at 7:19 PM, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:
> 
> On a freshly installed zone with no additional packages installed (but with one lofs mount to a filesystem), I am seeing a glitch with 'zoneadm -z name shutdown', 'zoneadm -z name reboot' or 'reboot' within the zone.  This message appears on the console and in the /var/adm/messages file of the global zone:
> 
> Dec  6 17:17:22 scrappy zoneadmd[17388]: [ID 702911 daemon.error] [zone 'pkgbuild'] failed to open console master: Device busy
> Dec  6 17:17:22 scrappy zoneadmd[17388]: [ID 702911 daemon.error] [zone 'pkgbuild'] WARNING: could not open master side of zone console for pkgbuild to release slave handle: Device busy
> Dec  6 17:17:22 scrappy zoneadmd[17388]: [ID 702911 daemon.error] [zone 'pkgbuild'] WARNING: console /devices//pseudo/zconsnex at 1/zcons at 1 found, but it could not be removed.: I/O error
> 
> and the shutdown hangs.

I just tried a couple of shutdown/boot loops on a 016 zone of mine.  I did not see the hang, but these errors were in my global:

Dec  7 07:34:45 neuromancer zoneadmd[29496]: [ID 702911 daemon.error] [zone 'minecraft'] WARNING: console /devices//pseudo/zconsnex at 1/zcons at 1 found, but it could not be removed.: I/O error

So I'm guessing the failed-to-open-console-master and "could not open master side of zone console" errors are what caused your failure.

The other message (found, but it could not be removed), I only see when there is no "zlogin -C" process attached to my zone's console.

> Have others encountered this issue?  What can be done to fix it?

This message is printed by zoneadmd.  If you or anyone else encounters this hang again, please do the following:

1.) While zoneadm is hung, check the console for the above message; you'll see a PID for zoneadmd (Bob's example was 17388).

2.) See if you can get the stack(s) of zoneadmd that reported the console master error:    pstack <PID>

3.) Grab a corefile of the zoneadmd:  gcore <PID>

4.) Share the corefile somehow.

The pstack and core of the running/hung zoneadm(1M) command would also be useful, I think.
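
The collection steps above can be sketched as a small shell script. This is only a sketch under assumptions: the helper names (zoneadmd_pid_from_log, collect_zoneadmd_state) and the /var/tmp output location are made up for illustration; pstack and gcore are the standard proc tools named in the steps.

```shell
#!/bin/sh
# Sketch: collect the stack and a core of the zoneadmd that logged the
# console-master error. Helper names and paths here are hypothetical.

# Print the PID from a syslog line such as
#   "... scrappy zoneadmd[17388]: [ID 702911 daemon.error] ..."
zoneadmd_pid_from_log() {
    printf '%s\n' "$1" | sed -n 's/.*zoneadmd\[\([0-9][0-9]*\)].*/\1/p'
}

# Save pstack output and a core file for one PID into a directory.
# "gcore -o PREFIX" writes the core to PREFIX.<pid>.
collect_zoneadmd_state() {
    pid=$1
    outdir=${2:-/var/tmp}
    pstack "$pid" > "$outdir/zoneadmd.$pid.pstack" &&
        gcore -o "$outdir/zoneadmd-core" "$pid"
}

# Example (as root in the global zone, while the shutdown is hung):
#   line=$(grep 'failed to open console master' /var/adm/messages | tail -1)
#   collect_zoneadmd_state "$(zoneadmd_pid_from_log "$line")"
```

The same collect step can be pointed at the PID of the hung zoneadm(1M) command itself.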

Thanks,
Dan


From davide.poletto at gmail.com  Mon Dec  7 13:13:17 2015
From: davide.poletto at gmail.com (Davide Poletto)
Date: Mon, 7 Dec 2015 14:13:17 +0100
Subject: [OmniOS-discuss] illumos and contributions metrics: how to evaluate
 companies that commercialize illumos based products by examining them in
 the light of their illumos community's contributions.
Message-ID: <CANKMAMYBGfr44MaH+t9=xBwzEH9f4-gn=ODCvCtndxMo0PqNXw@mail.gmail.com>

Hi all,

maybe I'm a bit of a fool to ask this kind of general question here...yes
- I know - the illumos users' mailing list is probably the proper place to
ask what I'm trying to explore...but I feel comfortable raising my
doubts here first (see below for why).

At first sight this looks definitely off topic with regard to OmniOS and -
also - not relevant to OmniOS (and OmniTI) itself...so first of
all...really *pardon me* for jumping in with this kind of generic doubt
*but*, at the same time, I hope that, among others, Dan McDonald will read
this and give me (and us) his opinion on what I'm going to ask...since I
recently read - and this is the point that, for me, legitimates the
discussion here - *his* interesting presentation "2015 illumos Day" (I
found it at http://kebe.com/~danmcd/illumos-day-2015.pdf).

All my interest started once I had read it and, particularly, when I started
to think about the relevance of two slides: the "Non-Upstreamed Technical
Changes" and "Bad Reasons for Not Upstreaming" slides captured my curiosity
exactly while I was in the process of evaluating a NAS/SAN appliance
intensively developed by a relatively young European company. The appliance
I'm referring to, although the company's marketing avoids the
expected terms such as "illumos" or "ZFS", is clearly illumos based and
uses, among other value-added proprietary technologies, the illumos kernel and
ZFS as foundations for all its other high-level added features/services.

This is quite normal, nothing new here you would say (I'd add that it's sad to
see that particular type of marketing approach in use: apparently hiding the
evidence of your roots not because it is evident enough but because it
isn't a topic that helps to sell...this, at least, is my perception).

The statement "Even if for a limited time, elapsed time increases
upstreaming difficulty" on the second slide cited above struck my imagination:
so I started to look at some illumos forked repositories (often those
companies have one on GitHub, to cite only the illumos part forgetting
other illumos related projects they may have forked) and the only evident
fact I was able to note immediately is a probable relationship with the
"This branch is n commits behind illumos:master." GitHub assertion...where
the number "n" may (or may not) be an index of how much the company's
project (illumos in this case) has diverged since its initial
fork...leaving me with the impression that all possible related (bad/good)
consequences are going to have a real (bad/good) impact on the future of
the product/project especially if I want to find a relationship between
those possible consequences and what is going to happen on the master
branch (think about how fast things are changing when speaking about ZFS or
the illumos kernel development).

Maybe that only parameter (the n commits behind) is not enough to form a
valid opinion and start to speculate: "The company X develops, produces,
markets, sells and supports illumos based products but, looking at how much
behind their illumos fork is with respect to the illumos master branch,
that is not enough...what about their level of contribution to the
community? how good will their product/support/development then be if they
tend to diverge from the community?" and so on with similar questions.

I've also read the interesting "Illumos Productivity and Bus Factor"
illumetrics blog entry (available at
https://illumetrics.wordpress.com/2015/01/28/illumos-productivity-and-bus-factor/)
but I didn't find a way to easily understand - as a user - if a company is
acting well in terms of commits done and why it is (or is not) doing
so...or to easily understand if its "public market image" finds a weighted
counterpart in its community image (through the contributions it could give
back to the entire illumos community).

This approach could also be extended/applied to institutions, I mean
not only to commercial companies seen as special or particular entities
(remembering that committers are individuals who, mostly, work for
companies or for institutions)...but I'm now focused on companies that
sell illumos based technology because they also create profits through
the essential software components they use as the foundation of their products.

Illumetrics released a framework for calculating statistics on illumos
related repositories and data sources (see it here:
https://github.com/nickziv/illumetrics) but, as they stated, it is far from
complete (it seems to consider only contributions made by known names that
refer to already well known companies, without also counting
young/emerging ones). That's not illumetrics' fault, that's
clear...simply, the "data cluster" is still too small to infer anything general
about all publicly forked illumos projects.

So, after this long preamble, here is my legitimate question: is there a way
to easily evaluate how good (and in which way) a commercial company - which
naturally attracts system administrators' attention with their products
(once and especially because those administrators realize that those
products are illumos based) - is in "giving back" (if it does) to the
illumos community (or to related communities) when that exact company
develops, produces, markets and sells appliances by - at best -
technically/commercially hiding (or by tending to hide or tending to not
sufficiently promote with the necessary transparency) the fact that their
products are essentially based and developed on an illumos fork?

What I'm asking here are not names but metrics or, eventually, metrics'
results...to help me form a partial but reasonable opinion.

Is there a way to rank/evaluate and so reward/honour (by, as example,
purchasing their products or by sustaining their development as
testers/free-time contributors) those {individuals, companies,
institutions} that clearly demonstrate not only to have good numbers
(commits) but also that they care about the community and that are more
transparent than others in advertising their commercial offer's origin?

Kind regards, Davide.

From danmcd at omniti.com  Mon Dec  7 13:44:38 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 7 Dec 2015 08:44:38 -0500
Subject: [OmniOS-discuss] illumos and contributions metrics: how to
	evaluate companies that commercialize illumos based products
	by examining them in the light of their illumos community's
	contributions.
In-Reply-To: <CANKMAMYBGfr44MaH+t9=xBwzEH9f4-gn=ODCvCtndxMo0PqNXw@mail.gmail.com>
References: <CANKMAMYBGfr44MaH+t9=xBwzEH9f4-gn=ODCvCtndxMo0PqNXw@mail.gmail.com>
Message-ID: <DD34332E-D02A-434E-976A-26840ABDD96B@omniti.com>


> On Dec 7, 2015, at 8:13 AM, Davide Poletto <davide.poletto at gmail.com> wrote:
> 
> Is there a way to rank/evaluate and so reward/honour (by, as example, purchasing their products or by sustaining their development as testers/free-time contributors) those {individuals, companies, institutions} that clearly demonstrate not only to have good numbers (commits) but also that they care about the community and that are more transparent than others in advertising their commercial offer's origin?

That's a damned good question.  It's also very tricky.

Some firms keep things closed until they've released, or until some time after they've released.  Some find this fair enough, others find it annoying.  Because people are different, it may be hard to get a consensus on how to rank/evaluate firms the way you wish.  BTW, I lean toward "fair enough" so long as there's consistency and not going back on one's word.
Keeping to one's word is important to me.  I didn't leave Oracle because of the Solaris-closing: if you read the text of that leaked email, it implied a source-dump-on-release model. Only after I left Oracle did it become clear that it was all a big lie.

You're chasing a hard problem.  You may not get much sympathy.  Making things MORE complicated is that "illumos" as a brand is still tightly tied up by its owner. Many feel that it's tied up too tightly, and that is why you rarely see "illumos" mentioned in marketing materials, especially not the trademarked symbol.

I'm sorry I don't have better answers for you right now.  It's a hard problem, and many of us who might be able to help clarify things are trying to keep all of the machinery moving as smoothly as we can.

Dan



From jeffpc at josefsipek.net  Mon Dec  7 15:07:17 2015
From: jeffpc at josefsipek.net (Josef 'Jeff' Sipek)
Date: Mon, 7 Dec 2015 10:07:17 -0500
Subject: [OmniOS-discuss] PowerDNS recursor SIGSEGV
In-Reply-To: <0112A91C-03C8-4007-A85F-893E6DBE93EE@omniti.com>
References: <20151206144514.GA1425@meili.valhalla.31bits.net>
	<B01D935B-39D6-4D60-BA4B-2A34994815C4@omniti.com>
	<20151206204030.GA1360@meili.valhalla.31bits.net>
	<20151206225426.GB1360@meili.valhalla.31bits.net>
	<0112A91C-03C8-4007-A85F-893E6DBE93EE@omniti.com>
Message-ID: <20151207150717.GD1359@meili.valhalla.31bits.net>

On Sun, Dec 06, 2015 at 06:42:46PM -0500, Dan McDonald wrote:
> I wonder how the 014-compiled binary performs on 016?  More accurately, I
> wonder if any gcc-51 compiled libs are off?

I'll try it out, but I expect it to work just fine - or die for a totally
different reason.  This is because the SIGSEGV is caused by this instruction:

_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x14:movaps %xmm0,-0x18(%ebp)

The _ZNKSt15_Deque_iteratorIcRcPcEmiEi function comes from a boost header and
it ends up in the pdns_recursor executable itself.  The executable is pretty
boring as far as libs are concerned:

# ldd /usr/sbin/pdns_recursor 
	libresolv.so.2 =>	 /lib/libresolv.so.2
	libsocket.so.1 =>	 /lib/libsocket.so.1
	libnsl.so.1 =>	 /lib/libnsl.so.1
	libstdc++.so.6 =>	 /usr/lib/libstdc++.so.6
	libm.so.2 =>	 /lib/libm.so.2
	librt.so.1 =>	 /lib/librt.so.1
	libgcc_s.so.1 =>	 /usr/lib/libgcc_s.so.1
	libpthread.so.1 =>	 /lib/libpthread.so.1
	libc.so.1 =>	 /lib/libc.so.1
	libmd.so.1 =>	 /lib/libmd.so.1
	libmp.so.2 =>	 /lib/libmp.so.2

gcc 4.8/4.9 compiled powerdns doesn't use this instruction at all.  (The SEGV
is because the memory operand is 8-byte aligned instead of the required 16-byte
alignment.  This causes #gp which turns into a SIGSEGV via the normal trap code
in the kernel.)
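
To make the alignment argument concrete, here is a small arithmetic sketch in shell. It rests on an assumption not stated above: that the gcc 5 code expects %esp to be 16-byte aligned at the call instruction. The function name and the stack addresses are made up for illustration.

```shell
#!/bin/sh
# Sketch: where does -0x18(%ebp) land for a given %esp at the call site?
# Prologue on 016: "call" pushes the return address (4 bytes), then
# "pushl %ebp" pushes 4 more, then "movl %esp,%ebp" copies %esp.

# Print the low 4 bits of the movaps target; 0 means 16-byte aligned.
movaps_target_misalignment() {
    ebp=$(( $1 - 4 - 4 ))        # %ebp after the prologue
    target=$(( ebp - 0x18 ))     # address used by movaps %xmm0,-0x18(%ebp)
    echo $(( target & 0xf ))
}

movaps_target_misalignment 0x8047ff0   # caller kept %esp 16-byte aligned
movaps_target_misalignment 0x8047ff8   # caller only kept it 8-byte aligned
```

The first call prints 0 (aligned) and the second prints 8, which is the "8-byte aligned instead of 16-byte" case that makes movaps fault.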

Jeff.

> Dan
> 
> Sent from my iPhone (typos, autocorrect, and all)
> 
> > On Dec 6, 2015, at 5:54 PM, Josef 'Jeff' Sipek <jeffpc at josefsipek.net> wrote:
> > 
> >> On Sun, Dec 06, 2015 at 03:40:30PM -0500, Josef 'Jeff' Sipek wrote:
> >>> On Sun, Dec 06, 2015 at 10:26:00AM -0500, Dan McDonald wrote:
> >>> One other weird thing to try -- build powerdns with the Illumos gcc4.  If
> >>> the gcc5 bug affects powerdns, that'd isolate it.  If gcc5 affects some
> >>> non Illumos library, gcc4 won't help and you'll still segv.
> >>> 
> >>> If gcc4 Illumos can't build it,
> >> 
> >> The powerdns devs use a lot of c++11 which makes 4.4.4 *waaay* too old.
> >> Apparently, 4.8 should be good enough.
> >> 
> >>> you could try 014 and its gcc481.
> >> 
> >> Yeah, I'll try that.
> > 
> > Ok.  014 produces the same exact instructions as OI Hipster.  I wonder if
> > gcc 5 changed some processor default.
> > 
> > Jeff.
> > 
> >> Thanks,
> >> 
> >> Jeff.
> >> 
> >>> 
> >>> Dan
> >>> 
> >>> Sent from my iPhone (typos, autocorrect, and all)
> >>> 
> >>>> On Dec 6, 2015, at 9:45 AM, Josef 'Jeff' Sipek <jeffpc at josefsipek.net> wrote:
> >>>> 
> >>>> I compiled powerdns recursor [1] on 016, but I'm running into an occasional
> >>>> SIGSEGV.  The SIGSEGV is because of insufficiently aligned memory operand to an
> >>>> instruction.  (See the powerdns bug I filed for this [2].) The SIGSEGV actually
> >>>> happens in the deque code which comes from boost (1.58.0 in this case).
> >>>> 
> >>>> Now, the weird thing... I compiled the same powerdns source with the same
> >>>> version of boost on OI Hipster and OmniOS 016.  Hipster uses gcc 4.9.3,
> >>>> OmniOS 016 uses 5.1.  The function that causes the SEGV on 016 looks totally
> >>>> different between the two distros so I haven't see it die on my laptop.
> >>>> 
> >>>> Has anyone seen any strange SIGSEGVs in boost using software?  I hope it isn't
> >>>> some sort of gcc bug.
> >>>> 
> >>>> Thanks,
> >>>> 
> >>>> Jeff.
> >>>> 
> >>>> P.S. PowerDNS uses {get,set,swap}context, so I haven't ruled out a stack
> >>>>    alignment bug on their end.
> >>>> 
> >>>> [1] https://www.powerdns.com/
> >>>> [2] https://github.com/PowerDNS/pdns/issues/3002
> >>>> 
> >>>> 
> >>>> OmniOS 016:
> >>>> 
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi:     pushl  %ebp
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+1:   movl   %esp,%ebp
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+3:   pushl  %ebx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+4:   subl   $0x1c,%esp
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+7:   movl   0xc(%ebp),%eax
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0xa: movl   0x8(%ebp),%ebx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0xd: movdqu (%eax),%xmm0
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x11:movl   0x10(%ebp),%eax
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x14:movaps %xmm0,-0x18(%ebp)
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x18:negl   %eax
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1a:pushl  %eax
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1b:leal   -0x18(%ebp),%eax
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1e:pushl  %eax
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1f:call   -0x94    <_ZNSt15_Deque_iteratorIcRcPcEpLEi>
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x24:movl   (%eax),%edx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x26:addl   $0x10,%esp
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x29:movl   %edx,(%ebx)
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x2b:movl   0x4(%eax),%edx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x2e:movl   %edx,0x4(%ebx)
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x31:movl   0x8(%eax),%edx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x34:movl   0xc(%eax),%eax
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x37:movl   %edx,0x8(%ebx)
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3a:movl   %eax,0xc(%ebx)
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3d:movl   %ebx,%eax
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3f:movl   -0x4(%ebp),%ebx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x42:leave  
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x43:ret    $0x4
> >>>> 
> >>>> 
> >>>> OI Hipster:
> >>>> 
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi:     pushl  %ebp
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+1:   pushl  %edi
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+2:   pushl  %esi
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+3:   pushl  %ebx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+4:   subl   $0x14,%esp
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+7:   movl   0x2c(%esp),%edx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0xb: movl   0x30(%esp),%ebx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0xf: movl   0x28(%esp),%eax
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x13:movl   (%edx),%esi
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x15:movl   0x4(%edx),%ecx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x18:movl   0x8(%edx),%edi
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1b:movl   0xc(%edx),%ebp
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x1e:movl   %esi,%edx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x20:subl   %ebx,%esi
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x22:subl   %ecx,%edx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x24:subl   %ebx,%edx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x26:cmpl   $0x1ff,%edx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x2c:movl   %esi,(%esp)
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x2f:jbe    +0x21    <_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x52>
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x31:movl   %edx,%ebx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x33:sarl   $0x9,%ebx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x36:testl  %edx,%edx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x38:jle    +0x56    <_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x90>
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3a:leal   0x0(%ebp,%ebx,4),%ebp
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3e:movl   0x0(%ebp),%ecx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x41:shll   $0x9,%ebx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x44:subl   %ebx,%edx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x46:leal   (%ecx,%edx),%esi
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x49:leal   0x200(%ecx),%edi
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x4f:movl   %esi,(%esp)
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x52:movl   %edi,0x4(%esp)
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x56:movd   (%esp),%xmm0
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x5b:movl   %ecx,(%esp)
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x5e:movd   0x4(%esp),%xmm1
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x64:movl   %ebp,0x4(%esp)
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x68:movd   (%esp),%xmm3
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x6d:punpckldq %xmm3,%xmm0
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x71:movd   0x4(%esp),%xmm2
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x77:punpckldq %xmm2,%xmm1
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x7b:punpcklqdq %xmm1,%xmm0
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x7f:movdqu %xmm0,(%eax)
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x83:addl   $0x14,%esp
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x86:popl   %ebx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x87:popl   %esi
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x88:popl   %edi
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x89:popl   %ebp
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x8a:ret    $0x4
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x8d:leal   0x0(%esi),%esi
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x90:movl   %edx,%ebx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x92:shrl   $0x9,%ebx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x95:orl    $0xff800000,%ebx
> >>>> _ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x9b:jmp    -0x63    <_ZNKSt15_Deque_iteratorIcRcPcEmiEi+0x3a>
> >>>> 
> >>>> -- 
> >>>> I'm somewhere between geek and normal.
> >>>>       - Linus Torvalds
> >> 
> >> -- 
> >> The box said "Windows XP or better required". So I installed Linux.
> > 
> > -- 
> > If I have trouble installing Linux, something is wrong. Very wrong.
> >        - Linus Torvalds

-- 
Humans were created by water to transport it upward.

From lkateley at kateley.com  Mon Dec  7 16:13:02 2015
From: lkateley at kateley.com (Linda Kateley)
Date: Mon, 7 Dec 2015 10:13:02 -0600
Subject: [OmniOS-discuss] illumos and contributions metrics: how to
 evaluate companies that commercialize illumos based products by examining
 them in the light of their illumos community's contributions.
In-Reply-To: <DD34332E-D02A-434E-976A-26840ABDD96B@omniti.com>
References: <CANKMAMYBGfr44MaH+t9=xBwzEH9f4-gn=ODCvCtndxMo0PqNXw@mail.gmail.com>
	<DD34332E-D02A-434E-976A-26840ABDD96B@omniti.com>
Message-ID: <5665B00E.70307@kateley.com>

Blackduck does this for you.

https://www.openhub.net/p?ref=homepage&query=illumos

On 12/7/15 7:44 AM, Dan McDonald wrote:
>> On Dec 7, 2015, at 8:13 AM, Davide Poletto <davide.poletto at gmail.com> wrote:
>>
>> Is there a way to rank/evaluate and so reward/honour (by, as example, purchasing their products or by sustaining their development as testers/free-time contributors) those {individuals, companies, institutions} that clearly demonstrate not only to have good numbers (commits) but also that they care about the community and that are more transparent than others in advertising their commercial offer's origin?
> That's a damned good question.  It's also very tricky.
>
> Some firms keep things closed until they've released, or after some time after they've released.  Some find this fair enough, others find it annoying.  Because people are different, it may be hard to get a consensus on how to rank/evaluate firms the way you wish.  BTW, I lean toward "fair enough" so long as there's consistency and not going back on one's word.
> Keeping to one's word is important to me.  I didn't leave Oracle because of the Solaris-closing: if you read the text of that leaked email, it implied a source-dump-on-release model. Only after I left Oracle did it become clear that it was all a big lie.
>
> You're chasing a hard problem.  You may not get much sympathy.  Making things MORE complicated is that "illumos" as a brand is still tightly tied up by its owner. Many feel that it's tied up too tightly, and that is why you rarely see "illumos" mentioned in marketing materials, especially not the trademarked symbol.
>
> I'm sorry I don't have better answers for you right now.  It's a hard problem, and many of us who might be able to help clarify things are trying to keep all of the machinery moving as smoothly as we can.
>
> Dan
>
>


From doug at will.to  Tue Dec  8 11:26:42 2015
From: doug at will.to (Doug Hughes)
Date: Tue, 8 Dec 2015 16:56:42 +0530
Subject: [OmniOS-discuss] Sol 11 zone on OmniOS system
Message-ID: <CAOpmc6wESQVaNbm-9wWDpnVXrbCPziwCZu_8CN=zWbO3+0C3uA@mail.gmail.com>

Has anybody done this or is anybody doing this? Any tips/tricks? I want to
migrate a working Sol11 zone to an OmniOS system, if possible. I figure
I'll have to use a special bit of 'branding' to make it go.

From danmcd at omniti.com  Tue Dec  8 12:23:30 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 8 Dec 2015 07:23:30 -0500
Subject: [OmniOS-discuss] Sol 11 zone on OmniOS system
In-Reply-To: <CAOpmc6wESQVaNbm-9wWDpnVXrbCPziwCZu_8CN=zWbO3+0C3uA@mail.gmail.com>
References: <CAOpmc6wESQVaNbm-9wWDpnVXrbCPziwCZu_8CN=zWbO3+0C3uA@mail.gmail.com>
Message-ID: <7EE9651A-2AF3-436A-8E1B-1D692E00BF12@omniti.com>


> On Dec 8, 2015, at 6:26 AM, Doug Hughes <doug at will.to> wrote:
> 
> Has anybody done this or is anybody doing this? Any tips/tricks? I want to migrate a working Sol11 zone to an OmniOS system, if possible. I figure I'll have to use a special bit of 'branding' to make it go.

It's going to be worse.  S11 and illumos diverged going back to its beginnings.

You're going to want to treat this more like a platform-to-platform move.  I don't think there is much in the way of tricks to help you, CERTAINLY not with zone branding.

Sorry,
Dan


From chip at innovates.com  Tue Dec  8 16:16:42 2015
From: chip at innovates.com (Schweiss, Chip)
Date: Tue, 8 Dec 2015 10:16:42 -0600
Subject: [OmniOS-discuss] NFS Server restart
Message-ID: <CALeZrrTtffmWa-CREK41XzYctmZ_d1Pdsj-FYW-GFtsK_6ZZTw@mail.gmail.com>

I had an NFS server become unresponsive on one of my production systems.
The NFS server service would not restart, so out of desperation I rebooted,
which fixed the problem.

Before the reboot I tried restarting all NFS-related services, to no avail.
The reboot probably wasn't necessary, but knowing the correct list and order of
services to restart is.

Can someone fill me in on which services in what order should be
stopped/started to get NFS fully reset?

Thanks!
-Chip

From danmcd at omniti.com  Tue Dec  8 17:47:35 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 8 Dec 2015 12:47:35 -0500
Subject: [OmniOS-discuss] NFS Server restart
In-Reply-To: <CALeZrrTtffmWa-CREK41XzYctmZ_d1Pdsj-FYW-GFtsK_6ZZTw@mail.gmail.com>
References: <CALeZrrTtffmWa-CREK41XzYctmZ_d1Pdsj-FYW-GFtsK_6ZZTw@mail.gmail.com>
Message-ID: <8408B53E-87CA-44BB-9A6A-F767FB5DCC62@omniti.com>


> On Dec 8, 2015, at 11:16 AM, Schweiss, Chip <chip at innovates.com> wrote:
> 
> Can someone fill me in on which services in what order should be stopped/started to get NFS fully reset?

You can start here...

shell(~)[0]% svcs -d nfs/server
STATE          STIME    FMRI
disabled       Nov_16   svc:/network/rpc/keyserv:default
online         Nov_16   svc:/milestone/network:default
online         Nov_16   svc:/network/rpc/bind:default
online         Nov_16   svc:/system/filesystem/local:default
online         Nov_16   svc:/network/shares/group:default
online         Nov_16   svc:/network/shares/group:smb
online         Nov_16   svc:/system/filesystem/reparse:default
online         Nov_16   svc:/network/nfs/nlockmgr:default
online         Nov_16   svc:/network/nfs/mapid:default
online         Nov_16   svc:/network/rpc/gss:default
online         Nov_16   svc:/network/shares/group:zfs
shell(~)[0]% 

and further descend the rabbit hole if you need, but most of the intermediate NFS services all depend on rpc/bind.
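
One way to turn that dependency listing into a restart plan is sketched below. It is a dry run that only prints svcadm commands. The function names are hypothetical, and limiting the list to online network/rpc and network/nfs services, with rpc/bind first, is an assumption rather than a documented procedure.

```shell
#!/bin/sh
# Sketch: build a restart plan from "svcs -d nfs/server" output on stdin.

# Print FMRIs of online services under network/rpc and network/nfs,
# with rpc/bind first because the other NFS services depend on it.
nfs_restart_order() {
    awk '$1 == "online" && $3 ~ /^svc:\/network\/(rpc|nfs)\// { print $3 }' |
        awk '/rpc\/bind/ { print; next } { rest = rest $0 "\n" }
             END { printf "%s", rest }'
}

# Print (do not run) the commands: dependencies first, then the server.
nfs_restart_plan() {
    nfs_restart_order | while read -r fmri; do
        echo "svcadm restart $fmri"
    done
    echo "svcadm restart svc:/network/nfs/server:default"
}

# Example: svcs -d nfs/server | nfs_restart_plan
```

Reviewing the printed plan before running anything keeps services such as milestone/network and filesystem/local out of the restart on a production box.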

Dan


From danmcd at omniti.com  Tue Dec  8 20:09:58 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 8 Dec 2015 15:09:58 -0500
Subject: [OmniOS-discuss] Attention OmniOS AMI users
References: <8EC8ABD5-3B59-407F-8173-899493192D15@omniti.com>
Message-ID: <A9994DE7-AC60-43C7-9076-5DAF600B4402@omniti.com>

If you are using any OmniOS AMI r151012 or earlier, please read this.  If you're using r151014, you may ignore this message.

It has come to our attention that some of the older OmniOS images, including images for r151006 and r151012, may have stored SSH host keys included with them, which could be used to execute a man-in-the-middle attack.

If you are currently running one of these older versions, we suggest you verify and regenerate your keys, and/or move to a current OmniOS AMI.

For r151006 users, there is a new image named "OmniOS r151006 LTS" which should be available in your region.  We recommend that users of r151012 (and any other older versions which are now ESOL) move to a current r151014 AMI.

Again, the OmniOS r151014 AMIs DO NOT HAVE stored SSH host keys and are *NOT* vulnerable.
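
For anyone who wants to verify and regenerate keys by hand, a minimal sketch follows. The function names are hypothetical, /etc/ssh as the key directory is an assumption, and only the RSA key is shown; other key types present on the image would need the same treatment.

```shell
#!/bin/sh
# Sketch: inspect and regenerate an RSA host key. Function names are
# hypothetical; the key directory defaults to /etc/ssh (an assumption).

# Show the fingerprint of every public host key found in a directory.
show_host_key_fingerprints() {
    dir=${1:-/etc/ssh}
    for pub in "$dir"/ssh_host_*_key.pub; do
        [ -f "$pub" ] && ssh-keygen -l -f "$pub"
    done
    return 0
}

# Replace the RSA host key pair in a directory with a fresh one
# (empty passphrase, as host keys require).
regenerate_rsa_host_key() {
    dir=${1:-/etc/ssh}
    rm -f "$dir/ssh_host_rsa_key" "$dir/ssh_host_rsa_key.pub"
    ssh-keygen -q -t rsa -N '' -f "$dir/ssh_host_rsa_key"
}

# Example (as root), followed by a restart of the ssh service:
#   show_host_key_fingerprints
#   regenerate_rsa_host_key
#   svcadm restart svc:/network/ssh:default
```

Clients that already trusted the old host key will warn on the next connection and need their known_hosts entry refreshed.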

Thanks and sorry for any inconvenience,
Dan

p.s. This is also on the AWS forums:  https://forums.aws.amazon.com/thread.jspa?threadID=221330

From henson at acm.org  Wed Dec  9 02:31:45 2015
From: henson at acm.org (Paul B. Henson)
Date: Tue, 08 Dec 2015 18:31:45 -0800
Subject: [OmniOS-discuss] core dump while trying to import pool
In-Reply-To: <317E3C4D-1AD5-4A57-95BD-B12624049595@lji.org>
References: <CAGueQCfmTuFoch_4xPhy3sWM3KCT5dJB8TOVSmq=Lt_BR_jO1A@mail.gmail.com>
	<71C5258A-C99E-44DF-BFE1-A1D5EE0CE686@omniti.com>
	<CAGueQCducyk7VNDkVGoJSj6WNfbtng4+E5126zFoXp0nOwRwBQ@mail.gmail.com>
	<589C8043-C3E2-4249-99E8-AA5A35E17892@omniti.com>
	<CAGP7N4PCESWoS-Et+-N7mpHCa_WwPyZ6=CkPD7Jp+-8YaMfOVQ@mail.gmail.com>
	<20151206021738.GT3405@bender.unx.cpp.edu>
	<317E3C4D-1AD5-4A57-95BD-B12624049595@lji.org>
Message-ID: <0d7e01d13229$bfa00bc0$3ee02340$@acm.org>

> From: Michael Talbott
> Sent: Saturday, December 05, 2015 10:50 PM
> 
> I did not run a zdb check since this pool was over 200TB and figured it'd
> take weeks to finish.

Ah, mine is only 22TB available with a bit over 10TB in use; it took about
five hours as I recall.

> At any rate, a clean scrub alone is not an indicator of pool health
> regarding this bug. No clue if a zdb analyses would be a more determining
> factor.

My understanding of the particular zdb invocation I used is that it scans
every block of data and metadata and verifies its checksum, so I think it
should have found potential corruption, even if scrub did not.

> Since I didn't zdb it first.. Maybe your nerves can be at more ease? Good
> luck and let me know how things turn out.

Perhaps not more at ease, but at least not less at ease :). Thanks much for
the info.


From tobi at oetiker.ch  Wed Dec  9 07:05:30 2015
From: tobi at oetiker.ch (Tobias Oetiker)
Date: Wed, 9 Dec 2015 08:05:30 +0100 (CET)
Subject: [OmniOS-discuss] considering an SSD pool ... which SSD
Message-ID: <alpine.DEB.2.20.1512090802040.23336@engelberg>

We are looking into the possibility of setting up our first SSD
based pool ... any recommendations for SSDs to use ?

Our System Integrator recommends the use of Intel SSDs as opposed
to Samsung since Samsung would be changing their lineup every few
weeks and thus make it difficult to source replacement disks, in
case one should fail.

any thoughts ?

cheers
tobi





-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
www.oetiker.ch tobi at oetiker.ch +41 62 775 9902


From alka at hfg-gmuend.de  Wed Dec  9 08:10:58 2015
From: alka at hfg-gmuend.de (Guenther Alka)
Date: Wed, 9 Dec 2015 09:10:58 +0100
Subject: [OmniOS-discuss] considering an SSD pool ... which SSD
In-Reply-To: <alpine.DEB.2.20.1512090802040.23336@engelberg>
References: <alpine.DEB.2.20.1512090802040.23336@engelberg>
Message-ID: <5667E212.5070000@hfg-gmuend.de>

I have been using several SSD pools for years.

As there is no TRIM support, I look for:
- a high-quality controller, for controller-internal garbage collection
- large overprovisioning (enterprise SSDs offer 40% or more)
- quality flash with high write endurance
- power-loss protection; alternatively, you can use a higher-class Slog with
this feature
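
To make the overprovisioning figure concrete, here is a minimal Python sketch. It assumes the common definition of overprovisioning as (raw flash - user-visible capacity) / user-visible capacity; the function name and the capacities are hypothetical illustration values, not figures for any drive named in this thread.

```python
def overprovisioning_pct(raw_gb: float, usable_gb: float) -> float:
    """Return overprovisioning as a percentage of user-visible capacity.

    Assumes the common definition (raw - usable) / usable; vendors may
    calculate or advertise it differently.
    """
    if usable_gb <= 0 or raw_gb < usable_gb:
        raise ValueError("need 0 < usable_gb <= raw_gb")
    return (raw_gb - usable_gb) / usable_gb * 100.0


# Hypothetical drives: (raw flash in GB, user-visible capacity in GB).
consumer = overprovisioning_pct(512, 480)
enterprise = overprovisioning_pct(560, 400)
print(f"consumer-style:   {consumer:.1f}%")
print(f"enterprise-style: {enterprise:.1f}%")
```

With these made-up numbers the second drive lands at the 40% mark mentioned in the list, while the first has only a few percent of spare area.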

In my newer pools I use the Intel S3610, as they are "quite" affordable:
not as expensive as the 37x0 line, and with better write performance
than the 35x0 line.

In my last cheaper pools I chose the Sandisk Pro extreme. The Samsung
Pro was an option, but the Sandisks have larger overprovisioning by
default, with less write performance degradation under load. As they do
not have power-loss protection, I added a fast Slog (S3700 or ZeusRAM).


Gea




From doug at will.to  Wed Dec  9 08:24:07 2015
From: doug at will.to (Doug Hughes)
Date: Wed, 9 Dec 2015 13:54:07 +0530
Subject: [OmniOS-discuss] considering an SSD pool ... which SSD
In-Reply-To: <alpine.DEB.2.20.1512090802040.23336@engelberg>
References: <alpine.DEB.2.20.1512090802040.23336@engelberg>
Message-ID: <CAOpmc6x-Hpbm6cc9fkxi8uSJfUk1s1KixzFf+xRsayMWUD6EyQ@mail.gmail.com>

I like the Samsung 850 line. It has been around and quite stable for some
time. We haven't had any problems with device availability. (depending on
how much writing you are doing, you'd probably best avoid the EVO and stick
with straight 850 or 850 pro)




From wonko at 4amlunch.net  Wed Dec  9 13:14:17 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Wed, 9 Dec 2015 08:14:17 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
Message-ID: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>

So I decided to do some testing on the pool I have that is made up of a pair of Samsung 851 NVMe drives.

I've got it partitioned, as I'm using part of it to test as SLOG against the "spinning rust pool". Yes, I know these aren't ideal for this, but they will do for now.

I set up the other slices as a mirror and ran iozone against it.

It wrote fast. Really fast.

Then it stopped.

Now the pool seems to be wedged. At first I thought it might be the drives themselves, but I see them still functioning as SLOG just fine, so I don't believe it's that.

root at basket1:/root# zpool status -v zoom
  pool: zoom
 state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://illumos.org/msg/ZFS-8000-HC
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        zoom          ONLINE       0     0     1
          mirror-0    ONLINE       0     0     6
            c4t1d0s1  ONLINE       0     0     6
            c5t1d0s1  ONLINE       0     0     6

errors: List of errors unavailable (insufficient privileges)
root at basket1:/root# ls /zoom/
iozone.DUMMY.0  iozone.DUMMY.10  iozone.DUMMY.12  iozone.DUMMY.14  iozone.DUMMY.2  iozone.DUMMY.4  iozone.DUMMY.6  iozone.DUMMY.8
iozone.DUMMY.1  iozone.DUMMY.11  iozone.DUMMY.13  iozone.DUMMY.15  iozone.DUMMY.3  iozone.DUMMY.5  iozone.DUMMY.7  iozone.DUMMY.9
root at basket1:/root# touch /zoom/hi

So read access appears to be OK. Writes are totally boned, however. That touch just hangs forever.

So what do I need to do to provide you all with the information you need to diagnose this?

Thanks!

-brian

From davide.poletto at gmail.com  Wed Dec  9 14:02:50 2015
From: davide.poletto at gmail.com (Davide Poletto)
Date: Wed, 9 Dec 2015 15:02:50 +0100
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
Message-ID: <CANKMAMYuOgrf+_1i8XGSOvQ=1-+89TZj2r57VBJ9DA2qGGXbUQ@mail.gmail.com>

Hi Brian,

As a side note: are you sure that your Samsung 851 drive (I think you
mean the Samsung PM851 SSD) supports the NVMe interface standard?

I don't think it does, at least judging by its published interface
specifications: it uses a SATA 3 (6.0 Gbps) interface rather than the NVMe 1.1
interface used by "disks" like the Samsung PM/SM951, PM1725, XS/SM1715 or
PM/SM953, to name a few.

Regards, Davide.


From wonko at 4amlunch.net  Wed Dec  9 14:04:36 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Wed, 9 Dec 2015 09:04:36 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <CANKMAMYuOgrf+_1i8XGSOvQ=1-+89TZj2r57VBJ9DA2qGGXbUQ@mail.gmail.com>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<CANKMAMYuOgrf+_1i8XGSOvQ=1-+89TZj2r57VBJ9DA2qGGXbUQ@mail.gmail.com>
Message-ID: <36DAE0E6-6350-4370-BF80-250628287A59@4amlunch.net>

Sorry, typo-ed that.

These are SM951

-brian



From danmcd at omniti.com  Wed Dec  9 15:00:50 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 9 Dec 2015 10:00:50 -0500
Subject: [OmniOS-discuss] considering an SSD pool ... which SSD
In-Reply-To: <alpine.DEB.2.20.1512090802040.23336@engelberg>
References: <alpine.DEB.2.20.1512090802040.23336@engelberg>
Message-ID: <907860CD-CFAC-4F54-8B9E-3A84373E7EA9@omniti.com>


> On Dec 9, 2015, at 2:05 AM, Tobias Oetiker <tobi at oetiker.ch> wrote:
> 
> Our System Integrator recommends the use of Intel SSDs as opposed

We use Intel ones in-house, because they have the best reputation for wear.  I suspect the other folks are catching up, but Intel had a lead.

My $0.02,
Dan


From trey at mailchimp.com  Wed Dec  9 15:13:35 2015
From: trey at mailchimp.com (Trey Palmer)
Date: Wed, 9 Dec 2015 10:13:35 -0500
Subject: [OmniOS-discuss] considering an SSD pool ... which SSD
In-Reply-To: <alpine.DEB.2.20.1512090802040.23336@engelberg>
References: <alpine.DEB.2.20.1512090802040.23336@engelberg>
Message-ID: <CADRROpVXEY1A1bwP7iObSU_6YFyRhbOOHBz_n3NWnGd9FKk+7g@mail.gmail.com>

Tobias,

We use the Intel DC S3700.

I haven't tried the S3710s yet. The S3700s are very reliable.

   -- Trey



From danmcd at omniti.com  Wed Dec  9 15:16:03 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 9 Dec 2015 10:16:03 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
Message-ID: <A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>


> On Dec 9, 2015, at 8:14 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
> 
> So read access appears to be ok. Writes are totally boned, however.  That touch just hangs forever.
> 
> So what do I need to do to provide you all with the information you need to diagnose this.

Do you literally have a touch process hanging right now?  Or is it something you can ^C out of?

Does anything stand out in /var/adm/messages?  Maybe the kernel is complaining about something there.

My final inclination is heavy-handed:

	- Make sure you have at least one process stuck on writing to that filesystem.

	- "reboot -d" and take a kernel coredump

Unless you have sensitive information, a kernel coredump you can share would be the best thing to do.


Dan

p.s. I'm at the Dr. the rest of the day starting in 90 mins, pardon any latency.

From davide.poletto at gmail.com  Wed Dec  9 15:16:13 2015
From: davide.poletto at gmail.com (Davide Poletto)
Date: Wed, 9 Dec 2015 16:16:13 +0100
Subject: [OmniOS-discuss] considering an SSD pool ... which SSD
In-Reply-To: <907860CD-CFAC-4F54-8B9E-3A84373E7EA9@omniti.com>
References: <alpine.DEB.2.20.1512090802040.23336@engelberg>
	<907860CD-CFAC-4F54-8B9E-3A84373E7EA9@omniti.com>
Message-ID: <CANKMAMb0eSoCd9F6GrqLi6ve5YV_OUx4TB0xXLDvG-YBnR29ng@mail.gmail.com>

Yep. In case it helps with evaluating which Intel SSD would be best for you,
I used the information on this wiki page
<https://www.thomas-krenn.com/en/wiki/Intel_SSDs_Overview> (it is not
provided by Intel directly, but it gives an overall summary and also goes
into detail on each Intel SSD; it may not be fully up to date).
Davide.


From wonko at 4amlunch.net  Wed Dec  9 15:20:15 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Wed, 9 Dec 2015 10:20:15 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
Message-ID: <3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>

I cannot ^C out of the touch.

wonko at basket1:/export/home/wonko$ ps -ef | grep touch
    root  2459  2447   0 08:12:09 ?           0:00 touch /zoom/hi
    root  2050  2049   0   Dec 07 ?           0:00 touch hi
    root  2049     1   0   Dec 07 ?           0:00 sudo touch hi

Also, kill -9 doesn't touch them.

the only thing in messages is:

Dec  7 14:31:56 basket1 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-HC, TYPE: Error, VER: 1, SEVERITY: Major
Dec  7 14:31:56 basket1 EVENT-TIME: Mon Dec  7 14:31:56 EST 2015
Dec  7 14:31:56 basket1 PLATFORM: X8DTL, CSN: 1234567890, HOSTNAME: basket1
Dec  7 14:31:56 basket1 SOURCE: zfs-diagnosis, REV: 1.0
Dec  7 14:31:56 basket1 EVENT-ID: 585f9fa2-4a84-4184-8c87-c2f9c600e1a1
Dec  7 14:31:56 basket1 DESC: The ZFS pool has experienced currently unrecoverable I/O
Dec  7 14:31:56 basket1             failures.  Refer to http://illumos.org/msg/ZFS-8000-HC for more information.
Dec  7 14:31:56 basket1 AUTO-RESPONSE: No automated response will be taken.
Dec  7 14:31:56 basket1 IMPACT: Read and write I/Os cannot be serviced.
Dec  7 14:31:56 basket1 REC-ACTION: Make sure the affected devices are connected, then run
Dec  7 14:31:56 basket1             'zpool clear'.

I can definitely share a kernel coredump, that's not a problem. Just need to schedule a time to shut down all the VMs first.

Maybe later tonight.

-brian



From wonko at 4amlunch.net  Wed Dec  9 15:23:50 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Wed, 9 Dec 2015 10:23:50 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
Message-ID: <81B1D1AC-A063-455F-958B-6BBCF1879BB0@4amlunch.net>

Just did a 'zpool clear' on that pool and now I see:

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x59>



From danmcd at omniti.com  Wed Dec  9 15:25:07 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 9 Dec 2015 10:25:07 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
Message-ID: <7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>


> On Dec 9, 2015, at 10:20 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
> 
> I cannot ^C out of the touch.
> 
> wonko at basket1:/export/home/wonko$ ps -ef | grep touch

You do know about pgrep(1), right?  :)

> Also, kill -9 doesn't touch them.

Okay!  This means something in-kernel is locking them up.  More reason for a coredump.

> the only thing in messages is:
> 
> Dec  7 14:31:56 basket1 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-HC, TYPE: Error, VER: 1, SEVERITY: Major
> Dec  7 14:31:56 basket1 EVENT-TIME: Mon Dec  7 14:31:56 EST 2015
> Dec  7 14:31:56 basket1 PLATFORM: X8DTL, CSN: 1234567890, HOSTNAME: basket1
> Dec  7 14:31:56 basket1 SOURCE: zfs-diagnosis, REV: 1.0
> Dec  7 14:31:56 basket1 EVENT-ID: 585f9fa2-4a84-4184-8c87-c2f9c600e1a1
> Dec  7 14:31:56 basket1 DESC: The ZFS pool has experienced currently unrecoverable I/O
> Dec  7 14:31:56 basket1             failures.  Refer to http://illumos.org/msg/ZFS-8000-HC for more information.
> Dec  7 14:31:56 basket1 AUTO-RESPONSE: No automated response will be taken.
> Dec  7 14:31:56 basket1 IMPACT: Read and write I/Os cannot be serviced.
> Dec  7 14:31:56 basket1 REC-ACTION: Make sure the affected devices are connected, then run
> Dec  7 14:31:56 basket1             'zpool clear'.

You sure there's nothing before the FMA complaints?  It might be one line, but it may be enough to show something.

> I can definitely share a kernel coredump, that's not a problem. Just need to schedule a time to shut down all the VMs first.

Take your time, do it on your schedule, that's fine.

So I know where to put it:  Which OmniOS release are you running?

	head /etc/release ; uname -a

Thanks,
Dan


From rjahnel at ellipseinc.com  Wed Dec  9 15:36:10 2015
From: rjahnel at ellipseinc.com (Richard Jahnel)
Date: Wed, 9 Dec 2015 15:36:10 +0000
Subject: [OmniOS-discuss] considering an SSD pool ... which SSD
In-Reply-To: <alpine.DEB.2.20.1512090802040.23336@engelberg>
References: <alpine.DEB.2.20.1512090802040.23336@engelberg>
Message-ID: <65DC5816D4BEE043885A89FD54E273FC6CF687EA@MAIL101.Ellipseinc.com>

We have successfully used in order:

For the raidz2/3 vdevs
Patriot Torqx 2010
Crucial C300 2011
Samsung 840 Pros 2013 to current.

Samsung 850 Pros are in testing now.

If we could afford them we would prefer to use the Intel S3710 drives, but we can only afford enough of those for slogs.

For slog various Intel drives over the years.

No L2 cache is needed for SSD pools.

Key for us has been using double and more recently triple parity pools.

We have found over the last 5 years that as SSDs age they take longer to ready up new pages for writes. Eventually they start taking too long and fall out of the pool.
Usually a zpool clear will place them back online, and that is your sign that it's time to start working on replacing either the entire pool or at least the drives one by one, as they drop out more than once or twice.
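
The rule of thumb above (replace a drive once it has dropped out more than once or twice) can be sketched as a small tally. This is a minimal Python sketch only: the event list, the device names, the function name and the threshold of 2 are all hypothetical, and the dropout records would have to come from your own logs.

```python
from collections import Counter


def drives_to_replace(dropouts, max_dropouts=2):
    """Return devices that dropped out of the pool more than max_dropouts times.

    dropouts: iterable of device names, one entry per dropout event.
    """
    counts = Counter(dropouts)
    return sorted(dev for dev, n in counts.items() if n > max_dropouts)


# Hypothetical history: one entry each time a drive needed a 'zpool clear'.
events = ["c1t0d0", "c1t3d0", "c1t3d0", "c1t0d0", "c1t3d0"]
print(drives_to_replace(events))
```

Here only the drive with three recorded dropouts is flagged; the one with two stays under the assumed threshold.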



From bfriesen at simple.dallas.tx.us  Wed Dec  9 15:51:25 2015
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Wed, 9 Dec 2015 09:51:25 -0600 (CST)
Subject: [OmniOS-discuss] considering an SSD pool ... which SSD
In-Reply-To: <CADRROpVXEY1A1bwP7iObSU_6YFyRhbOOHBz_n3NWnGd9FKk+7g@mail.gmail.com>
References: <alpine.DEB.2.20.1512090802040.23336@engelberg>
	<CADRROpVXEY1A1bwP7iObSU_6YFyRhbOOHBz_n3NWnGd9FKk+7g@mail.gmail.com>
Message-ID: <alpine.GSO.2.01.1512090947001.22248@freddy.simplesystems.org>

On Wed, 9 Dec 2015, Trey Palmer wrote:

> Tobias,
> We use the Intel DC S3700.
> 
> I haven't tried the S3710's yet.   The S3700's are very reliable.

I am using 6 S3710s for the main pool store (in raidz2), with no 
dedicated ZIL device.  No problems yet with OmniOS.  It is an 
expensive solution.

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From wonko at 4amlunch.net  Wed Dec  9 16:13:11 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Wed, 9 Dec 2015 11:13:11 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
Message-ID: <A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>

I didn't know about pgrep, no. :)

So the 'zpool clear' has fixed things a bit. The touch processes have all exited.

I can now touch a file on that pool.

A zpool scrub later and this is the status:

  pool: zoom
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 6K in 0h0m with 0 errors on Wed Dec  9 10:25:33 2015
config:

        NAME          STATE     READ WRITE CKSUM
        zoom          ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c4t1d0s1  ONLINE       0     0     0
            c5t1d0s1  ONLINE       0     0     2

errors: No known data errors

I'm going to try to re-run iozone later and see if I can get it to happen again.

This is concerning.

The previous entry in messages is from 4 days prior, talking about ntpd.

-brian



From danmcd at omniti.com  Wed Dec  9 16:17:38 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 9 Dec 2015 11:17:38 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
Message-ID: <4B858828-C823-4251-84A9-417028B01B3C@omniti.com>


> On Dec 9, 2015, at 11:13 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
> 
> I didn't know about pgrep, no. :)

The Solaris/illumos ptools are a huge win.  Learn about 'em.  :)

Back to the main discussion...

> So the 'zpool clear' has fixed things a bit. The touch processes have all exited.
> 
> I can now touch a file on that pool.
> 
> A zpool scrub later and this is the status:
> 
>  pool: zoom
> state: ONLINE
> status: One or more devices has experienced an unrecoverable error.  An
>        attempt was made to correct the error.  Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>        using 'zpool clear' or replace the device with 'zpool replace'.
>   see: http://illumos.org/msg/ZFS-8000-9P
>  scan: scrub repaired 6K in 0h0m with 0 errors on Wed Dec  9 10:25:33 2015
> config:
> 
>        NAME          STATE     READ WRITE CKSUM
>        zoom          ONLINE       0     0     0
>          mirror-0    ONLINE       0     0     0
>            c4t1d0s1  ONLINE       0     0     0
>            c5t1d0s1  ONLINE       0     0     2
> 
> errors: No known data errors
> 
> I'm going to try to re-run iozone later and see if I can't get it to happen again.
> 
> This is concerning.

I see this, and I think "c5t1d0" is broken HW and needs to be replaced.

Combine that with "unrecoverable IO failures" and you really should be planning to replace that drive.
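
The reading of the status table above can be sketched in code: scan the config section of `zpool status` output for rows with a non-zero READ, WRITE or CKSUM counter. This is a minimal Python sketch, not an official tool; the function name is hypothetical, the sample text is the table from this thread, and the parsing assumes the five-column layout shown there with plain integer counters (rows that do not fit that shape are skipped).

```python
SAMPLE = """\
        NAME          STATE     READ WRITE CKSUM
        zoom          ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c4t1d0s1  ONLINE       0     0     0
            c5t1d0s1  ONLINE       0     0     2
"""


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with a non-zero counter."""
    flagged = {}
    for line in status_text.splitlines():
        fields = line.split()
        # Expect NAME STATE READ WRITE CKSUM with integer counters;
        # this also skips the header row and any other text.
        if len(fields) != 5 or not all(f.isdigit() for f in fields[2:]):
            continue
        read, write, cksum = (int(f) for f in fields[2:])
        if read or write or cksum:
            flagged[fields[0]] = (read, write, cksum)
    return flagged


print(devices_with_errors(SAMPLE))
```

On the sample, only c5t1d0s1 is reported, with its two checksum errors. A non-zero counter is a prompt to look closer, not proof by itself that the device is bad.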

Dan


From wonko at 4amlunch.net  Wed Dec  9 16:18:21 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Wed, 9 Dec 2015 11:18:21 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
Message-ID: <584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>

It's brand new!!

-brian

> On Dec 9, 2015, at 11:17 AM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> 
>> On Dec 9, 2015, at 11:13 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
>> 
>> I didn't know about pgrep, no. :)
> 
> The Solaris/illumos ptools are a huge win.  Learn about 'em.  :)
> 
> Back to the main discussion...
> 
>> So the 'zpool clear' has fixed things a bit. The touch processes have all exited.
>> 
>> I can now touch a file on that pool.
>> 
>> A zpool scrub later and this is the status:
>> 
>> pool: zoom
>> state: ONLINE
>> status: One or more devices has experienced an unrecoverable error.  An
>>       attempt was made to correct the error.  Applications are unaffected.
>> action: Determine if the device needs to be replaced, and clear the errors
>>       using 'zpool clear' or replace the device with 'zpool replace'.
>>  see: http://illumos.org/msg/ZFS-8000-9P
>> scan: scrub repaired 6K in 0h0m with 0 errors on Wed Dec  9 10:25:33 2015
>> config:
>> 
>>       NAME          STATE     READ WRITE CKSUM
>>       zoom          ONLINE       0     0     0
>>         mirror-0    ONLINE       0     0     0
>>           c4t1d0s1  ONLINE       0     0     0
>>           c5t1d0s1  ONLINE       0     0     2
>> 
>> errors: No known data errors
>> 
>> I'm going to try to re-run iozone later and see if I can't get it to happen again.
>> 
>> This is concerning.
> 
> I see this, and I think "c5t1d0" is broken HW and needs to be replaced.
> 
> Combine that with "unrecoverable IO failures" and you really should be planning to replace that drive.
> 
> Dan
> 


From wonko at 4amlunch.net  Wed Dec  9 16:21:01 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Wed, 9 Dec 2015 11:21:01 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
Message-ID: <A3ECCE66-0150-4E99-8F7C-2CEDC85FCA6A@4amlunch.net>

Also, I would expect the other slice to be affected as well?  It's been humming along just fine as SLOG with no errors:

        logs
          mirror-3    ONLINE       0     0     0
            c4t1d0s0  ONLINE       0     0     0
            c5t1d0s0  ONLINE       0     0     0

> On Dec 9, 2015, at 11:17 AM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> 
>> On Dec 9, 2015, at 11:13 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
>> 
>> I didn't know about pgrep, no. :)
> 
> The Solaris/illumos ptools are a huge win.  Learn about 'em.  :)
> 
> Back to the main discussion...
> 
>> So the 'zpool clear' has fixed things a bit. The touch processes have all exited.
>> 
>> I can now touch a file on that pool.
>> 
>> A zpool scrub later and this is the status:
>> 
>> pool: zoom
>> state: ONLINE
>> status: One or more devices has experienced an unrecoverable error.  An
>>       attempt was made to correct the error.  Applications are unaffected.
>> action: Determine if the device needs to be replaced, and clear the errors
>>       using 'zpool clear' or replace the device with 'zpool replace'.
>>  see: http://illumos.org/msg/ZFS-8000-9P
>> scan: scrub repaired 6K in 0h0m with 0 errors on Wed Dec  9 10:25:33 2015
>> config:
>> 
>>       NAME          STATE     READ WRITE CKSUM
>>       zoom          ONLINE       0     0     0
>>         mirror-0    ONLINE       0     0     0
>>           c4t1d0s1  ONLINE       0     0     0
>>           c5t1d0s1  ONLINE       0     0     2
>> 
>> errors: No known data errors
>> 
>> I'm going to try to re-run iozone later and see if I can't get it to happen again.
>> 
>> This is concerning.
> 
> I see this, and I think "c5t1d0" is broken HW and needs to be replaced.
> 
> Combine that with "unrecoverable IO failures" and you really should be planning to replace that drive.
> 
> Dan
> 


From danmcd at omniti.com  Wed Dec  9 16:22:21 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 9 Dec 2015 11:22:21 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
	<584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
Message-ID: <4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>


> On Dec 9, 2015, at 11:18 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
> 
> It's brand new!!

Sometimes you get flaky HW that's new.  I've had to return new spinning-rust disks, for example.

> Also, I would expect the other slice to be affected as well?  It's been humming along just fine as SLOG with no errors:
> 
>        logs
>          mirror-3    ONLINE       0     0     0
>            c4t1d0s0  ONLINE       0     0     0
>            c5t1d0s0  ONLINE       0     0     0

Could just be bad luck your slog hasn't encountered the bad portion of this drive.

Also, what OmniOS revision are you running? If you're not up to the latest November r151014 update, you may be missing some NVMe fixes.

Dan


From wonko at 4amlunch.net  Wed Dec  9 16:27:25 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Wed, 9 Dec 2015 11:27:25 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
	<584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
	<4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>
Message-ID: <7D06CC38-9841-4189-80CD-6341E025B10C@4amlunch.net>


> On Dec 9, 2015, at 11:22 AM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> 
>> On Dec 9, 2015, at 11:18 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
>> 
>> It's brand new!!
> 
> Sometimes you get flaky HW that's new.  I've had to return new spinning-rust disks, for example.

Bah. :(

> 
>> Also, I would expect the other slice to be affected as well?  It's been humming along just fine as SLOG with no errors:
>> 
>>       logs
>>         mirror-3    ONLINE       0     0     0
>>           c4t1d0s0  ONLINE       0     0     0
>>           c5t1d0s0  ONLINE       0     0     0
> 
> Could just be bad luck your slog hasn't encountered the bad portion of this drive.

I suppose. Do you think there is a good way to test this device before I try to get it RMA-ed?

> Also, what OmniOS revision are you running? If you're not up to the latest November r151014 update, you may be missing some NVMe fixes.

Oh right, totally forgot to do that for you:

wonko at basket1:/var/adm$ head /etc/release ; uname -a
  OmniOS v11 r151016
  Copyright 2015 OmniTI Computer Consulting, Inc. All rights reserved.
  Use is subject to license terms.
SunOS basket1 5.11 omnios-073d8c0 i86pc i386 i86pc


From nsmith at careyweb.com  Wed Dec  9 17:24:39 2015
From: nsmith at careyweb.com (Nate Smith)
Date: Wed, 09 Dec 2015 12:24:39 -0500
Subject: [OmniOS-discuss] considering an SSD pool ... which SSD
Message-ID: <2095707765-3740@mail.careyweb.com>

I love the Intel 730s: a nice mix of price and server-level reliability.

--
Nate Smith 
Sr.  Network /Systems Analyst 
Carey Color Inc. 
6835 Ridge Rd./PO Box 609
Sharon Center, OH 44274
330-239-1835


-----Original Message-----
From: Dan McDonald [danmcd at omniti.com]
Received: Wednesday, 09 Dec 2015, 10:02AM
To: Tobias Oetiker [tobi at oetiker.ch]
CC: omnios-discuss [omnios-discuss at lists.omniti.com]
Subject: Re: [OmniOS-discuss] considering an SSD pool ... which SSD


> On Dec 9, 2015, at 2:05 AM, Tobias Oetiker <tobi at oetiker.ch> wrote:
> 
> Our System Integrator recommends the use of Intel SSDs as opposed

We use Intel ones in-house, because they have the best reputation for wear.  I suspect the other folks are catching up, but Intel had a lead.

My $0.02,
Dan

_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss at lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss



From tom.robinson at motec.com.au  Wed Dec  9 22:24:23 2015
From: tom.robinson at motec.com.au (Tom Robinson)
Date: Thu, 10 Dec 2015 09:24:23 +1100
Subject: [OmniOS-discuss] LUN (in)visibility
Message-ID: <5668AA17.6050105@motec.com.au>

OmniOS v11 r151012

Hi,

We are using iSCSI over 10G ethernet and Infiniband to connect our storage (OmniOS) to our virtual
infrastructure (KVM/ESXi). Our KVM host is using multipath and is configured to prefer Infiniband.
We also run an ESXi host (in legacy - we are changing to KVM) with a similar path selection
configuration.

A few weeks ago Infiniband failed to connect (although we couldn't see any reason why) but
thankfully all the initiators failed over to iSCSI on both KVM and ESXi. The iSCSI connection seemed
to be stable and we could see all existing LUNs offered by the OmniOS server.

Since then I've been working on solving the Infiniband issue but couldn't find any obvious problems
with the configuration. In doing so I have rebooted both KVM and ESXi multiple times. The Infiniband
fabric, however, didn't come back.

A couple of days ago I needed new storage for the virtual infrastructure but had an issue with
COMSTAR LUN visibility on the initiator end. I created two new target LUNs for iSCSI and the COMSTAR
end looked perfect but neither the KVM nor ESXi host picked them up on rescan (as they usually
would). At that time I had 68 active LUNs which were seen and active by the KVM and ESXi hosts but
the two new ones weren't appearing. I've done the procedure for adding new targets multiple times
before and it's always worked so this stumped me.

Yesterday I shut down all our infrastructure and started OmniOS first, then KVM and ESXi. It's all
working again. The two new targets came up along with all of the others, and both Infiniband and
iSCSI are working.

So great, but now I'm thinking I have an unknown instability issue in the storage system.

Has anyone seen this behaviour before? Where can I begin to look for the cause of the issue?

Kind regards,
Tom

-- 

Tom Robinson
IT Manager/System Administrator

MoTeC Pty Ltd

121 Merrindale Drive
Croydon South
3136 Victoria
Australia

T: +61 3 9761 5050
F: +61 3 9761 5051
E: tom.robinson at motec.com.au


From danmcd at omniti.com  Wed Dec  9 23:00:59 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 9 Dec 2015 18:00:59 -0500
Subject: [OmniOS-discuss] LUN (in)visibility
In-Reply-To: <5668AA17.6050105@motec.com.au>
References: <5668AA17.6050105@motec.com.au>
Message-ID: <9F96534F-D20A-4AF8-BCF2-84D62402635B@omniti.com>


> On Dec 9, 2015, at 5:24 PM, Tom Robinson <tom.robinson at motec.com.au> wrote:
> 
> OmniOS v11 r151012

First off, OmniOS r151012 has reached end of service life.  You should upgrade to at least r151014 (the current LTS) or r151016 (the current Stable).

> 
> Yesterday I shut down all our infrastructure and started OmniOS first, then KVM and ESXi. It's all
> working again. The two new targets came up along with all of the others, and both Infiniband and
> iSCSI are working.
> 
> So great, but now I'm thinking I have an unknown instability issue in the storage system.

I'd recommend first getting up to date with either r151014 or r151016.  From there people can figure things out a little easier.

Dan


From tom.robinson at motec.com.au  Wed Dec  9 23:41:29 2015
From: tom.robinson at motec.com.au (Tom Robinson)
Date: Thu, 10 Dec 2015 10:41:29 +1100
Subject: [OmniOS-discuss] LUN (in)visibility
In-Reply-To: <9F96534F-D20A-4AF8-BCF2-84D62402635B@omniti.com>
References: <5668AA17.6050105@motec.com.au>
	<9F96534F-D20A-4AF8-BCF2-84D62402635B@omniti.com>
Message-ID: <5668BC29.8090803@motec.com.au>

On 10/12/15 10:00, Dan McDonald wrote:
> 
>> On Dec 9, 2015, at 5:24 PM, Tom Robinson <tom.robinson at motec.com.au> wrote:
>>
>> OmniOS v11 r151012
> 
> First off, OmniOS r151012 has reached end of service life.  You should upgrade to at least r151014 (the current LTS) or r151016 (the current Stable).

Yes, I was looking at either r151014 or r151016 yesterday. We will plan to do that upgrade. Is there
an announce list? I was unaware that r151012 had reached end of service life.

> 
> I'd recommend first getting up to date with either r151014 or r151016.  From there people can figure things out a little easier.

I appreciate that we should be moving onto a supported platform but it would be good to know where I
would even start to look for reasons why this is happening.

Kind regards,
Tom

-- 

Tom Robinson
IT Manager/System Administrator

MoTeC Pty Ltd

121 Merrindale Drive
Croydon South
3136 Victoria
Australia

T: +61 3 9761 5050
F: +61 3 9761 5051
E: tom.robinson at motec.com.au


From danmcd at omniti.com  Thu Dec 10 00:06:53 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 9 Dec 2015 19:06:53 -0500
Subject: [OmniOS-discuss] LUN (in)visibility
In-Reply-To: <5668BC29.8090803@motec.com.au>
References: <5668AA17.6050105@motec.com.au>
	<9F96534F-D20A-4AF8-BCF2-84D62402635B@omniti.com>
	<5668BC29.8090803@motec.com.au>
Message-ID: <B20CC262-700A-461D-B80A-C30BBE8069F8@omniti.com>


> On Dec 9, 2015, at 6:41 PM, Tom Robinson <tom.robinson at motec.com.au> wrote:
> 
> Yes, I was looking at either r151014 or r151016 yesterday. We will plan to do that upgrade. Is there
> an announce list as I was unaware that r151012 had reached end of service life.

Our release cycle is documented:

	http://omnios.omniti.com/wiki.php/ReleaseCycle

and on the omnios-discuss list, I announce EOSLs alongside new releases.

>> 
>> I'd recommend first getting up to date with either r151014 or r151016.  From there people can figure things out a little easier.
> 
> I appreciate that we should be moving onto a supported platform but it would be good to know where I
> would even start to look for reasons why this is happening.

I'd be digging into the source code.  COMSTAR is a bit brittle sometimes.

One other thing you can do is restart COMSTAR:

	svcadm restart stmf

Other people have recently reported that one may also need to restart the iSCSI target as well:

	svcadm restart iscsi/target

And if you feel you need to restart both, disable them, then re-enable them:

	svcadm disable -st stmf iscsi/target ; svcadm enable stmf iscsi/target

That may kick things around enough without you rebooting everything on your storage box.
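
The sequence above could be wrapped in a small script, sketched here under some assumptions: the `run` and `restart_comstar` function names and the `DRY_RUN` variable are invented for this example, and the closing `svcs` and `stmfadm list-lu -v` checks are an addition, not part of the original advice. By default it only prints the commands; expect connected initiators to notice an interruption when it is run for real.

```shell
#!/bin/sh
# run: print the command when DRY_RUN is 1 (the default in this sketch),
# otherwise execute it. DRY_RUN is an assumption of this example.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# restart_comstar: bounce the COMSTAR framework and the iSCSI target,
# then show service states and the logical units for a visual check.
restart_comstar() {
    # -s waits for the services to stop, -t makes the disable temporary
    run svcadm disable -st stmf iscsi/target &&
    run svcadm enable stmf iscsi/target &&
    # report the resulting service states
    run svcs stmf iscsi/target &&
    # list logical units so the new LUNs can be checked by eye
    run stmfadm list-lu -v
}

# Usage: "restart_comstar" for a dry run,
#        "DRY_RUN=0 restart_comstar" as root on the storage host.
```

Keeping the dry run as the default makes it harder to bounce the storage services by accident on a production box.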

Dan


From jdg117 at elvis.arl.psu.edu  Thu Dec 10 00:55:15 2015
From: jdg117 at elvis.arl.psu.edu (John D Groenveld)
Date: Wed, 09 Dec 2015 19:55:15 -0500
Subject: [OmniOS-discuss] LUN (in)visibility
In-Reply-To: Your message of "Wed, 09 Dec 2015 19:06:53 EST."
	<B20CC262-700A-461D-B80A-C30BBE8069F8@omniti.com> 
References: <5668AA17.6050105@motec.com.au>
	<9F96534F-D20A-4AF8-BCF2-84D62402635B@omniti.com>
	<5668BC29.8090803@motec.com.au>
	<B20CC262-700A-461D-B80A-C30BBE8069F8@omniti.com> 
Message-ID: <201512100055.tBA0tF1a001552@elvis.arl.psu.edu>

In message <B20CC262-700A-461D-B80A-C30BBE8069F8 at omniti.com>, Dan McDonald writes:
>and on the omnios-discuss list, I announce EOSLs alongside new releases.

Perhaps more trouble than it's worth right now, but... 
as usage grows and M/L volume increases, I hope you'll
consider an omnios-announce moderated list for releases
and other notices.

John
groenveld at acm.org

From tom.robinson at motec.com.au  Thu Dec 10 01:18:51 2015
From: tom.robinson at motec.com.au (Tom Robinson)
Date: Thu, 10 Dec 2015 12:18:51 +1100
Subject: [OmniOS-discuss] LUN (in)visibility
In-Reply-To: <201512100055.tBA0tF1a001552@elvis.arl.psu.edu>
References: <5668AA17.6050105@motec.com.au>
	<9F96534F-D20A-4AF8-BCF2-84D62402635B@omniti.com>
	<5668BC29.8090803@motec.com.au>
	<B20CC262-700A-461D-B80A-C30BBE8069F8@omniti.com>
	<201512100055.tBA0tF1a001552@elvis.arl.psu.edu>
Message-ID: <5668D2FB.8040506@motec.com.au>

On 10/12/15 11:55, John D Groenveld wrote:
> as usage grows and M/L volume increases, I hope you'll
> consider an omnios-announce moderated list for releases
> and other notices.
> 

I'll second that.

-- 

Tom Robinson
IT Manager/System Administrator

MoTeC Pty Ltd

121 Merrindale Drive
Croydon South
3136 Victoria
Australia

T: +61 3 9761 5050
F: +61 3 9761 5051
E: tom.robinson at motec.com.au


From johan.kragsterman at capvert.se  Thu Dec 10 07:55:28 2015
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Thu, 10 Dec 2015 08:55:28 +0100
Subject: [OmniOS-discuss] Ang: Re:  LUN (in)visibility
In-Reply-To: <5668BC29.8090803@motec.com.au>
References: <5668BC29.8090803@motec.com.au>,
	<5668AA17.6050105@motec.com.au>	<9F96534F-D20A-4AF8-BCF2-84D62402635B@omniti.com>
Message-ID: <OF7E508FC2.9A496DF6-ONC1257F17.002AB2B1-C1257F17.002B87E5@inse.com>


Hi!


-----"OmniOS-discuss" <omnios-discuss-bounces at lists.omniti.com> wrote: -----
To: Dan McDonald <danmcd at omniti.com>
From: Tom Robinson 
Sent by: "OmniOS-discuss" 
Date: 2015-12-10 00:43
Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
Subject: Re: [OmniOS-discuss] LUN (in)visibility

On 10/12/15 10:00, Dan McDonald wrote:
> 
>> On Dec 9, 2015, at 5:24 PM, Tom Robinson <tom.robinson at motec.com.au> wrote:
>>
>> OmniOS v11 r151012
> 
> First off, OmniOS r151012 has reached end of service life.  You should upgrade to at least r151014 (the current LTS) or r151016 (the current Stable).

Yes, I was looking at either r151014 or r151016 yesterday. We will plan to do that upgrade. Is there
an announce list as I was unaware that r151012 had reached end of service life.

> 
> I'd recommend first getting up to date with either r151014 or r151016.  From there people can figure things out a little easier.

I appreciate that we should be moving onto a supported platform but it would be good to know where I
would even start to look for reasons why this is happening.

Kind regards,
Tom




You say "infiniband". Do you mean SRP? Where do you have your subnet manager? In the IB switch? If so, did you check the switch SM logs?

I suppose you checked the data links? dladm show-link? What exactly did you check?

How about multipath? How many paths did/do you have to each LUN? I know there was a discussion about too many paths to a LUN earlier on this list. That was fibre channel, though.

I can't really comment on iSCSI since I never use it...

Rgrds Johan



-- 

Tom Robinson
IT Manager/System Administrator

MoTeC Pty Ltd

121 Merrindale Drive
Croydon South
3136 Victoria
Australia

T: +61 3 9761 5050
F: +61 3 9761 5051
E: tom.robinson at motec.com.au



[attachment "signature.asc" removed by Johan Kragsterman/Capvert]


From tobi at oetiker.ch  Thu Dec 10 12:58:42 2015
From: tobi at oetiker.ch (Tobias Oetiker)
Date: Thu, 10 Dec 2015 13:58:42 +0100 (CET)
Subject: [OmniOS-discuss] Samsung SM863
Message-ID: <alpine.DEB.2.20.1512101357360.23336@engelberg>

Just found that Samsung now has an SSD with power-loss protection:

http://www.storagereview.com/samsung_sm863_ssd_review

what do you think ?

cheers
tobi


-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
www.oetiker.ch tobi at oetiker.ch +41 62 775 9902


From daleg at omniti.com  Thu Dec 10 15:54:12 2015
From: daleg at omniti.com (Dale Ghent)
Date: Thu, 10 Dec 2015 10:54:12 -0500
Subject: [OmniOS-discuss] LUN (in)visibility
In-Reply-To: <B20CC262-700A-461D-B80A-C30BBE8069F8@omniti.com>
References: <5668AA17.6050105@motec.com.au>
	<9F96534F-D20A-4AF8-BCF2-84D62402635B@omniti.com>
	<5668BC29.8090803@motec.com.au>
	<B20CC262-700A-461D-B80A-C30BBE8069F8@omniti.com>
Message-ID: <AE439C9B-7A4D-404B-AF94-30B18676BA8E@omniti.com>


> On Dec 9, 2015, at 7:06 PM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> 
>> On Dec 9, 2015, at 6:41 PM, Tom Robinson <tom.robinson at motec.com.au> wrote:
>> 
>> Yes, I was looking at either r151014 or r151016 yesterday. We will plan to do that upgrade. Is there
>> an announce list as I was unaware that r151012 had reached end of service life.
> 
> Our release cycle is documented:
> 
> 	http://omnios.omniti.com/wiki.php/ReleaseCycle
> 
> and on the omnios-discuss list, I announce EOSLs alongside new releases.

Yeah, the general rule is that there's a 1 year shelf life for Stable releases. We certainly don't mind people running Stables (I do so myself) as long as that is kept in mind.

/dale

From davide.poletto at gmail.com  Thu Dec 10 16:57:46 2015
From: davide.poletto at gmail.com (Davide Poletto)
Date: Thu, 10 Dec 2015 17:57:46 +0100
Subject: [OmniOS-discuss] illumos and contributions metrics: how to
 evaluate companies that commercialize illumos based products by examining
 them in the light of their illumos community's contributions.
In-Reply-To: <5665B00E.70307@kateley.com>
References: <CANKMAMYBGfr44MaH+t9=xBwzEH9f4-gn=ODCvCtndxMo0PqNXw@mail.gmail.com>
	<DD34332E-D02A-434E-976A-26840ABDD96B@omniti.com>
	<5665B00E.70307@kateley.com>
Message-ID: <CANKMAMY64P2UHmgWxKbO8zKWONKupTmrGQvVsMp7LQXyQDOwfg@mail.gmail.com>

Thanks Dan and Linda for your answers (thanks Linda for the useful link!):
mine was just healthy curiosity and probably not a real or relevant problem; at
least, it is nothing that blocks my evaluation. It's just like a slow bee
flying around my head, especially considering that, as Dan said, community
members' energy is used to keep the "machinery" running and in good health!

Kind regards, Davide.

P.S. slightly OT:
I can't understand the consequences of Dan's statement: "Making things MORE
complicated is that "illumos" as a brand is still tightly tied up by its
owner."...what does it mean?

On Mon, Dec 7, 2015 at 5:13 PM, Linda Kateley <lkateley at kateley.com> wrote:

> Blackduck does this for you.
>
> https://www.openhub.net/p?ref=homepage&query=illumos
>
>
> On 12/7/15 7:44 AM, Dan McDonald wrote:
>
>> On Dec 7, 2015, at 8:13 AM, Davide Poletto <davide.poletto at gmail.com>
>>> wrote:
>>>
>>> Is there a way to rank/evaluate and so reward/honour (by, as example,
>>> purchasing their products or by sustaining their development as
>>> testers/free-time contributors) those {individuals, companies,
>>> institutions} that clearly demonstrate not only to have good numbers
>>> (commits) but also that they care about the community and that are more
>>> transparent than others in advertising their commercial offer's origin?
>>>
>> That's a damned good question.  It's also very tricky.
>>
>> Some firms keep things closed until they've released, or after some time
>> after they've released.  Some find this fair enough, others find it
>> annoying.  Because people are different, it may be hard to get a consensus
>> on how to rank/evaluate firms the way you wish.  BTW, I lean toward "fair
>> enough" so long as there's consistency and not going back on one's word.
>> Keeping to one's word is important to me.  I didn't leave Oracle because
>> of the Solaris-closing: if you read the text of that leaked email, it
>> implied a source-dump-on-release model. Only after I left Oracle did it
>> become clear that it was all a big lie.
>>
>> You're chasing a hard problem.  You may not get much sympathy.  Making
>> things MORE complicated is that "illumos" as a brand is still tightly tied
>> up by its owner. Many feel that it's tied up too tightly, and that is why
>> you rarely see "illumos" mentioned in marketing materials, especially not
>> the trademarked symbol.
>>
>> I'm sorry I don't have better answers for you right now.  It's a hard
>> problem, and many of us who might be able to help clarify things are trying
>> to keep all of the machinery moving as smoothly as we can.
>>
>> Dan

From richard.elling at richardelling.com  Thu Dec 10 18:13:12 2015
From: richard.elling at richardelling.com (Richard Elling)
Date: Thu, 10 Dec 2015 10:13:12 -0800
Subject: [OmniOS-discuss] Samsung SM863
In-Reply-To: <alpine.DEB.2.20.1512101357360.23336@engelberg>
References: <alpine.DEB.2.20.1512101357360.23336@engelberg>
Message-ID: <63AA1781-EFEC-4991-B2AD-A7F6A7D2EDFC@richardelling.com>


> On Dec 10, 2015, at 4:58 AM, Tobias Oetiker <tobi at oetiker.ch> wrote:
> 
> Just found that samsung now has an ssd with  power loss protection
> 
> http://www.storagereview.com/samsung_sm863_ssd_review
> 
> what do you think ?

Power-loss protection is not required (ZFS works on HDDs :-) but it is a nice feature.
Overall, this looks like a very nice SSD. I expect more enterprise-grade SSDs from
Samsung in the future.
 -- richard


From dave-oo at pooserville.com  Thu Dec 10 20:02:45 2015
From: dave-oo at pooserville.com (Dave Pooser)
Date: Thu, 10 Dec 2015 14:02:45 -0600
Subject: [OmniOS-discuss] Samsung SM863
In-Reply-To: <63AA1781-EFEC-4991-B2AD-A7F6A7D2EDFC@richardelling.com>
References: <alpine.DEB.2.20.1512101357360.23336@engelberg>
	<63AA1781-EFEC-4991-B2AD-A7F6A7D2EDFC@richardelling.com>
Message-ID: <D28F3618.35B8B7%dave-lists@pooserville.com>

On 12/10/15, 12:13 PM, "OmniOS-discuss on behalf of Richard Elling"
<omnios-discuss-bounces at lists.omniti.com on behalf of
richard.elling at richardelling.com> wrote:

>
>> On Dec 10, 2015, at 4:58 AM, Tobias Oetiker <tobi at oetiker.ch> wrote:
>> 
>> Just found that samsung now has an ssd with  power loss protection
>> 
>> http://www.storagereview.com/samsung_sm863_ssd_review
>> 
>> what do you think ?
>
>Power-loss protection is not required (ZFS works on HDDs :-) but it is a
>nice feature.

On a device that will likely be used for ZIL, I'd call power-loss
protection required. HDDs don't lie to the OS about when data has been
flushed from cache to disk the way SSDs do, right?

On a device that's going to be L2ARC I care a lot less, obviously. ;-)
-- 
Dave Pooser
Cat-Herder-in-Chief, Pooserville.com



From jimklimov at cos.ru  Thu Dec 10 20:20:19 2015
From: jimklimov at cos.ru (Jim Klimov)
Date: Thu, 10 Dec 2015 21:20:19 +0100
Subject: [OmniOS-discuss] Samsung SM863
In-Reply-To: <63AA1781-EFEC-4991-B2AD-A7F6A7D2EDFC@richardelling.com>
References: <alpine.DEB.2.20.1512101357360.23336@engelberg>
	<63AA1781-EFEC-4991-B2AD-A7F6A7D2EDFC@richardelling.com>
Message-ID: <45D2D4ED-E2B5-48B8-A170-5D8AB70C809B@cos.ru>

On 10 December 2015 19:13:12 CET, Richard Elling <richard.elling at richardelling.com> wrote:
>
>> On Dec 10, 2015, at 4:58 AM, Tobias Oetiker <tobi at oetiker.ch> wrote:
>> 
>> Just found that samsung now has an ssd with  power loss protection
>> 
>> http://www.storagereview.com/samsung_sm863_ssd_review
>> 
>> what do you think ?
>
>Power-loss protection is not required (ZFS works on HDDs :-) but it is
>a nice feature.
>Overall, this looks like a very nice SSD. I expect more
>enterprise-grade SSDs from
>Samsung in the future.
> -- richard

IIRC the historical issue was not power-loss protection for ZFS's needs per se, but drives and firmware that could misbehave when power disappeared before they had flushed RAM to flash, including corruption of SSD metadata that bricked the device. This also happened on graceful shutdown, when the host cut its own power off afterwards. These effects were seen less often (or never) on SSDs with capacitors or equivalent protection.

I do not know how much of this is FUD and how much is still relevant with today's devices, as opposed to vendors' first steps a few years back. The rule of thumb was to use protected SSDs for anything other than scratch use (e.g. a whole device dedicated as L2ARC or other cache area), since nobody knew what was really good and what was not.

Jim
--
Typos courtesy of K-9 Mail on my Samsung Android

From richard.elling at richardelling.com  Thu Dec 10 20:43:25 2015
From: richard.elling at richardelling.com (Richard Elling)
Date: Thu, 10 Dec 2015 12:43:25 -0800
Subject: [OmniOS-discuss] Samsung SM863
In-Reply-To: <D28F3618.35B8B7%dave-lists@pooserville.com>
References: <alpine.DEB.2.20.1512101357360.23336@engelberg>
	<63AA1781-EFEC-4991-B2AD-A7F6A7D2EDFC@richardelling.com>
	<D28F3618.35B8B7%dave-lists@pooserville.com>
Message-ID: <2C49B16E-5D7E-4A7B-8903-64BE04566DA0@richardelling.com>


> On Dec 10, 2015, at 12:02 PM, Dave Pooser <dave-oo at pooserville.com> wrote:
> 
> On 12/10/15, 12:13 PM, "OmniOS-discuss on behalf of Richard Elling"
> <omnios-discuss-bounces at lists.omniti.com on behalf of
> richard.elling at richardelling.com> wrote:
> 
>> 
>>> On Dec 10, 2015, at 4:58 AM, Tobias Oetiker <tobi at oetiker.ch> wrote:
>>> 
>>> Just found that samsung now has an ssd with  power loss protection
>>> 
>>> http://www.storagereview.com/samsung_sm863_ssd_review
>>> 
>>> what do you think ?
>> 
>> Power-loss protection is not required (ZFS works on HDDs :-) but it is a
>> nice feature.
> 
> On a device that will likely be used for ZIL, I'd call power-loss
> protection required. HDDs don't lie to the OS about when data has been
> flushed from cache to disk the way SSDs do, right?

You do need a device that honors cache flush commands. But that goes for
any device or RAID array, not just SSDs.
 -- richard

> 
> On a device that's going to be L2ARC I care a lot less, obviously. ;-)
> -- 
> Dave Pooser
> Cat-Herder-in-Chief, Pooserville.com
> 
> 


From tom.robinson at motec.com.au  Thu Dec 10 22:10:46 2015
From: tom.robinson at motec.com.au (Tom Robinson)
Date: Fri, 11 Dec 2015 09:10:46 +1100
Subject: [OmniOS-discuss] Ang: Re:  LUN (in)visibility
In-Reply-To: <OF7E508FC2.9A496DF6-ONC1257F17.002AB2B1-C1257F17.002B87E5@inse.com>
References: <5668BC29.8090803@motec.com.au> <5668AA17.6050105@motec.com.au>
	<9F96534F-D20A-4AF8-BCF2-84D62402635B@omniti.com>
	<OF7E508FC2.9A496DF6-ONC1257F17.002AB2B1-C1257F17.002B87E5@inse.com>
Message-ID: <5669F866.2030204@motec.com.au>

On 10/12/15 18:55, Johan Kragsterman wrote:
> You say "infiniband". Do you mean SRP? Where do you have your subnet manager? In the IB switch? If so, did you check the switch SM logs?
>
> I suppose you checked the data links? dladm show-link? What exactly did you check?
>
> How about multipath? How many paths did/do you have to each LUN? I know there was a discussion about too many paths to a LUN earlier on this list. That was fibre channel, though.
>
> I can't really comment on iSCSI since I never use it...

Hi Johan,

Yes, SRP. As I said, we had everything working fine before which means we also have a subnet
manager. The SM actually runs on its own little box.

----------      -----------      -----
| storage|======|IB Switch|======|KVM|
----------      -----------      -----
                  |     |
                ----   ------
                |SM|   |ESXi|
                ----   ------

Normally there are only three paths; one iSCSI and two SRP.

I spent a lot of time hunting around on the KVM system looking for clues, as at that time I didn't
see any issues elsewhere in the setup.

On OmniOS, in /var/adm/messages I had this:

Oct 26 07:28:43 monza.motec.com.au genunix: [ID 408789 kern.warning] WARNING: hermon0: fault
detected external to device; service unavailable
Oct 26 07:28:43 monza.motec.com.au genunix: [ID 451854 kern.warning] WARNING: hermon0: port 2 down
Oct 26 07:28:47 monza.motec.com.au genunix: [ID 408822 kern.info] NOTICE: hermon0: fault detected
external to device; service still unavailable
Oct 26 07:28:47 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 down
Oct 26 07:30:12 monza.motec.com.au genunix: [ID 408789 kern.notice] NOTICE: hermon0: fault cleared
external to device; service available
Oct 26 07:30:12 monza.motec.com.au genunix: [ID 451854 kern.notice] NOTICE: hermon0: port 2 up
Oct 26 07:30:12 monza.motec.com.au genunix: [ID 408822 kern.info] NOTICE: hermon0: no fault external
to device; service available
Oct 26 07:30:12 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 up
Oct 26 07:31:31 monza.motec.com.au genunix: [ID 408789 kern.warning] WARNING: hermon0: fault
detected external to device; service unavailable
Oct 26 07:31:31 monza.motec.com.au genunix: [ID 451854 kern.warning] WARNING: hermon0: port 2 down
Oct 26 07:31:38 monza.motec.com.au genunix: [ID 408822 kern.info] NOTICE: hermon0: fault detected
external to device; service still unavailable
Oct 26 07:31:38 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 down
Oct 26 07:32:12 monza.motec.com.au genunix: [ID 408789 kern.notice] NOTICE: hermon0: fault cleared
external to device; service available
Oct 26 07:32:12 monza.motec.com.au genunix: [ID 451854 kern.notice] NOTICE: hermon0: port 2 up
Oct 26 07:32:12 monza.motec.com.au genunix: [ID 408822 kern.info] NOTICE: hermon0: no fault external
to device; service available
Oct 26 07:32:12 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 up
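To see how long each port was actually down, the up/down pairs in a log like the one above can be matched up with a few lines of Python. This is only a rough sketch: the function name `port_outages` is made up for illustration, the year is an assumption (syslog timestamps carry none), and it only looks at the "port N up/down" lines, skipping the wrapped "fault detected" lines.

```python
import re
from datetime import datetime

# Matches lines such as:
#   "Oct 26 07:28:43 host genunix: [ID ...] WARNING: hermon0: port 2 down"
PORT_EVENT = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+genunix:.*?"
    r"(hermon\d+): port (\d+) (up|down)\s*$"
)

def port_outages(lines, year=2015):
    """Return (device, port, down_time, up_time, seconds) per down/up pair.

    The year is an assumption supplied by the caller, because syslog
    timestamps do not include one.
    """
    down_since = {}
    outages = []
    for line in lines:
        m = PORT_EVENT.match(line)
        if not m:
            continue  # wrapped continuation lines and other messages
        stamp, dev, port, state = m.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        key = (dev, int(port))
        if state == "down":
            down_since.setdefault(key, when)
        elif key in down_since:
            start = down_since.pop(key)
            outages.append((dev, key[1], start, when,
                            (when - start).total_seconds()))
    return outages

SAMPLE = [
    "Oct 26 07:28:43 monza.motec.com.au genunix: [ID 451854 kern.warning] WARNING: hermon0: port 2 down",
    "Oct 26 07:28:47 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 down",
    "Oct 26 07:30:12 monza.motec.com.au genunix: [ID 451854 kern.notice] NOTICE: hermon0: port 2 up",
    "Oct 26 07:30:12 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 up",
]

for dev, port, start, end, secs in port_outages(SAMPLE):
    print(f"{dev} port {port}: down {start:%H:%M:%S}, up {end:%H:%M:%S} ({secs:.0f}s)")
```

Feeding it the lines of /var/adm/messages instead of SAMPLE should list every flap, which may help when comparing against the SM or switch logs.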

Isn't the hermon0 driver for the Mellanox cards?

Kind regards,
Tom


From danmcd at omniti.com  Thu Dec 10 22:12:55 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 10 Dec 2015 17:12:55 -0500
Subject: [OmniOS-discuss] Ang: Re:  LUN (in)visibility
In-Reply-To: <5669F866.2030204@motec.com.au>
References: <5668BC29.8090803@motec.com.au> <5668AA17.6050105@motec.com.au>
	<9F96534F-D20A-4AF8-BCF2-84D62402635B@omniti.com>
	<OF7E508FC2.9A496DF6-ONC1257F17.002AB2B1-C1257F17.002B87E5@inse.com>
	<5669F866.2030204@motec.com.au>
Message-ID: <3653512F-6EB5-4D25-90BD-BB47ECE5C28F@omniti.com>


> On Dec 10, 2015, at 5:10 PM, Tom Robinson <tom.robinson at motec.com.au> wrote:
> 
> Isn't the hermon0 driver for the Mellanox cards?

Yes it is!

NAME
       hermon - ConnectX MT25408/MT25418/MT25428 InfiniBand (IB) Driver

DESCRIPTION
       The hermon driver is an IB Architecture-compliant implementation of an
       HCA, which operates on the Mellanox MT25408,  MT25418 and MT25428
       InfiniBand ASSPs using host memory for context storage  rather than
       locally  attached memory on the card. Cards based  on these ASSP's
       utilize the PCI-Express I/O bus. These  ASSP's  support  the  link and
       physical layers of the InfiniBand specification while  the ASSP and the
       driver support the transport layer.


Dan


From johan.kragsterman at capvert.se  Fri Dec 11 07:45:37 2015
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Fri, 11 Dec 2015 08:45:37 +0100
Subject: [OmniOS-discuss] Ang: Re: Ang: Re:  LUN (in)visibility
In-Reply-To: <5669F866.2030204@motec.com.au>
References: <5669F866.2030204@motec.com.au>,
	<5668BC29.8090803@motec.com.au> <5668AA17.6050105@motec.com.au>
	<9F96534F-D20A-4AF8-BCF2-84D62402635B@omniti.com>
	<OF7E508FC2.9A496DF6-ONC1257F17.002AB2B1-C1257F17.002B87E5@inse.com>
Message-ID: <OF9793D64B.36D9E2DB-ONC1257F18.002A71C8-C1257F18.002AA0FB@inse.com>


Hi!


-----Tom Robinson <tom.robinson at motec.com.au> wrote: -----
To: Johan Kragsterman <johan.kragsterman at capvert.se>
From: Tom Robinson <tom.robinson at motec.com.au>
Date: 2015-12-10 23:10
Cc: Dan McDonald <danmcd at omniti.com>, omnios-discuss <omnios-discuss at lists.omniti.com>
Subject: Re: Ang: Re: [OmniOS-discuss] LUN (in)visibility

On 10/12/15 18:55, Johan Kragsterman wrote:
> You say "infiniband". Do you mean SRP? Where do you have your subnet manager? In the IB switch? If so, did you check the switch SM logs?
>
> I suppose you checked the data links? dladm show-link? What exactly did you check?
>
> How about multipath? How many paths did/do you have to each LUN? I know there was a discussion about too many paths to a LUN earlier on this list. That was fibre channel, though.
>
> I can't really comment on iSCSI since I never use it...

Hi Johan,

Yes, SRP. As I said, we had everything working fine before which means we also have a subnet
manager. The SM actually runs on its own little box.

----------      -----------      -----
| storage|======|IB Switch|======|KVM|
----------      -----------      -----
                  |     |
                ----   ------
                |SM|   |ESXi|
                ----   ------

Normally there are only three paths; one iSCSI and two SRP.

I spent a lot of time hunting around on the KVM system looking for clues, as at that time I didn't
see any issues elsewhere in the setup.

On OmniOS, in /var/adm/messages I had this:

Oct 26 07:28:43 monza.motec.com.au genunix: [ID 408789 kern.warning] WARNING: hermon0: fault
detected external to device; service unavailable
Oct 26 07:28:43 monza.motec.com.au genunix: [ID 451854 kern.warning] WARNING: hermon0: port 2 down
Oct 26 07:28:47 monza.motec.com.au genunix: [ID 408822 kern.info] NOTICE: hermon0: fault detected
external to device; service still unavailable
Oct 26 07:28:47 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 down
Oct 26 07:30:12 monza.motec.com.au genunix: [ID 408789 kern.notice] NOTICE: hermon0: fault cleared
external to device; service available
Oct 26 07:30:12 monza.motec.com.au genunix: [ID 451854 kern.notice] NOTICE: hermon0: port 2 up
Oct 26 07:30:12 monza.motec.com.au genunix: [ID 408822 kern.info] NOTICE: hermon0: no fault external
to device; service available
Oct 26 07:30:12 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 up
Oct 26 07:31:31 monza.motec.com.au genunix: [ID 408789 kern.warning] WARNING: hermon0: fault
detected external to device; service unavailable
Oct 26 07:31:31 monza.motec.com.au genunix: [ID 451854 kern.warning] WARNING: hermon0: port 2 down
Oct 26 07:31:38 monza.motec.com.au genunix: [ID 408822 kern.info] NOTICE: hermon0: fault detected
external to device; service still unavailable
Oct 26 07:31:38 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 down
Oct 26 07:32:12 monza.motec.com.au genunix: [ID 408789 kern.notice] NOTICE: hermon0: fault cleared
external to device; service available
Oct 26 07:32:12 monza.motec.com.au genunix: [ID 451854 kern.notice] NOTICE: hermon0: port 2 up
Oct 26 07:32:12 monza.motec.com.au genunix: [ID 408822 kern.info] NOTICE: hermon0: no fault external
to device; service available
Oct 26 07:32:12 monza.motec.com.au genunix: [ID 611667 kern.info] NOTICE: hermon0: port 1 up

Isn't the hermon0 driver for the Mellanox cards?

Kind regards,
Tom




Yeah, that's the driver, and this seems to me like a data link problem. A data link problem could have one or more of many causes; as I suggested before, check the SM, if you have any logs there.
The message "fault detected external to device" is of course the key here, but I can't decipher it, unfortunately...

Do you run the iSCSI service over the same IB infrastructure?


Rgrds Johan





[attachment "signature.asc" removed by Johan Kragsterman/Capvert]


From danmcd at omniti.com  Fri Dec 11 15:48:15 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Fri, 11 Dec 2015 10:48:15 -0500
Subject: [OmniOS-discuss] Bloody update for December 11th
Message-ID: <EF95F9F8-855F-45AE-B288-0052FDCEC73E@omniti.com>

This will be the last bloody update for 2015.  The new install media (ISO, USB-DD, or kayak .zfs.bz2) can be obtained via here:

	http://omnios.omniti.com/wiki.php/Installation

illumos-omnios is built from revision 2e8c0ba.  omnios-build is built from b7ab647, but is a faster moving target.

New in this bloody:

- PCRE to 8.38 (the one already patched in 014 & 016).

- OpenSSH now works properly with the illumos audit system.  This will be backported to 014 & 016 soon.

- OpenSSL to 1.0.2e (already patched in 014 & 016).

- Package variant support for DEBUG kernels now built-in (more on this below).

- ZFS receive now works with replication streams with intermediate snapshots that exceed refquota (a requested fix)

- SMB2 support (thanks Nexenta)


The "package variant" support (thanks to Jeff Sipek for the inspiration) allows a user of OmniOS to create a distinct boot environment with same-time-compiled bits, but with DEBUG enabled.  To create such a BE, you start by upgrading to these bits, and then you can use the "pkg change-variant" command.  The change-variant subcommand works a lot like install.  Here's an example:

	Last login: Fri Dec 11 10:30:56 2015
	OmniOS 5.11     omnios-2e8c0ba  December 2015
	# pkg variant -a
	VARIANT                                                                VALUE
	arch                                                                   i386
	debug.illumos                                                          false
	opensolaris.zone                                                       global
	# pkg change-variant -n debug.illumos=true
	            Packages to change: 188
	     Variants/Facets to change:   1
	       Create boot environment: Yes
	Create backup boot environment:  No

	Planning linked: 0/1 done; 1 working: zone:tz2
	Linked image 'zone:tz2' output:
	| No updates necessary for this image. (zone:tz2)
	`
	Planning linked: 1/1 done
	# 

If I wasn't using -n:  boom, instant DEBUG kernel in a new BE.  This will be handy for people who encounter problems.  We can request you create a DEBUG BE and reproduce your problem without having to ONU or do any other weirdness.
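For anyone who wants to check from a script whether a given BE is running DEBUG bits, the `pkg variant -a` output shown above is easy to parse. The following is just a sketch: `parse_variants` is a made-up helper name, and it works on pasted text rather than calling pkg itself.

```python
def parse_variants(output):
    """Parse `pkg variant -a` style output into a {variant: value} dict."""
    variants = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything not "name value".
        if len(fields) != 2 or fields[0] == "VARIANT":
            continue
        variants[fields[0]] = fields[1]
    return variants

# Sample text copied from the example session above.
sample = """\
VARIANT                                                                VALUE
arch                                                                   i386
debug.illumos                                                          false
opensolaris.zone                                                       global
"""

variants = parse_variants(sample)
print("DEBUG bits active:", variants.get("debug.illumos") == "true")
```

On a live system the text would presumably come from running `pkg variant -a` and capturing its output; that part is left out here so the sketch runs anywhere.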

As I said earlier, this is the last bloody update for 2015.  Have a happy holiday season (whatever you celebrate, or not) and a happy new year!

Dan


From wonko at 4amlunch.net  Fri Dec 11 16:33:35 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Fri, 11 Dec 2015 11:33:35 -0500
Subject: [OmniOS-discuss] OpenSM for OmniOS
Message-ID: <0F467FE2-9E8C-40D8-90AC-7B62638072F9@4amlunch.net>

I've found that supposedly this works. I just need to get a copy and build it.

Does anyone know where I would get a copy?  I cannot find it for the life of me!

Thanks,

-brian

From johan.kragsterman at capvert.se  Fri Dec 11 16:49:47 2015
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Fri, 11 Dec 2015 17:49:47 +0100
Subject: [OmniOS-discuss] Ang:  OpenSM for OmniOS
In-Reply-To: <0F467FE2-9E8C-40D8-90AC-7B62638072F9@4amlunch.net>
References: <0F467FE2-9E8C-40D8-90AC-7B62638072F9@4amlunch.net>
Message-ID: <OF92A45255.50390FE5-ONC1257F18.005C5FCA-C1257F18.005C730E@inse.com>

Hi!


-----"OmniOS-discuss" <omnios-discuss-bounces at lists.omniti.com> wrote: -----
To: omnios-discuss <omnios-discuss at lists.omniti.com>
From: Brian Hechinger
Sent by: "OmniOS-discuss"
Date: 2015-12-11 17:35
Subject: [OmniOS-discuss] OpenSM for OmniOS

I've found that supposedly this works. I just need to get a copy and build it.

Does anyone know where I would get a copy?  I cannot find it for the life of me!

Thanks,

-brian




That software is pretty old, and I never tested it myself, but heard about people successfully running it. Don't know about production, though...

I can give you some links:

https://syoyo.wordpress.com/category/infiniband/

https://github.com/syoyo/solaris-infiniband-tools


Rgrds Johan




From wonko at 4amlunch.net  Fri Dec 11 16:51:40 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Fri, 11 Dec 2015 11:51:40 -0500
Subject: [OmniOS-discuss] Ang:  OpenSM for OmniOS
In-Reply-To: <OF92A45255.50390FE5-ONC1257F18.005C5FCA-C1257F18.005C730E@inse.com>
References: <0F467FE2-9E8C-40D8-90AC-7B62638072F9@4amlunch.net>
	<OF92A45255.50390FE5-ONC1257F18.005C5FCA-C1257F18.005C730E@inse.com>
Message-ID: <4B0A0C6D-D48D-4DF2-8D74-9DB3EDAA8785@4amlunch.net>

Yeah, I've found that. The problem is I can't find the source for this or the patch.

Eric pointed me at http://code.openhub.net/project?pid=&ipid=303919&fp=303919&mp&projSelected=true&filterChecked

I'm trying to get that to clone (it fails, sigh). It's at least newer and maybe (hopefully) doesn't need to be patched for Solaris/Illumos?

-brian

> On Dec 11, 2015, at 11:49 AM, Johan Kragsterman <johan.kragsterman at capvert.se> wrote:
> 
> Hi!
> 
> 
> -----"OmniOS-discuss" <omnios-discuss-bounces at lists.omniti.com> wrote: -----
> To: omnios-discuss <omnios-discuss at lists.omniti.com>
> From: Brian Hechinger
> Sent by: "OmniOS-discuss"
> Date: 2015-12-11 17:35
> Subject: [OmniOS-discuss] OpenSM for OmniOS
> 
> I've found that supposedly this works. I just need to get a copy and build it.
> 
> Does anyone know where I would get a copy?  I cannot find it for the life of me!
> 
> Thanks,
> 
> -brian
> 
> 
> 
> 
> That software is pretty old, and I never tested it myself, but heard about people successfully running it. Don't know about production, though...
> 
> I can give you some links:
> 
> https://syoyo.wordpress.com/category/infiniband/
> 
> https://github.com/syoyo/solaris-infiniband-tools
> 
> 
> Rgrds Johan
> 
> 
> 


From johan.kragsterman at capvert.se  Fri Dec 11 17:02:24 2015
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Fri, 11 Dec 2015 18:02:24 +0100
Subject: [OmniOS-discuss] Ang: Re: Ang:  OpenSM for OmniOS
In-Reply-To: <4B0A0C6D-D48D-4DF2-8D74-9DB3EDAA8785@4amlunch.net>
References: <4B0A0C6D-D48D-4DF2-8D74-9DB3EDAA8785@4amlunch.net>,
	<0F467FE2-9E8C-40D8-90AC-7B62638072F9@4amlunch.net>
	<OF92A45255.50390FE5-ONC1257F18.005C5FCA-C1257F18.005C730E@inse.com>
Message-ID: <OF70CB2AD0.30947838-ONC1257F18.005D9AB0-C1257F18.005D9AB2@inse.com>

Hi!



-----Brian Hechinger <wonko at 4amlunch.net> wrote: -----
To: Johan Kragsterman <johan.kragsterman at capvert.se>
From: Brian Hechinger <wonko at 4amlunch.net>
Date: 2015-12-11 17:51
Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
Subject: Re: Ang: [OmniOS-discuss] OpenSM for OmniOS

Yeah, I've found that. The problem is I can't find the source for this or the patch.

Eric pointed me at http://code.openhub.net/project?pid=&ipid=303919&fp=303919&mp&projSelected=true&filterChecked

I'm trying to get that to clone (it fails, sigh). It's at least newer and maybe (hopefully) doesn't need to be patched for Solaris/Illumos?

-brian




Did you try it from here:


http://git.openfabrics.org/~alexnetes/opensm.git/


Rgrds Johan


From wonko at 4amlunch.net  Fri Dec 11 17:09:32 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Fri, 11 Dec 2015 12:09:32 -0500
Subject: [OmniOS-discuss] Ang: Re: Ang:  OpenSM for OmniOS
In-Reply-To: <OF70CB2AD0.30947838-ONC1257F18.005D9AB0-C1257F18.005D9AB2@inse.com>
References: <4B0A0C6D-D48D-4DF2-8D74-9DB3EDAA8785@4amlunch.net>
	<0F467FE2-9E8C-40D8-90AC-7B62638072F9@4amlunch.net>
	<OF92A45255.50390FE5-ONC1257F18.005C5FCA-C1257F18.005C730E@inse.com>
	<OF70CB2AD0.30947838-ONC1257F18.005D9AB0-C1257F18.005D9AB2@inse.com>
Message-ID: <03F2E265-758A-4512-91F5-B082C2FE67DD@4amlunch.net>

That fails to clone, but I did manage to eventually find it on that site.

Got 3.3.19

It needs libibumad so I got that, but that explodes horribly when I try to build it. :(

Stuff like this:

./include/infiniband/umad.h:84:62: error: expected expression before 'uint32_t'
 #define IB_USER_MAD_UNREGISTER_AGENT _IOW(IB_IOCTL_MAGIC, 2, uint32_t)
                                                              ^
src/umad.c:979:19: note: in expansion of macro 'IB_USER_MAD_UNREGISTER_AGENT'
  return ioctl(fd, IB_USER_MAD_UNREGISTER_AGENT, &agentid);

-brian

> On Dec 11, 2015, at 12:02 PM, Johan Kragsterman <johan.kragsterman at capvert.se> wrote:
> 
> Hi!
> 
> 
> 
> -----Brian Hechinger <wonko at 4amlunch.net> wrote: -----
> To: Johan Kragsterman <johan.kragsterman at capvert.se>
> From: Brian Hechinger <wonko at 4amlunch.net>
> Date: 2015-12-11 17:51
> Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
> Subject: Re: Ang: [OmniOS-discuss] OpenSM for OmniOS
> 
> Yeah, I've found that. The problem is I can't find the source for this or the patch.
> 
> Eric pointed me at http://code.openhub.net/project?pid=&ipid=303919&fp=303919&mp&projSelected=true&filterChecked
> 
> I'm trying to get that to clone (it fails, sigh). It's at least newer and maybe (hopefully) doesn't need to be patched for Solaris/Illumos?
> 
> -brian
> 
> 
> 
> 
> Did you try it from here:
> 
> 
> http://git.openfabrics.org/~alexnetes/opensm.git/
> 
> 
> Rgrds Johan
> 


From alka at hfg-gmuend.de  Fri Dec 11 17:30:04 2015
From: alka at hfg-gmuend.de (=?UTF-8?Q?G=c3=bcnther_Alka?=)
Date: Fri, 11 Dec 2015 18:30:04 +0100
Subject: [OmniOS-discuss] Bloody update for December 11th
In-Reply-To: <EF95F9F8-855F-45AE-B288-0052FDCEC73E@omniti.com>
References: <EF95F9F8-855F-45AE-B288-0052FDCEC73E@omniti.com>
Message-ID: <566B081C.8@hfg-gmuend.de>

Many thanks to Nexenta
and to OmniTI for this December bloody with SMB2.

I have just done some tests on OSX under Solaris 11.3 to check some
configuration options for a ZFS video editing storage server for my Mac Pros.

There are two must-have principles: SMB2 and jumbo frames,
see http://napp-it.org/doc/downloads/performance_smb2.pdf


Gea



On 11.12.2015 16:48, Dan McDonald wrote:
> This will be the last bloody update for 2015.  The new install media (ISO, USB-DD, or kayak .zfs.bz2) can be obtained via here:
>
> 	http://omnios.omniti.com/wiki.php/Installation
>
> illumos-omnios is built from revision 2e8c0ba.  omnios-build is built from b7ab647, but is a faster moving target.
>
> New in this bloody:
>
> - PCRE to 8.38 (the one already patched in 014 & 016).
>
> - OpenSSH now works properly with the illumos audit system.  This will be backported to 014 & 016 soon.
>
> - OpenSSL to 1.0.2e (already patched in 014 & 016).
>
> - Package variant support for DEBUG kernels now built-in (more on this below).
>
> - ZFS receive now works with replication streams with intermediate snapshots that exceed refquota (a requested fix)
>
> - SMB2 support (thanks Nexenta)
>
>
> The "package variant" support (thanks to Jeff Sipek for the inspiration) allows a user of OmniOS to create a distinct boot environment with same-time-compiled bits, but with DEBUG enabled.  To create such a BE, you start by upgrading to these bits, and then you can use the "pkg change-variant" command.  The change-variant subcommand works a lot like install.  Here's an example:
>
> 	Last login: Fri Dec 11 10:30:56 2015
> 	OmniOS 5.11     omnios-2e8c0ba  December 2015
> 	# pkg variant -a
> 	VARIANT                                                                VALUE
> 	arch                                                                   i386
> 	debug.illumos                                                          false
> 	opensolaris.zone                                                       global
> 	# pkg change-variant -n debug.illumos=true
> 	            Packages to change: 188
> 	     Variants/Facets to change:   1
> 	       Create boot environment: Yes
> 	Create backup boot environment:  No
>
> 	Planning linked: 0/1 done; 1 working: zone:tz2
> 	Linked image 'zone:tz2' output:
> 	| No updates necessary for this image. (zone:tz2)
> 	`
> 	Planning linked: 1/1 done
> 	#
>
> If I wasn't using -n:  boom, instant DEBUG kernel in a new BE.  This will be handy for people who encounter problems.  We can request you create a DEBUG BE and reproduce your problem without having to ONU or do any other weirdness.
>
> As I said earlier, this is the last bloody update for 2015.  Have a happy holiday season (whatever you celebrate, or not) and a happy new year!
>
> Dan
>


-- 

H          f   G
Hochschule für Gestaltung
university of design

Schwäbisch Gmünd
Rektor Klaus Str. 100
73525 Schwäbisch Gmünd

Guenther Alka, Dipl.-Ing. (FH)
Leiter des Rechenzentrums
head of computer center

Tel 07171 602 624
Fax 07171 69259
guenther.alka at hfg-gmuend.de
http://rz.hfg-gmuend.de


From lists at marzocchi.net  Fri Dec 11 19:01:56 2015
From: lists at marzocchi.net (Olaf Marzocchi)
Date: Fri, 11 Dec 2015 22:31:56 +0330
Subject: [OmniOS-discuss] Bloody update for December 11th
In-Reply-To: <EF95F9F8-855F-45AE-B288-0052FDCEC73E@omniti.com>
References: <EF95F9F8-855F-45AE-B288-0052FDCEC73E@omniti.com>
Message-ID: <DB564D01-D035-4DF3-A72C-10E5237E63A5@marzocchi.net>

Any plan to backport it to 014? Since 014 is LTS, lack of backport means no SMB2 for a long time for all the LTS users.

Olaf



On 11 December 2015 19:18:15 GMT+03:30, Dan McDonald <danmcd at omniti.com> wrote:

>
>- SMB2 support (thanks Nexenta)
>


From danmcd at omniti.com  Fri Dec 11 19:03:38 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Fri, 11 Dec 2015 14:03:38 -0500
Subject: [OmniOS-discuss] Bloody update for December 11th
In-Reply-To: <DB564D01-D035-4DF3-A72C-10E5237E63A5@marzocchi.net>
References: <EF95F9F8-855F-45AE-B288-0052FDCEC73E@omniti.com>
	<DB564D01-D035-4DF3-A72C-10E5237E63A5@marzocchi.net>
Message-ID: <12128792-E69E-43B8-9477-354E2CE9F6AB@omniti.com>


> On Dec 11, 2015, at 2:01 PM, Olaf Marzocchi <lists at marzocchi.net> wrote:
> 
> Any plan to backport it to 014? Since 014 is LTS, lack of backport means no SMB2 for a long time for all the LTS users.

I deliberately left it out of 016 because of its size & complexity (it came in the day after 016 did its last upstream merge).  NO WAY will something this big go back into LTS/014 without a LOT of convincing (and a suitable test cycle).

Sorry,
Dan


From johan.kragsterman at capvert.se  Fri Dec 11 19:08:15 2015
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Fri, 11 Dec 2015 20:08:15 +0100
Subject: [OmniOS-discuss] Ang: Re: Ang: Re: Ang:  OpenSM for OmniOS
In-Reply-To: <03F2E265-758A-4512-91F5-B082C2FE67DD@4amlunch.net>
References: <03F2E265-758A-4512-91F5-B082C2FE67DD@4amlunch.net>,
	<4B0A0C6D-D48D-4DF2-8D74-9DB3EDAA8785@4amlunch.net>
	<0F467FE2-9E8C-40D8-90AC-7B62638072F9@4amlunch.net>
	<OF92A45255.50390FE5-ONC1257F18.005C5FCA-C1257F18.005C730E@inse.com>
	<OF70CB2AD0.30947838-ONC1257F18.005D9AB0-C1257F18.005D9AB2@inse.com>
Message-ID: <OF5368E0FE.219E830E-ONC1257F18.00692056-C1257F18.00692058@inse.com>


Hi!

It would be nice if you kept us (the list) updated...

Rgrds Johan


-----Brian Hechinger <wonko at 4amlunch.net> wrote: -----
To: Johan Kragsterman <johan.kragsterman at capvert.se>
From: Brian Hechinger <wonko at 4amlunch.net>
Date: 2015-12-11 18:09
Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
Subject: Re: Ang: Re: Ang: [OmniOS-discuss] OpenSM for OmniOS

That fails to clone, but I did manage to eventually find it on that site.

Got 3.3.19

It needs libibumad so I got that, but that explodes horribly when I try to build it. :(

Stuff like this:

./include/infiniband/umad.h:84:62: error: expected expression before 'uint32_t'
 #define IB_USER_MAD_UNREGISTER_AGENT _IOW(IB_IOCTL_MAGIC, 2, uint32_t)
                                                              ^
src/umad.c:979:19: note: in expansion of macro 'IB_USER_MAD_UNREGISTER_AGENT'
  return ioctl(fd, IB_USER_MAD_UNREGISTER_AGENT, &agentid);

-brian

> On Dec 11, 2015, at 12:02 PM, Johan Kragsterman <johan.kragsterman at capvert.se> wrote:
> 
> Hi!
> 
> 
> 
> -----Brian Hechinger <wonko at 4amlunch.net> wrote: -----
> To: Johan Kragsterman <johan.kragsterman at capvert.se>
> From: Brian Hechinger <wonko at 4amlunch.net>
> Date: 2015-12-11 17:51
> Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
> Subject: Re: Ang: [OmniOS-discuss] OpenSM for OmniOS
> 
> Yeah, I've found that. The problem is I can't find the source for this or the patch.
> 
> Eric pointed me at http://code.openhub.net/project?pid=&ipid=303919&fp=303919&mp&projSelected=true&filterChecked
> 
> I'm trying to get that to clone (it fails, sigh). It's at least newer and maybe (hopefully) doesn't need to be patched for Solaris/Illumos?
> 
> -brian
> 
> 
> 
> 
> Did you try it from here:
> 
> 
> http://git.openfabrics.org/~alexnetes/opensm.git/
> 
> 
> Rgrds Johan
> 





From philip.yuengling at circonus.com  Fri Dec 11 20:25:28 2015
From: philip.yuengling at circonus.com (Philip Yuengling)
Date: Fri, 11 Dec 2015 15:25:28 -0500
Subject: [OmniOS-discuss] SSH versions on global and non-global zones
Message-ID: <CABOv2v6828yRnb-h8CiAabFaQ5+dH0fC27OEJKtCnV0GdTfh-w@mail.gmail.com>

It seems that when installing LTS 151014 from the kayak image the global
gets OpenSSH_7.1p1, but non-global zones get Sun_SSH_1.5.

Obviously some work can be done to get them to match, but it may be good to
have them match from the start?  Or am I missing something.

From danmcd at omniti.com  Fri Dec 11 20:38:30 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Fri, 11 Dec 2015 15:38:30 -0500
Subject: [OmniOS-discuss] SSH versions on global and non-global zones
In-Reply-To: <CABOv2v6828yRnb-h8CiAabFaQ5+dH0fC27OEJKtCnV0GdTfh-w@mail.gmail.com>
References: <CABOv2v6828yRnb-h8CiAabFaQ5+dH0fC27OEJKtCnV0GdTfh-w@mail.gmail.com>
Message-ID: <1380CE53-3591-40B9-B0B5-8DEA92AA9713@omniti.com>


> On Dec 11, 2015, at 3:25 PM, Philip Yuengling <philip.yuengling at circonus.com> wrote:
> 
> It seems that when installing LTS 151014 from the kayak image the global gets OpenSSH_7.1p1, but non-global zones get Sun_SSH_1.5.
> 
> Obviously some work can be done to get them to match, but it may be good to have them match from the start?  Or am I missing something.

Huh... I had NO idea it would do that.  I assumed (probably incorrectly) that the NGZs would get "entire" just like the global one would.


Ahhh, I see the problem:

	https://github.com/omniti-labs/pkg5/blob/omnios/src/brand/pkgcreatezone#L545

"entire" populates the global zone.  Whatever is in pkgcreatezone works for ipkg & lipkg zones.

"entire" can support both, and due to IPS's rules (higher version number wins), OpenSSH7.1 beats SunSSH0.151xxx.
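The "higher version number wins" rule can be illustrated with a plain dotted-version comparison. This is a deliberately simplified sketch, not how pkg(5) is implemented: real IPS versions carry more components (build release, branch, timestamp), and the version strings below are placeholders standing in for the shorthand above, not actual package versions.

```python
def version_key(version):
    """Turn a dotted version string such as "7.1" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))

def newest(candidates):
    """Pick the (name, version) pair with the highest dotted version."""
    return max(candidates, key=lambda item: version_key(item[1]))

# Placeholder versions for illustration only.
candidates = [("SunSSH", "0.151"), ("OpenSSH", "7.1")]
print(newest(candidates))
```

Comparing tuples of integers rather than strings matters here: as integers 1.10 sorts after 1.9, whereas a string comparison would get that wrong.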

Not sure if patching pkgcreatezone is the best option OR if we should inherit-from-global more intelligently in the pkgcreatezone script.

Thanks for finding this!
Dan


From eric.sproul at circonus.com  Fri Dec 11 20:51:01 2015
From: eric.sproul at circonus.com (Eric Sproul)
Date: Fri, 11 Dec 2015 15:51:01 -0500
Subject: [OmniOS-discuss] SSH versions on global and non-global zones
In-Reply-To: <1380CE53-3591-40B9-B0B5-8DEA92AA9713@omniti.com>
References: <CABOv2v6828yRnb-h8CiAabFaQ5+dH0fC27OEJKtCnV0GdTfh-w@mail.gmail.com>
	<1380CE53-3591-40B9-B0B5-8DEA92AA9713@omniti.com>
Message-ID: <CAO8hXRCLmVhcAO3a=Wz7gX2jf96nNpGfBubJMH4E7dguDgCYOQ@mail.gmail.com>

On Fri, Dec 11, 2015 at 3:38 PM, Dan McDonald <danmcd at omniti.com> wrote:
> Huh... I had NO idea it would do that.  I assumed (probably incorrectly) that the NGZs would get "entire" just like the global one would.
>
>
> Ahhh, I see the problem:
>
>         https://github.com/omniti-labs/pkg5/blob/omnios/src/brand/pkgcreatezone#L545
>
> "entire" populates the global zone.  Whatever is in pkgcreatezone works for ipkg & lipkg zones.
>
> "entire" can support both, and due to IPS's rules (higher version number wins), OpenSSH7.1 beats SunSSH0.151xxx.
>
> Not sure if patching pkgcreatezone is the best option OR if we should inherit-from-global more intelligently in the pkgcreatezone script.

This is fallout from our abuse of entire.  The ipkg brand scripts
assume entire is just an incorporation, so they explicitly install a
bunch of basic packages (including ssh).  I'd like to see us try to
undo that early mistake (for which I'm partly to blame!) and get back
to having "slim_install" fill the role that we forced "entire" into
back at the beginning, when we didn't fully understand what
distro_const was actually doing.

We (Circonus) can work around this for now, and make it part of our
'014 zone bootstrap process to switch out ssh daemons.

Eric

From bfriesen at simple.dallas.tx.us  Sat Dec 12 21:07:42 2015
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Sat, 12 Dec 2015 15:07:42 -0600 (CST)
Subject: [OmniOS-discuss] Bloody update for December 11th
In-Reply-To: <566B081C.8@hfg-gmuend.de>
References: <EF95F9F8-855F-45AE-B288-0052FDCEC73E@omniti.com>
	<566B081C.8@hfg-gmuend.de>
Message-ID: <alpine.GSO.2.01.1512121504530.1673@freddy.simplesystems.org>

On Fri, 11 Dec 2015, Günther Alka wrote:

> Many Thanks to Nexenta
> and to OmniTi for this december bloody with SMB 2
>
> I have just done some tests on OSX under Solaris 11.3 to check some 
> configuration
> options for a ZFS video editing storage server for my Mac Pros.

Do you plan to add tests with the implementation in OmniOS bloody? 
The Nexenta implementation might be quite a lot different than the 
Oracle Solaris one.  Perhaps it might even fail with your tests.

> There are two must have principles: SMB2 and Jumboframes
> see http://napp-it.org/doc/downloads/performance_smb2.pdf

I was surprised to see the huge improvement with jumbo frames.
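A quick back-of-the-envelope calculation shows why the size of the gain is surprising if one only looks at header overhead. The sketch below assumes plain Ethernet framing with IPv4 and TCP headers without options; the constants and function names are illustrative and none of it comes from the PDF.

```python
# Per-frame overhead on the wire (bytes): preamble + SFD (8),
# Ethernet header (14), FCS (4), inter-frame gap (12).
ETH_OVERHEAD = 8 + 14 + 4 + 12
# IPv4 header + TCP header, both without options (assumption).
IP_TCP_HEADERS = 20 + 20

def payload_efficiency(mtu):
    """Fraction of wire bytes that carry TCP payload for a full-size frame."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETH_OVERHEAD)

def frames_per_gigabyte(mtu):
    """Number of full-size frames needed to move 10**9 payload bytes."""
    payload = mtu - IP_TCP_HEADERS
    return -(-10**9 // payload)  # ceiling division

for mtu in (1500, 9000):
    print(f"MTU {mtu}: efficiency {payload_efficiency(mtu):.3f}, "
          f"frames per GB {frames_per_gigabyte(mtu)}")
```

Under these assumptions the wire efficiency improves by only a few percentage points, while the number of frames per gigabyte drops by roughly a factor of six. That suggests any large gain would come from lower per-packet processing cost rather than from saved header bytes, though the numbers here cannot confirm that by themselves.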

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From alka at hfg-gmuend.de  Sun Dec 13 08:38:43 2015
From: alka at hfg-gmuend.de (Guenther Alka)
Date: Sun, 13 Dec 2015 09:38:43 +0100
Subject: [OmniOS-discuss] Bloody update for December 11th
In-Reply-To: <alpine.GSO.2.01.1512121504530.1673@freddy.simplesystems.org>
References: <EF95F9F8-855F-45AE-B288-0052FDCEC73E@omniti.com>
	<566B081C.8@hfg-gmuend.de>
	<alpine.GSO.2.01.1512121504530.1673@freddy.simplesystems.org>
Message-ID: <566D2E93.3020505@hfg-gmuend.de>

OmniOS is my preferred platform.
I will add SMB2 results with the same config to the PDF this week.


On 12.12.2015 at 22:07, Bob Friesenhahn wrote:
> On Fri, 11 Dec 2015, Günther Alka wrote:
>
>> Many Thanks to Nexenta
>> and to OmniTi for this december bloody with SMB 2
>>
>> I have just done some tests on OSX under Solaris 11.3 to check some 
>> configuration
>> options for a ZFS video editing storage server for my Mac Pros.
>
> Do you plan to add tests with the implementation in OmniOS bloody? The 
> Nexenta implementation might be quite a lot different than the Oracle 
> Solaris one.  Perhaps it might even fail with your tests.
>
>> There are two must have principles: SMB2 and Jumboframes
>> see http://napp-it.org/doc/downloads/performance_smb2.pdf
>
> I was surprised to see the huge improvement with jumbo frames.
>
> Bob


From vab at bb-c.de  Sun Dec 13 12:42:57 2015
From: vab at bb-c.de (Volker A. Brandt)
Date: Sun, 13 Dec 2015 13:42:57 +0100
Subject: [OmniOS-discuss] Bloody update for December 11th
In-Reply-To: <EF95F9F8-855F-45AE-B288-0052FDCEC73E@omniti.com>
References: <EF95F9F8-855F-45AE-B288-0052FDCEC73E@omniti.com>
Message-ID: <22125.26577.758055.199016@glaurung.bb-c.de>

Hi Dan!


Thanks for your good work on OmniOS!

> This will be the last bloody update for 2015.  The new install media
> (ISO, USB-DD, or kayak .zfs.bz2) can be obtained via here:
>
> 	http://omnios.omniti.com/wiki.php/Installation

Note that the link for the ZFS installation root on this page still
points to the Nov 09 version.  I used wget to retrieve the Dec 11
version, so the file itself is there, just the link is wrong.


Regards -- Volker
-- 
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From danmcd at omniti.com  Mon Dec 14 15:18:41 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 14 Dec 2015 10:18:41 -0500
Subject: [OmniOS-discuss] Bloody update for December 11th
In-Reply-To: <22125.26577.758055.199016@glaurung.bb-c.de>
References: <EF95F9F8-855F-45AE-B288-0052FDCEC73E@omniti.com>
	<22125.26577.758055.199016@glaurung.bb-c.de>
Message-ID: <0ECB9C46-3301-437E-B223-448AC083015C@omniti.com>


> On Dec 13, 2015, at 7:42 AM, Volker A. Brandt <vab at bb-c.de> wrote:
> 
> Note that the link for the ZFS installation root on this page still
> points to the Nov 09 version.  I used wget to retrieve the Dec 11
> version, so the file itself is there, just the link is wrong.

Fixed:

http://omnios.omniti.com/changeset.php/default/wiki/810badbcdcd1a0d0430008a79be2f755b5dd99e8

Thanks!
Dan


From stephan.budach at JVM.DE  Mon Dec 14 15:25:18 2015
From: stephan.budach at JVM.DE (Stephan Budach)
Date: Mon, 14 Dec 2015 16:25:18 +0100
Subject: [OmniOS-discuss] How to configure FCoE target in OmniOS?
Message-ID: <566EDF5E.3000004@jvm.de>

Hi guys,

I am trying to configure an FCoE target in OmniOS r016, but I can't seem 
to get it right. I started out with the documentation for Solaris 11, 
which seemed appropriate, configured an FC target, and added a view 
granting the FCoE port access to that LUN, but the port doesn't seem to 
log in to my Nexus 5500 at all.

This is what I got so far:

root@nfsvmpool03:/root# pkg list | grep fcoe
driver/network/fcoe 0.5.11-0.151016            i--
driver/network/fcoet 0.5.11-0.151016            i--
system/library/libfcoe 0.5.11-0.151016            i--

svcs -a | grep fcoe
disabled       15:35:16 svc:/system/fcoe_initiator:default
online         15:35:39 svc:/system/fcoe_target:default

fcadm hba-port
HBA Port WWN: 2000a0369f590a20
         Port Mode: Target
         Port ID: 0
         OS Device Name: Not Applicable
         Manufacturer: Sun Microsystems, Inc.
         Model: FCoE Virtual FC HBA
         Firmware Version: N/A
         FCode/BIOS Version: N/A
         Serial Number: N/A
         Driver Name: COMSTAR FCoET
         Driver Version: v20091123-1.02
         Type: unknown
         State: offline
         Supported Speeds: 1Gb 10Gb
         Current Speed: not established
         Node WWN: 1000a0369f590a20

stmfadm list-target -v wwn.2000A0369F590A20
Target: wwn.2000A0369F590A20
     Operational Status: Online
     Provider Name     : fcoet
     Alias             : fcoet0
     Protocol          : Fibre Channel
     Sessions          : 0

stmfadm list-view -l 600144F07A34AC66000054D1DEB50001
View Entry: 0
     Host group   : ovmHosts
     Target group : fcoeNFSVMPOOL03
     LUN          : 0

stmfadm list-tg -v fcoeNFSVMPOOL03
Target Group: fcoeNFSVMPOOL03
         Member: wwn.2000A0369F590A20
         Member: wwn.1000A0369F590A20

stmfadm list-hg -v ovmHosts
Host Group: ovmHosts
         Member: wwn.2000A0369F1A171D

I was able to successfully connect my Linux FCoE initiator to the 
fabric, but not the OmniOS target. Is there anything obviously wrong 
with my config?

Thanks,
Stephan

From danmcd at omniti.com  Mon Dec 14 15:35:16 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 14 Dec 2015 10:35:16 -0500
Subject: [OmniOS-discuss] How to configure FCoE target in OmniOS?
In-Reply-To: <566EDF5E.3000004@jvm.de>
References: <566EDF5E.3000004@jvm.de>
Message-ID: <F6497D11-29A6-4F0E-ABDE-787512ABCC3B@omniti.com>


> On Dec 14, 2015, at 10:25 AM, Stephan Budach <stephan.budach at JVM.DE> wrote:
> 
> Hi guys,
> 
> I am trying to configure an FCoE target in OmniOS r016, but I can't seem to get it right. I started out with the documentation for Solaris 11, which seemed appropriate, configured an FC target, and added a view granting the FCoE port access to that LUN, but the port doesn't seem to log in to my Nexus 5500 at all.
> 
> This is what I got so far:
> 
> I was able to successfully connect my Linux FCoE initiator to the fabric, but not the OmniOS target. Is there anything obviously wrong with my config?

Do you have any LUs configured?

	stmfadm list-lu

You may want to make sure you have at least one.  Create a ZFS volume and then add it:

	zfs create -V <size> poolname/volname

	stmfadm create-lu /dev/zvol/dsk/poolname/volname
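
The two steps can be sketched as a small script. This is only an illustration: the pool name, volume name and the 100G size are placeholders (zfs create -V expects a size before the dataset name), and the run and make_lu helpers are made-up names. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: pool, volume name and size are placeholders.
POOL=${POOL:-poolname}
VOL=${VOL:-volname}
SIZE=${SIZE:-100G}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

make_lu() {
    # Create the ZFS volume (zvol) that will back the logical unit.
    run zfs create -V "$SIZE" "$POOL/$VOL"
    # Register the zvol with COMSTAR as a logical unit.
    run stmfadm create-lu "/dev/zvol/dsk/$POOL/$VOL"
    # Confirm that the new LU is listed.
    run stmfadm list-lu -v
}

make_lu
```

With the defaults this prints three lines, one per command, each prefixed with "+ ".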

Dan


From stephan.budach at JVM.DE  Mon Dec 14 15:46:53 2015
From: stephan.budach at JVM.DE (Stephan Budach)
Date: Mon, 14 Dec 2015 16:46:53 +0100
Subject: [OmniOS-discuss] How to configure FCoE target in OmniOS?
In-Reply-To: <F6497D11-29A6-4F0E-ABDE-787512ABCC3B@omniti.com>
References: <566EDF5E.3000004@jvm.de>
	<F6497D11-29A6-4F0E-ABDE-787512ABCC3B@omniti.com>
Message-ID: <566EE46D.5050507@jvm.de>

Well… creating a view requires a LUN, doesn't it? But anyway, I do have 
some LUNs configured already, which I formerly presented using iSCSI. 
The one I chose for my FCoE testing is this one:

stmfadm list-lu -v
LU Name: 600144F07A34AC66000054D1DEB50001
     Operational Status: Online
     Provider Name     : sbd
     Alias             : /dev/zvol/rdsk/sasTank/nfsvmpool03sas
     View Entry Count  : 1
     Data File         : /dev/zvol/rdsk/sasTank/nfsvmpool03sas
     Meta File         : not set
     Size              : 1342177280000
     Block Size        : 512
     Management URL    : not set
     Vendor ID         : SUN
     Product ID        : COMSTAR
     Serial Num        : not set
     Write Protect     : Disabled
     Writeback Cache   : Disabled
     Access State      : Active

Thanks,
Stephan

On 14.12.15 at 16:35, Dan McDonald wrote:
>> On Dec 14, 2015, at 10:25 AM, Stephan Budach <stephan.budach at JVM.DE> wrote:
>>
>> Hi guys,
>>
>> I am trying to configure an FCoE target in OmniOS r016, but I can't seem to get it right. I started out with the documentation for Solaris 11, which seemed appropriate, configured an FC target, and added a view granting the FCoE port access to that LUN, but the port doesn't seem to log in to my Nexus 5500 at all.
>>
>> This is what I got so far:
>>
>> I was able to successfully connect my Linux FCoE initiator to the fabric, but not the OmniOS target. Is there anything obviously wrong with my config?
> Do you have any LUs configured?
>
> 	stmfadm list-lu
>
> You may want to make sure you have at least one.  Create a ZFS volume and then add it:
>
> 	zfs create -V <size> poolname/volname
>
> 	stmfadm create-lu /dev/zvol/dsk/poolname/volname
>
> Dan
>

From johan.kragsterman at capvert.se  Mon Dec 14 16:26:19 2015
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Mon, 14 Dec 2015 17:26:19 +0100
Subject: [OmniOS-discuss] Ang: Re: How to configure FCoE target in OmniOS?
In-Reply-To: <566EE46D.5050507@jvm.de>
References: <566EE46D.5050507@jvm.de>,
	<566EDF5E.3000004@jvm.de>	<F6497D11-29A6-4F0E-ABDE-787512ABCC3B@omniti.com>
Message-ID: <OFDDFD314A.F1444C01-ONC1257F1B.0058C006-C1257F1B.005A4CCC@inse.com>


Hi!

Is there a question in this mail that I missed somewhere...?

Anyway, check further down...



-----"OmniOS-discuss" <omnios-discuss-bounces at lists.omniti.com> wrote: -----
To: Dan McDonald <danmcd at omniti.com>
From: Stephan Budach 
Sent by: "OmniOS-discuss" 
Date: 2015-12-14 16:48
Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
Subject: Re: [OmniOS-discuss] How to configure FCoE target in OmniOS?

Well… creating a view requires a LUN, doesn't it? But anyway, I do have 
some LUNs configured already, which I formerly presented using iSCSI. 
The one I chose for my FCoE testing is this one:

stmfadm list-lu -v
LU Name: 600144F07A34AC66000054D1DEB50001
     Operational Status: Online
     Provider Name     : sbd
     Alias             : /dev/zvol/rdsk/sasTank/nfsvmpool03sas
     View Entry Count  : 1
     Data File         : /dev/zvol/rdsk/sasTank/nfsvmpool03sas
     Meta File         : not set
     Size              : 1342177280000
     Block Size        : 512
     Management URL    : not set
     Vendor ID         : SUN
     Product ID        : COMSTAR
     Serial Num        : not set
     Write Protect     : Disabled
     Writeback Cache   : Disabled
     Access State      : Active

Thanks,
Stephan



Have you enabled the target service: svcadm enable svc:/system/fcoe_target:default ?

As I remember it (I did this years ago...), FCoE HBAs show up when you run stmfadm list-target (-v for verbose). From there you can get the WWNN and WWPN, which you need to configure FCoE ports:

# fcadm create-fcoe-port -i -p Port_WWN -n Node_WWN Ethernet_Interface

Then, if you connect the FCoE initiator to the fabric and scan for a LUN, you will see it as a logged-in initiator when you again run:

stmfadm list-target -v

From that output you can get the WWN of the initiator.

With that info you can create the view for the LUN, so that the initiator can access it.

I hope this is the right order of steps; it was a long time ago that I did this...
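
The order of steps can be sketched roughly as below. Everything here is an assumption for illustration: the NIC name, WWNs, group names and LU GUID are placeholders, and run and configure_fcoe_target are made-up helper names. It uses -t (target mode) for fcadm create-fcoe-port, since the port in this thread is meant to be a target, whereas -i selects initiator mode. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: the NIC name, WWNs, group names and LU GUID below are
# placeholders, not values taken from a real system.
NIC=${NIC:-ixgbe0}
PORT_WWN=${PORT_WWN:-2000000000000001}
NODE_WWN=${NODE_WWN:-1000000000000001}
INITIATOR_WWN=${INITIATOR_WWN:-wwn.2000000000000002}
HOST_GROUP=${HOST_GROUP:-myHosts}
TARGET_GROUP=${TARGET_GROUP:-myTargets}
LU_GUID=${LU_GUID:-600144F0000000000000000000000001}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

configure_fcoe_target() {
    # 1. Make sure the FCoE target service is running.
    run svcadm enable svc:/system/fcoe_target:default
    # 2. Create the port in target mode (-t) on the Ethernet interface.
    run fcadm create-fcoe-port -t -p "$PORT_WWN" -n "$NODE_WWN" "$NIC"
    # 3. The new target and any logged-in initiators show up here.
    run stmfadm list-target -v
    # 4. Group the initiator and the target.
    run stmfadm create-hg "$HOST_GROUP"
    run stmfadm add-hg-member -g "$HOST_GROUP" "$INITIATOR_WWN"
    run stmfadm create-tg "$TARGET_GROUP"
    # A target has to be offline while it is added to a target group.
    run stmfadm offline-target "wwn.$PORT_WWN"
    run stmfadm add-tg-member -g "$TARGET_GROUP" "wwn.$PORT_WWN"
    run stmfadm online-target "wwn.$PORT_WWN"
    # 5. Expose the LU to that host group through that target group.
    run stmfadm add-view -h "$HOST_GROUP" -t "$TARGET_GROUP" -n 0 "$LU_GUID"
}

configure_fcoe_target
```

Reviewing the printed commands before setting DRY_RUN=0 is the intended use; the initiator WWN in step 4 would come from the list-target output in step 3.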

Regards Johan


On 14.12.15 at 16:35, Dan McDonald wrote:
>> On Dec 14, 2015, at 10:25 AM, Stephan Budach <stephan.budach at JVM.DE> wrote:
>>
>> Hi guys,
>>
>> I am trying to configure an FCoE target in OmniOS r016, but I can't seem to get it right. I started out with the documentation for Solaris 11, which seemed appropriate, configured an FC target, and added a view granting the FCoE port access to that LUN, but the port doesn't seem to log in to my Nexus 5500 at all.
>>
>> This is what I got so far:
>>
>> I was able to successfully connect my Linux FCoE initiator to the fabric, but not the OmniOS target. Is there anything obviously wrong with my config?
> Do you have any LUs configured?
>
> stmfadm list-lu
>
> You may want to make sure you have at least one.  Create a ZFS volume and then add it:
>
> zfs create -V <size> poolname/volname
>
> stmfadm create-lu /dev/zvol/dsk/poolname/volname
>
> Dan
>
_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss at lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss



From rjahnel at ellipseinc.com  Mon Dec 14 18:22:50 2015
From: rjahnel at ellipseinc.com (Richard Jahnel)
Date: Mon, 14 Dec 2015 18:22:50 +0000
Subject: [OmniOS-discuss] A useful tidbit or two for ESX admins running
	OmniOS Fibre Targets
Message-ID: <65DC5816D4BEE043885A89FD54E273FC6CF693F1@MAIL101.Ellipseinc.com>

Limiting the feature flags to those used in R151006 will eliminate the eager zero panic bug currently present in versions R151010 and later including the current LTS R151014.

Example:

zpool create -d \
-o feature@async_destroy=enabled \
-o feature@empty_bpobj=enabled \
-o feature@lz4_compress=enabled \
poolname \
raidz3 %disks% \
raidz3 %disks% \
<....>
log mirror %disks% \
cache %disks%
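
To confirm afterwards which feature flags a pool actually carries, one rough check is to filter the property listing. This is a sketch that assumes the usual four-column "zpool get all" output (NAME PROPERTY VALUE SOURCE); active_features is a made-up helper name and POOL is a placeholder.

```shell
#!/bin/sh
# List every feature flag whose value is not "disabled".
# Reads `zpool get all <pool>` style text on stdin
# (columns: NAME PROPERTY VALUE SOURCE).
active_features() {
    awk '$2 ~ /^feature@/ && $3 != "disabled" { print $2 "=" $3 }'
}

# Only query a real pool when zpool exists and POOL is set.
if command -v zpool >/dev/null 2>&1 && [ -n "${POOL:-}" ]; then
    zpool get all "$POOL" | active_features
fi
```

On a pool created as above, only the three listed features should be printed, as either enabled or active.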

Also, an unrelated hint: disabling the write-back cache in the fibre target will prevent the corrupted VMs that might otherwise result from a panic.

PS. Be sure to have an SSD backed log mirror to minimize the write performance impact.

Example:

stmfadm modify-lu -p wcd=true <LU name here>
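
To find the LUs that still have the write-back cache enabled, the verbose listing can be filtered. This is a sketch that assumes the "LU Name:" and "Writeback Cache" labels as printed by stmfadm list-lu -v; wb_enabled_lus is a made-up helper name.

```shell
#!/bin/sh
# Print the name of every LU whose write-back cache is still enabled.
# Reads `stmfadm list-lu -v` style text on stdin.
wb_enabled_lus() {
    awk '
        /^LU Name:/ { lu = $3 }
        /Writeback Cache/ && $NF == "Enabled" { print lu }
    '
}

# Only query COMSTAR when stmfadm is actually available.
if command -v stmfadm >/dev/null 2>&1; then
    stmfadm list-lu -v | wb_enabled_lus
fi
```

Each name it prints can then be passed to the modify-lu command above.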



Richard Jahnel | Senior Network Engineer
Ellipse Communications - Corporate Office
14800 Quorum Dr, Suite 420  Dallas, TX 75254
TF: 888-678-3869 | F: 972-479-9115
Email<mailto:rjahnel at ellipseinc.com> * Website<https://www.ellipseinc.com> * Facebook<https://www.facebook.com/ellipsegroup> * Twitter<https://www.twitter.com/theellipsecow>


________________________________

The content of this e-mail (including any attachments) is strictly confidential and may be commercially sensitive. If you are not, or believe you may not be, the intended recipient, please advise the sender immediately by return e-mail, delete this e-mail and destroy any copies.

From danmcd at omniti.com  Mon Dec 14 18:36:01 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 14 Dec 2015 13:36:01 -0500
Subject: [OmniOS-discuss] A useful tidbit or two for ESX admins running
	OmniOS Fibre Targets
In-Reply-To: <65DC5816D4BEE043885A89FD54E273FC6CF693F1@MAIL101.Ellipseinc.com>
References: <65DC5816D4BEE043885A89FD54E273FC6CF693F1@MAIL101.Ellipseinc.com>
Message-ID: <0FDFF526-04C0-4705-8E81-3B29F031CCD8@omniti.com>


> On Dec 14, 2015, at 1:22 PM, Richard Jahnel <rjahnel at ellipseinc.com> wrote:
> 
> Limiting the feature flags to those used in R151006 will eliminate the eager zero panic bug currently present in versions R151010 and later including the current LTS R151014.
> 


Is there an illumos bug filed for this?  If not, why hasn't there been?  Modulo the fiber channel HW, it seems easy enough to reproduce, no?

Dan


From rjahnel at ellipseinc.com  Mon Dec 14 18:51:05 2015
From: rjahnel at ellipseinc.com (Richard Jahnel)
Date: Mon, 14 Dec 2015 18:51:05 +0000
Subject: [OmniOS-discuss] A useful tidbit or two for ESX admins running
 OmniOS Fibre Targets
In-Reply-To: <0FDFF526-04C0-4705-8E81-3B29F031CCD8@omniti.com>
References: <65DC5816D4BEE043885A89FD54E273FC6CF693F1@MAIL101.Ellipseinc.com>
	<0FDFF526-04C0-4705-8E81-3B29F031CCD8@omniti.com>
Message-ID: <65DC5816D4BEE043885A89FD54E273FC6CF69427@MAIL101.Ellipseinc.com>

We have discussed it before here on this list. I haven't filed a bug because I don't know how to do so or where the bug resides.

For example, does it belong to OmniOS, Illumos or OpenZFS?

I don't know and I can't read source code well enough to figure it out.

-----Original Message-----
From: Dan McDonald [mailto:danmcd at omniti.com]
Sent: Monday, December 14, 2015 12:36 PM
To: Richard Jahnel <rjahnel at ellipseinc.com>
Cc: omnios-discuss at lists.omniti.com; Dan McDonald <danmcd at omniti.com>
Subject: Re: [OmniOS-discuss] A useful tidbit or two for ESX admins running OmniOS Fibre Targets


> On Dec 14, 2015, at 1:22 PM, Richard Jahnel <rjahnel at ellipseinc.com> wrote:
>
> Limiting the feature flags to those used in R151006 will eliminate the eager zero panic bug currently present in versions R151010 and later including the current LTS R151014.
>


Is there an illumos bug filed for this?  If not, why hasn't there been?  Modulo the fiber channel HW, it seems easy enough to reproduce, no?

Dan

From stephan.budach at JVM.DE  Mon Dec 14 18:57:38 2015
From: stephan.budach at JVM.DE (Stephan Budach)
Date: Mon, 14 Dec 2015 19:57:38 +0100
Subject: [OmniOS-discuss] Ang: Re: How to configure FCoE target in
	OmniOS?
In-Reply-To: <OFDDFD314A.F1444C01-ONC1257F1B.0058C006-C1257F1B.005A4CCC@inse.com>
References: <566EE46D.5050507@jvm.de>,
	<566EDF5E.3000004@jvm.de>	<F6497D11-29A6-4F0E-ABDE-787512ABCC3B@omniti.com>
	<OFDDFD314A.F1444C01-ONC1257F1B.0058C006-C1257F1B.005A4CCC@inse.com>
Message-ID: <566F1122.5010402@jvm.de>

Hi Johan,

On 14.12.15 at 17:26, Johan Kragsterman wrote:
> Hi!
>
> Is there a question in this mail I miss somewhere...?
Well, not in that post, but in the first one, where I asked whether 
anyone could spot any obvious errors in my attempt to configure an FCoE 
target in OmniOS.

>
> Anyway, check further down...
>
>
>
> -----"OmniOS-discuss" <omnios-discuss-bounces at lists.omniti.com> wrote: -----
> To: Dan McDonald <danmcd at omniti.com>
> From: Stephan Budach
> Sent by: "OmniOS-discuss"
> Date: 2015-12-14 16:48
> Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
> Subject: Re: [OmniOS-discuss] How to configure FCoE target in OmniOS?
>
> Well… creating a view requires a LUN, doesn't it? But anyway, I do have
> some LUNs configured already, which I formerly presented using iSCSI.
> The one I chose for my FCoE testing is this one:
>
> stmfadm list-lu -v
> LU Name: 600144F07A34AC66000054D1DEB50001
>       Operational Status: Online
>       Provider Name     : sbd
>       Alias             : /dev/zvol/rdsk/sasTank/nfsvmpool03sas
>       View Entry Count  : 1
>       Data File         : /dev/zvol/rdsk/sasTank/nfsvmpool03sas
>       Meta File         : not set
>       Size              : 1342177280000
>       Block Size        : 512
>       Management URL    : not set
>       Vendor ID         : SUN
>       Product ID        : COMSTAR
>       Serial Num        : not set
>       Write Protect     : Disabled
>       Writeback Cache   : Disabled
>       Access State      : Active
>
> Thanks,
> Stephan
>
>
>
> Have you enabled the target service: svcadm enable svc:/system/fcoe_target:default ?
Yes.
>
> As I remember it (I did this years ago...), FCoE HBAs show up when you run stmfadm list-target (-v for verbose). From there you can get the WWNN and WWPN, which you need to configure FCoE ports:
>
> # fcadm create-fcoe-port -i -p Port_WWN -n Node_WWN Ethernet_Interface
Yeah, I did that as well, but the port doesn't actually seem to log in to 
the fabric on the switch. Shouldn't I see some FLOGI message from the 
target port on the switch as well?
>
> Then, if you connect the FCoE initiator to the fabric and scan for a LUN, you will see it as a logged-in initiator when you again run:
>
> stmfadm list-target -v
>
> From that output you can get the WWN of the initiator.
>
> With that info you can create the view for the LUN, so that the initiator can access it.
>
> I hope this is the right order of steps; it was a long time ago that I did this...
>
> Regards Johan
>
>
> On 14.12.15 at 16:35, Dan McDonald wrote:
>>> On Dec 14, 2015, at 10:25 AM, Stephan Budach <stephan.budach at JVM.DE> wrote:
>>>
>>> Hi guys,
>>>
>>> I am trying to configure an FCoE target in OmniOS r016, but I can't seem to get it right. I started out with the documentation for Solaris 11, which seemed appropriate, configured an FC target, and added a view granting the FCoE port access to that LUN, but the port doesn't seem to log in to my Nexus 5500 at all.
>>>
>>> This is what I got so far:
>>>
>>> I was able to successfully connect my Linux FCoE initiator to the fabric, but not the OmniOS target. Is there anything obviously wrong with my config?
>> Do you have any LUs configured?
>>
>> stmfadm list-lu
>>
>> You may want to make sure you have at least one.  Create a ZFS volume and then add it:
>>
>> zfs create -V <size> poolname/volname
>>
>> stmfadm create-lu /dev/zvol/dsk/poolname/volname
>>
>> Dan
>>

From danmcd at omniti.com  Mon Dec 14 19:48:22 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 14 Dec 2015 14:48:22 -0500
Subject: [OmniOS-discuss] A useful tidbit or two for ESX admins running
	OmniOS Fibre Targets
In-Reply-To: <65DC5816D4BEE043885A89FD54E273FC6CF69427@MAIL101.Ellipseinc.com>
References: <65DC5816D4BEE043885A89FD54E273FC6CF693F1@MAIL101.Ellipseinc.com>
	<0FDFF526-04C0-4705-8E81-3B29F031CCD8@omniti.com>
	<65DC5816D4BEE043885A89FD54E273FC6CF69427@MAIL101.Ellipseinc.com>
Message-ID: <FEB6FA71-F695-4036-98FE-4C48D2B6AB99@omniti.com>

Have you shared a coredump before?  I can analyze it to see where it might fall.  Typically a kernel panic is Illumos.  It might also be OpenZFS, but since Illumos is still upstream it's the safer bet.

Dan

Sent from my iPhone (typos, autocorrect, and all)

> On Dec 14, 2015, at 1:51 PM, Richard Jahnel <rjahnel at ellipseinc.com> wrote:
> 
> We have discussed it before here on this list. I haven't filed a bug because I don't know how to do so or where the bug resides.
> 
> For example, does it belong to OmniOS, Illumos or OpenZFS?
> 
> I don't know and I can't read source code well enough to figure it out.
> 
> -----Original Message-----
> From: Dan McDonald [mailto:danmcd at omniti.com]
> Sent: Monday, December 14, 2015 12:36 PM
> To: Richard Jahnel <rjahnel at ellipseinc.com>
> Cc: omnios-discuss at lists.omniti.com; Dan McDonald <danmcd at omniti.com>
> Subject: Re: [OmniOS-discuss] A useful tidbit or two for ESX admins running OmniOS Fibre Targets
> 
> 
>> On Dec 14, 2015, at 1:22 PM, Richard Jahnel <rjahnel at ellipseinc.com> wrote:
>> 
>> Limiting the feature flags to those used in R151006 will eliminate the eager zero panic bug currently present in versions R151010 and later including the current LTS R151014.
> 
> 
> Is there an illumos bug filed for this?  If not, why hasn't there been?  Modulo the fiber channel HW, it seems easy enough to reproduce, no?
> 
> Dan

From rjahnel at ellipseinc.com  Mon Dec 14 19:53:36 2015
From: rjahnel at ellipseinc.com (Richard Jahnel)
Date: Mon, 14 Dec 2015 19:53:36 +0000
Subject: [OmniOS-discuss] A useful tidbit or two for ESX admins running
 OmniOS Fibre Targets
In-Reply-To: <FEB6FA71-F695-4036-98FE-4C48D2B6AB99@omniti.com>
References: <65DC5816D4BEE043885A89FD54E273FC6CF693F1@MAIL101.Ellipseinc.com>
	<0FDFF526-04C0-4705-8E81-3B29F031CCD8@omniti.com>
	<65DC5816D4BEE043885A89FD54E273FC6CF69427@MAIL101.Ellipseinc.com>
	<FEB6FA71-F695-4036-98FE-4C48D2B6AB99@omniti.com>
Message-ID: <65DC5816D4BEE043885A89FD54E273FC6CF69477@MAIL101.Ellipseinc.com>

Yes, you looked at it around Oct 12th or 13th of this year.

-----Original Message-----
From: Dan McDonald [mailto:danmcd at omniti.com] 
Sent: Monday, December 14, 2015 1:48 PM
To: Richard Jahnel <rjahnel at ellipseinc.com>; Dan McDonald <danmcd at omniti.com>
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] A useful tidbit or two for ESX admins running OmniOS Fibre Targets

Have you shared a coredump before?  I can analyze it to see where it might fall.  Typically a kernel panic is Illumos.  It might also be OpenZFS, but since Illumos is still upstream it's the safer bet.

Dan

Sent from my iPhone (typos, autocorrect, and all)

> On Dec 14, 2015, at 1:51 PM, Richard Jahnel <rjahnel at ellipseinc.com> wrote:
> 
> We have discussed it before here on this list. I haven't filed a bug because I don't know how to do so or where the bug resides.
> 
> For example, does it belong to OmniOS, Illumos or OpenZFS?
> 
> I don't know and I can't read source code well enough to figure it out.
> 
> -----Original Message-----
> From: Dan McDonald [mailto:danmcd at omniti.com]
> Sent: Monday, December 14, 2015 12:36 PM
> To: Richard Jahnel <rjahnel at ellipseinc.com>
> Cc: omnios-discuss at lists.omniti.com; Dan McDonald <danmcd at omniti.com>
> Subject: Re: [OmniOS-discuss] A useful tidbit or two for ESX admins running OmniOS Fibre Targets
> 
> 
>> On Dec 14, 2015, at 1:22 PM, Richard Jahnel <rjahnel at ellipseinc.com> wrote:
>> 
>> Limiting the feature flags to those used in R151006 will eliminate the eager zero panic bug currently present in versions R151010 and later including the current LTS R151014.
> 
> 
> Is there an illumos bug filed for this?  If not, why hasn't there been?  Modulo the fiber channel HW, it seems easy enough to reproduce, no?
> 
> Dan

From danmcd at omniti.com  Mon Dec 14 22:13:46 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 14 Dec 2015 17:13:46 -0500
Subject: [OmniOS-discuss] A useful tidbit or two for ESX admins running
	OmniOS Fibre Targets
In-Reply-To: <65DC5816D4BEE043885A89FD54E273FC6CF69477@MAIL101.Ellipseinc.com>
References: <65DC5816D4BEE043885A89FD54E273FC6CF693F1@MAIL101.Ellipseinc.com>
	<0FDFF526-04C0-4705-8E81-3B29F031CCD8@omniti.com>
	<65DC5816D4BEE043885A89FD54E273FC6CF69427@MAIL101.Ellipseinc.com>
	<FEB6FA71-F695-4036-98FE-4C48D2B6AB99@omniti.com>
	<65DC5816D4BEE043885A89FD54E273FC6CF69477@MAIL101.Ellipseinc.com>
Message-ID: <33ACC7C1-A2CF-4B5A-9EDF-840B41EE8C5E@omniti.com>


> On Dec 14, 2015, at 2:53 PM, Richard Jahnel <rjahnel at ellipseinc.com> wrote:
> 
> Yes, you looked at it around Oct 12th or 13th of this year.

And it was... interesting:

panic[cpu11]/thread=ffffff15a1ea4780: 
hati_pte_map: flags & HAT_LOAD_REMAP


ffffff009a23f850 unix:hati_pte_map+3ab ()
ffffff009a23f8e0 unix:hati_load_common+139 ()
ffffff009a23f960 unix:hat_memload+75 ()
ffffff009a23fa80 genunix:segvn_faultpage+730 ()
ffffff009a23fc50 genunix:segvn_fault+8e6 ()
ffffff009a23fd60 genunix:as_fault+31a ()
ffffff009a23fdf0 unix:pagefault+96 ()
ffffff009a23ff00 unix:trap+2c7 ()
ffffff009a23ff10 unix:cmntrap+e6 ()


Nothing to indicate ZFS or FC... that's a VM subsystem fault.  I do, however, see 78 threads all doing SCSI WRITE_SAME.

Dan


From rjahnel at ellipseinc.com  Mon Dec 14 22:21:51 2015
From: rjahnel at ellipseinc.com (Richard Jahnel)
Date: Mon, 14 Dec 2015 22:21:51 +0000
Subject: [OmniOS-discuss] A useful tidbit or two for ESX admins running
 OmniOS Fibre Targets
In-Reply-To: <33ACC7C1-A2CF-4B5A-9EDF-840B41EE8C5E@omniti.com>
References: <65DC5816D4BEE043885A89FD54E273FC6CF693F1@MAIL101.Ellipseinc.com>
	<0FDFF526-04C0-4705-8E81-3B29F031CCD8@omniti.com>
	<65DC5816D4BEE043885A89FD54E273FC6CF69427@MAIL101.Ellipseinc.com>
	<FEB6FA71-F695-4036-98FE-4C48D2B6AB99@omniti.com>
	<65DC5816D4BEE043885A89FD54E273FC6CF69477@MAIL101.Ellipseinc.com>
	<33ACC7C1-A2CF-4B5A-9EDF-840B41EE8C5E@omniti.com>
Message-ID: <65DC5816D4BEE043885A89FD54E273FC6CF694EB@MAIL101.Ellipseinc.com>

All I can say for sure is that the problem is repeatable given sufficient time.

The test we use to see whether a storage volume is susceptible is to create and eager-zero a 4 TB VMDK on the volume, or as large a VMDK as the volume can handle.

Most of the time it will panic within the first TB; otherwise it has, thus far, always panicked before the third.

Volumes made with only the three flags previously listed do not panic and have been tested with eager zeros as large as 8 TB. This has been tested against R151014 and R151016.

-----Original Message-----
From: Dan McDonald [mailto:danmcd at omniti.com]
Sent: Monday, December 14, 2015 4:14 PM
To: Richard Jahnel <rjahnel at ellipseinc.com>; Dan McDonald <danmcd at omniti.com>
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] A useful tidbit or two for ESX admins running OmniOS Fibre Targets


> On Dec 14, 2015, at 2:53 PM, Richard Jahnel <rjahnel at ellipseinc.com> wrote:
>
> Yes, you looked at it around Oct 12th or 13th of this year.

And it was... interesting:

panic[cpu11]/thread=ffffff15a1ea4780:
hati_pte_map: flags & HAT_LOAD_REMAP


ffffff009a23f850 unix:hati_pte_map+3ab ()
ffffff009a23f8e0 unix:hati_load_common+139 ()
ffffff009a23f960 unix:hat_memload+75 ()
ffffff009a23fa80 genunix:segvn_faultpage+730 ()
ffffff009a23fc50 genunix:segvn_fault+8e6 ()
ffffff009a23fd60 genunix:as_fault+31a ()
ffffff009a23fdf0 unix:pagefault+96 ()
ffffff009a23ff00 unix:trap+2c7 ()
ffffff009a23ff10 unix:cmntrap+e6 ()


Nothing to indicate ZFS or FC... that's a VM subsystem fault.  I do, however, see 78 threads all doing SCSI WRITE_SAME.

Dan

From johan.kragsterman at capvert.se  Mon Dec 14 22:37:00 2015
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Mon, 14 Dec 2015 23:37:00 +0100
Subject: [OmniOS-discuss] Ang: Re: Ang: Re: How to configure FCoE target in
	OmniOS?
In-Reply-To: <566F1122.5010402@jvm.de>
References: <566F1122.5010402@jvm.de>, <566EE46D.5050507@jvm.de>,
	<566EDF5E.3000004@jvm.de>	<F6497D11-29A6-4F0E-ABDE-787512ABCC3B@omniti.com>
	<OFDDFD314A.F1444C01-ONC1257F1B.0058C006-C1257F1B.005A4CCC@inse.com>
Message-ID: <OF501A9BF5.B8BE5842-ONC1257F1B.007C3CBB-C1257F1B.007C3CBC@inse.com>

Hi!


-----Stephan Budach <stephan.budach at jvm.de> wrote: -----
To: Johan Kragsterman <johan.kragsterman at capvert.se>
From: Stephan Budach <stephan.budach at jvm.de>
Date: 2015-12-14 19:57
Cc: Dan McDonald <danmcd at omniti.com>, omnios-discuss <omnios-discuss at lists.omniti.com>
Subject: Re: Ang: Re: [OmniOS-discuss] How to configure FCoE target in OmniOS?


> Have you enabled the target service: svcadm enable svc:/system/fcoe_target:default ?
Yes.
>
> As I remember it (I did this years ago...), FCoE HBAs show up when you run stmfadm list-target (-v for verbose). From there you can get the WWNN and WWPN, which you need to configure FCoE ports:
>
> # fcadm create-fcoe-port -i -p Port_WWN -n Node_WWN Ethernet_Interface
Yeah, I did that as well, but the port doesn't actually seem to log in to 
the fabric on the switch. Shouldn't I see some FLOGI message from the 
target port on the switch as well?
>





Yeah, it must register with the name services on the switch, if I remember correctly. It must be the same as with Fibre Channel: the name services must pick it up to be able to serve the name further on to the SAN.

The problem with FCoE is, IMHO, that the adapters don't have any BIOS to check. In FC you go into the BIOS and check for the presence of storage devices; I don't think you have that possibility with FCoE adapters, do you? But can you perhaps bypass the switch and see if you can pick up any device directly?

By the way, what kind of switch do you use? I know there are FCoE switches that have separate FCoE ports and FC ports... I mean, could you have confused those ports?

Rgrds Johan







From stephan.budach at JVM.DE  Tue Dec 15 05:55:26 2015
From: stephan.budach at JVM.DE (Stephan Budach)
Date: Tue, 15 Dec 2015 06:55:26 +0100
Subject: [OmniOS-discuss] Ang: Re: Ang: Re: How to configure FCoE target
 in OmniOS?
In-Reply-To: <OF501A9BF5.B8BE5842-ONC1257F1B.007C3CBB-C1257F1B.007C3CBC@inse.com>
References: <566F1122.5010402@jvm.de>, <566EE46D.5050507@jvm.de>,
	<566EDF5E.3000004@jvm.de>	<F6497D11-29A6-4F0E-ABDE-787512ABCC3B@omniti.com>
	<OFDDFD314A.F1444C01-ONC1257F1B.0058C006-C1257F1B.005A4CCC@inse.com>
	<OF501A9BF5.B8BE5842-ONC1257F1B.007C3CBB-C1257F1B.007C3CBC@inse.com>
Message-ID: <566FAB4E.8010201@jvm.de>

Hi Johan,

On 14.12.15 at 23:37, Johan Kragsterman wrote:
> Hi!
>
>
> -----Stephan Budach <stephan.budach at jvm.de> wrote: -----
> To: Johan Kragsterman <johan.kragsterman at capvert.se>
> From: Stephan Budach <stephan.budach at jvm.de>
> Date: 2015-12-14 19:57
> Cc: Dan McDonald <danmcd at omniti.com>, omnios-discuss <omnios-discuss at lists.omniti.com>
> Subject: Re: Ang: Re: [OmniOS-discuss] How to configure FCoE target in OmniOS?
>
>
>> Have you enabled the: svcadm enable svc:/system/fcoe_target:default  ?
> Yes.
>> What I remember(years ago I did this...), FCoE HBA's show up when you run: stmfadm list-target (-v for verbose). From there you can get the wwnn and wwpn, which you need to configure fcoe ports:
>>
>> # fcadm create-fcoe-port -i -p Port_WWN -n Node_WWN Ethernet_Interface
> Yeah, I did that as well, but the port actually doesn't seem to login to
> the fabric on the switch. Shouldn't I see some flogi message from the
> target port on th4 switch as well?
> Yeah, it must register with the name services on the switch, if I remember correctly. Must be the same as with fibre channel, the name services must pick it up to be able to serve the name further on to the SAN.
>
> The problem with FCoE is, imho, that the adaptors doesn't have any bios to check. In FC you go inte the bios and check the presence of storage devices, I don't think you got that possibility in FCoE adaptors, do you? But can you perhaps bypass the switch, and see if you can pick up any device directly?
Hmm, no... I want to use OmniOS as an FCoE target, actually, not an 
initiator. I already have my FCoE initiator on RHEL set up and logged in 
to the fabric. I am wondering a bit about the DCB client, though: I am 
still not sure whether the X520-T2 has one built in or not. On RHEL I 
set DCB_REQUIRED to "no" and it obviously works, which led me to 
believe that these Intel CNAs actually do have a DCB client built in.
>
> By the way, what kind of switch do you use? I know there are FCoE switches that have different FCoE ports and FC ports...I mean, if you confused those ports...?
I am using a Nexus 5596 with some fabric extenders, and although I am 
pretty sure that I configured the ports correctly, I will check that 
again and see if I made a mistake.
>
> Rgrds Johan
Thanks,
Stephan


-- 
Krebs's 3 Basic Rules for Online Safety
1st - "If you didn't go looking for it, don't install it!"
2nd - "If you installed it, update it."
3rd - "If you no longer need it, remove it."
http://krebsonsecurity.com/2011/05/krebss-3-basic-rules-for-online-safety


Stephan Budach
Head of IT
Jung von Matt/basis GmbH
Glashüttenstraße 79
20357 Hamburg


Tel: +49 40-4321-1353
Fax: +49 40-4321-1114
E-Mail: stephan.budach at jvm.de
Internet: http://www.jvm.com

Geschäftsführer: Dominik Fassl, Christoph K?hler, Ulrich Pallas
AG HH HRB 82024


From stephan.budach at JVM.DE  Tue Dec 15 07:56:26 2015
From: stephan.budach at JVM.DE (Stephan Budach)
Date: Tue, 15 Dec 2015 08:56:26 +0100
Subject: [OmniOS-discuss] Ang: Re: Ang: Re: How to configure FCoE target
 in OmniOS?
In-Reply-To: <566FAB4E.8010201@jvm.de>
References: <566F1122.5010402@jvm.de>, <566EE46D.5050507@jvm.de>,
	<566EDF5E.3000004@jvm.de>	<F6497D11-29A6-4F0E-ABDE-787512ABCC3B@omniti.com>	<OFDDFD314A.F1444C01-ONC1257F1B.0058C006-C1257F1B.005A4CCC@inse.com>	<OF501A9BF5.B8BE5842-ONC1257F1B.007C3CBB-C1257F1B.007C3CBC@inse.com>
	<566FAB4E.8010201@jvm.de>
Message-ID: <566FC7AA.9050402@jvm.de>


I think that I do need the LLDP package installed, which will give me 
the DCBX capabilities I seem to be missing, but I can't find 
any package providing that. Does anyone know where that sucker 
hides? Or what the equivalent in OmniOS is?

Thanks,
Stephan

From johan.kragsterman at capvert.se  Tue Dec 15 09:08:02 2015
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Tue, 15 Dec 2015 10:08:02 +0100
Subject: [OmniOS-discuss] Ang: Re: Ang: Re: Ang: Re: How to configure FCoE
 target in OmniOS?
In-Reply-To: <566FAB4E.8010201@jvm.de>
References: <566FAB4E.8010201@jvm.de>,
	<566F1122.5010402@jvm.de>, <566EE46D.5050507@jvm.de>,
	<566EDF5E.3000004@jvm.de>
	<F6497D11-29A6-4F0E-ABDE-787512ABCC3B@omniti.com>
	<OFDDFD314A.F1444C01-ONC1257F1B.0058C006-C1257F1B.005A4CCC@inse.com>
	<OF501A9BF5.B8BE5842-ONC1257F1B.007C3CBB-C1257F1B.007C3CBC@inse.com>
Message-ID: <OF36AA10D2.444EBCD4-ONC1257F1C.00322CB6-C1257F1C.00322CB9@inse.com>

Hi!


-----Stephan Budach <stephan.budach at jvm.de> wrote: -----
To: Johan Kragsterman <johan.kragsterman at capvert.se>
From: Stephan Budach <stephan.budach at jvm.de>
Date: 2015-12-15 06:55
Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
Subject: Re: Ang: Re: Ang: Re: [OmniOS-discuss] How to configure FCoE target in OmniOS?

Hi Johan,

On 14.12.15 at 23:37, Johan Kragsterman wrote:
> Hi!
>
>
> -----Stephan Budach <stephan.budach at jvm.de> wrote: -----
> To: Johan Kragsterman <johan.kragsterman at capvert.se>
> From: Stephan Budach <stephan.budach at jvm.de>
> Date: 2015-12-14 19:57
> Cc: Dan McDonald <danmcd at omniti.com>, omnios-discuss <omnios-discuss at lists.omniti.com>
> Subject: Re: Ang: Re: [OmniOS-discuss] How to configure FCoE target in OmniOS?
>
>
>> Have you enabled the: svcadm enable svc:/system/fcoe_target:default ??
> Yes.
>> What I remember(years ago I did this...), FCoE HBA's show up when you run: stmfadm list-target (-v for verbose). From there you can get the wwnn and wwpn, which you need to configure fcoe ports:
>>
>> # fcadm create-fcoe-port -i -p Port_WWN -n Node_WWN Ethernet_Interface
> Yeah, I did that as well, but the port actually doesn't seem to login to
> the fabric on the switch. Shouldn't I see some flogi message from the
> target port on th4 switch as well?
> Yeah, it must register with the name services on the switch, if I remember correctly. Must be the same as with fibre channel, the name services must pick it up to be able to serve the name further on to the SAN.
>
> The problem with FCoE is, imho, that the adaptors doesn't have any bios to check. In FC you go inte the bios and check the presence of storage devices, I don't think you got that possibility in FCoE adaptors, do you? But can you perhaps bypass the switch, and see if you can pick up any device directly?
Hmm, no... I want to use OmniOS as an FCoE target, actually, not an 
initiator.




What I mean here is to bypass the switch so that you can rule it out as the problem. Disconnect the switch, connect the initiator directly to the target, and see if you can pick up a device.

Rgrds Johan



From johan.kragsterman at capvert.se  Tue Dec 15 09:20:30 2015
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Tue, 15 Dec 2015 10:20:30 +0100
Subject: [OmniOS-discuss] Ang: Re: Ang: Re: Ang: Re: How to configure FCoE
 target in OmniOS?
In-Reply-To: <566FC7AA.9050402@jvm.de>
References: <566FC7AA.9050402@jvm.de>,
	<566F1122.5010402@jvm.de>, <566EE46D.5050507@jvm.de>,
	<566EDF5E.3000004@jvm.de>
	<F6497D11-29A6-4F0E-ABDE-787512ABCC3B@omniti.com>	<OFDDFD314A.F1444C01-ONC1257F1B.0058C006-C1257F1B.005A4CCC@inse.com>
	<OF501A9BF5.B8BE5842-ONC1257F1B.007C3CBB-C1257F1B.007C3CBC@inse.com>	<566FAB4E.8010201@jvm.de>
Message-ID: <OF32FE5A11.789B37DF-ONC1257F1C.0033510C-C1257F1C.0033510D@inse.com>

Hi!


-----"OmniOS-discuss" <omnios-discuss-bounces at lists.omniti.com> wrote: -----
To: <omnios-discuss at lists.omniti.com>
From: Stephan Budach
Sent by: "OmniOS-discuss"
Date: 2015-12-15 08:58
Subject: Re: [OmniOS-discuss] Ang: Re: Ang: Re: How to configure FCoE target in OmniOS?


I think that I do need the LLDP package installed, which will give me 
the DCBX capabilities I seem to be missing, but I can't find 
any package providing that. Does anyone know where that sucker 
hides? Or what the equivalent in OmniOS is?

Thanks,
Stephan




Now I remember that Gea (Günther Alka) at napp-it has FCoE working on his napp-it server. He uses OmniOS as a base, and he's on this list frequently.

Otherwise:

http://napp-it.org/
gea at napp-it.org
alka at hfg-gmuend.de

Regards Johan



From alka at hfg-gmuend.de  Tue Dec 15 11:31:00 2015
From: alka at hfg-gmuend.de (Guenther Alka)
Date: Tue, 15 Dec 2015 12:31:00 +0100
Subject: [OmniOS-discuss] Bloody update for December 11th
In-Reply-To: <alpine.GSO.2.01.1512121504530.1673@freddy.simplesystems.org>
References: <EF95F9F8-855F-45AE-B288-0052FDCEC73E@omniti.com>
	<566B081C.8@hfg-gmuend.de>
	<alpine.GSO.2.01.1512121504530.1673@freddy.simplesystems.org>
Message-ID: <566FF9F4.5010902@hfg-gmuend.de>

I have updated the PDF with results from OmniOS bloody.

Main results for 10G Ethernet on OSX 10.11 and Windows 8.1:

-  OS version and client network driver are very critical for 10G.
    On some configs or with some driver releases, 10G is not faster than
    1G (mostly on reads).
-  From Windows, performance to Solaris is similar to performance to
    OmniOS (at a lower level than with OSX).
-  From OSX, SMB2 to Solaris is faster than to OmniOS.
-  OSX is faster than Windows on SMB2 reads and writes out of the box.
    SMB2 performance on OSX goes up to > 600 MB/s on writes and > 800
    MB/s on reads.
    SMB performance on Windows goes up to > 300 MB/s on writes and > 600
    MB/s on reads.

This is a quick "out of the box" check with SMB2 and jumbo frames as the 
only special settings on OSX.
On Windows 8.1, defaults plus MTU 9000 are used. Maybe we need some 
additional tweaking on Windows.




On 12.12.2015 at 22:07, Bob Friesenhahn wrote:
> On Fri, 11 Dec 2015, Günther Alka wrote:
>
>> Many Thanks to Nexenta
>> and to OmniTi for this december bloody with SMB 2
>>
>> I have just done some tests on OSX under Solaris 11.3 to check some 
>> configuration
>> options for a ZFS video editing storage server for my Mac Pros.
>
> Do you plan to add tests with the implementation in OmniOS bloody? The 
> Nexenta implementation might be quite a lot different than the Oracle 
> Solaris one.  Perhaps it might even fail with your tests.
>
>> There are two must have principles: SMB2 and Jumboframes
>> see http://napp-it.org/doc/downloads/performance_smb2.pdf
>
> I was surprised to see the huge improvement with jumbo frames.
>
> Bob

-- 
H          f   G
Hochschule für Gestaltung
university of design

Schwäbisch Gmünd
Rektor-Klaus Str. 100
73525 Schwäbisch Gmünd

Guenther Alka, Dipl.-Ing. (FH)
Leiter des Rechenzentrums
head of computer center

Tel 07171 602 627
Fax 07171 69259
guenther.alka at hfg-gmuend.de
http://rz.hfg-gmuend.de


From mtalbott at lji.org  Wed Dec 16 05:07:45 2015
From: mtalbott at lji.org (Michael Talbott)
Date: Tue, 15 Dec 2015 21:07:45 -0800
Subject: [OmniOS-discuss] OmniOS and Veeam
Message-ID: <1B979ADA-EF76-422E-AFB0-AB31DED8A046@lji.org>

I'd like to use an OmniOS box for a Veeam backup repository. The only solution I've been able to come up with thus far is to create a Linux VM, mount an OmniOS nfs export and then point Veeam to the Linux box's mounted nfs volume. Kinda clunky and far from optimal, but seems to get the job done. I'd like to eliminate the middleman if possible.

From my understanding Veeam just uses some perl script over ssh to query for a few things like finding mounted filesystems, etc. And then it uses NFS for the actual transfers. But that perl script doesn't properly complete on OmniOS probably due to some slightly different call requirements.

Anybody on here been through this and know of any workarounds to make it work natively? Or can anyone think of a less clunky workaround?

Thanks,

Michael

From danmcd at omniti.com  Wed Dec 16 14:46:38 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 16 Dec 2015 09:46:38 -0500
Subject: [OmniOS-discuss] OmniOS and Veeam
In-Reply-To: <1B979ADA-EF76-422E-AFB0-AB31DED8A046@lji.org>
References: <1B979ADA-EF76-422E-AFB0-AB31DED8A046@lji.org>
Message-ID: <D8D6439F-8DCF-4C8F-AA80-D6D0F176FB8B@omniti.com>

I know nothing about Veeam, but...

> On Dec 16, 2015, at 12:07 AM, Michael Talbott <mtalbott at lji.org> wrote:
> 
> I'd like to use an OmniOS box for a Veeam backup repository. The only solution I've been able to come up with thus far is to create a Linux VM, mount an OmniOS nfs export and then point Veeam to the Linux box's mounted nfs volume. Kinda clunky and far from optimal, but seems to get the job done. I'd like to eliminate the middleman if possible.
> 
> From my understanding Veeam just uses some perl script over ssh to query for a few things like finding mounted filesystems, etc. And then it uses NFS for the actual transfers. But that perl script doesn't properly complete on OmniOS probably due to some slightly different call requirements.

If you showed people the failure output, that may help you.

Dan


From danmcd at omniti.com  Thu Dec 17 01:38:23 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 16 Dec 2015 20:38:23 -0500
Subject: [OmniOS-discuss] Updates for LTS (r151014) and Stable (r151016)
Message-ID: <4240E754-A583-4CA4-A079-631EB3954FBF@omniti.com>

The updates for LTS and Stable are identical this time.  New release media is out, and if you "pkg update" you will need to reboot, because of kernel ZFS changes.

This update includes:

* BIND security update to 9.10.3-P2

* ZFS now receives replication streams with a refquota even if older snapshots exceed it (illumos 4986). Includes a new ZFS Test Suite test.

* OpenSSH now integrates with the illumos audit subsystem. Thanks to Joyent, and this is part of getting OpenSSH to match SunSSH's integrated functionality.

* NVMe bugfixes (illumos 6466 and 6467).

Modulo disaster, this will be the last update for calendar year 2015.  After this week ends, I will be on vacation (just relaxing at home with my family), but I will be occasionally reading mail.  My latency will be VERY HIGH after COB Friday, US/Eastern.

Have an enjoyable holiday season, whatever you do or don't celebrate, and catch you in 2016!
Dan


From henson at acm.org  Thu Dec 17 02:12:18 2015
From: henson at acm.org (Paul B. Henson)
Date: Wed, 16 Dec 2015 18:12:18 -0800
Subject: [OmniOS-discuss] Updates for LTS (r151014) and Stable (r151016)
In-Reply-To: <4240E754-A583-4CA4-A079-631EB3954FBF@omniti.com>
References: <4240E754-A583-4CA4-A079-631EB3954FBF@omniti.com>
Message-ID: <20151217021217.GC3405@bender.unx.cpp.edu>

On Wed, Dec 16, 2015 at 08:38:23PM -0500, Dan McDonald wrote:

> * ZFS receives now replication streams with a refquota even if older
> snapshots exceed it (illumos 4986). Includes new ZFS Test Suite test.

Woo-hoo! We'll be testing this out straight-away, thanks much for
resolving it so quickly.

> After this week ends, I will be on vacation (just relaxing at home
> with my family), but I will be occasionally reading mail.  My latency
> will be VERY HIGH after COB Friday, US/Eastern.

I'm off after this week too, but I fear I'm probably more of a
workaholic than you are ;). One of my holiday plans is to update/reboot
my home storage server and hope nothing blows chunks 8-/, then I can
finally add back my L2ARC devices.


From wonko at 4amlunch.net  Thu Dec 17 19:05:21 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Thu, 17 Dec 2015 14:05:21 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <7D06CC38-9841-4189-80CD-6341E025B10C@4amlunch.net>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
	<584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
	<4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>
	<7D06CC38-9841-4189-80CD-6341E025B10C@4amlunch.net>
Message-ID: <A9FE4FCF-F381-4204-AECB-947D15110794@4amlunch.net>

Ok, let's add to the weirdness.

I destroyed the degraded pool.

I re-created it.

I then re-ran iozone.

It completed with zero errors on the pool. iozone did have some issues at the end, but the FS seems ok:

  pool: zoom
 state: ONLINE
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        zoom          ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c4t1d0s1  ONLINE       0     0     0
            c5t1d0s1  ONLINE       0     0     0

errors: No known data errors

        Iozone: Performance Test of File I/O
                Version $Revision: 3.434 $
                Compiled for 64 bit mode.
                Build: Solaris10gcc-64

        Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                     Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                     Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                     Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                     Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                     Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                     Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
                     Vangel Bojaxhi, Ben England, Vikentsi Lapa,
                     Alexey Skidanov.

        Run began: Thu Dec 17 13:21:59 2015

        Multi_buffer. Work area 16777216 bytes
        OPS Mode. Output is in operations per second.
        Record Size 8 kB
        SYNC Mode.
        File size set to 2097152 kB
        Command line used: /usr/local/bin/iozone -m -t 16 -T -O -r 8k -o -s 2G
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 kBytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
        Throughput test with 16 threads
        Each thread writes a 2097152 kByte file in 8 kByte records

        Children see throughput for 16 initial writers  =   29558.13 ops/sec
        Parent sees throughput for 16 initial writers   =   29467.57 ops/sec
        Min throughput per thread                       =    1845.28 ops/sec
        Max throughput per thread                       =    1853.28 ops/sec
        Avg throughput per thread                       =    1847.38 ops/sec
        Min xfer                                        =  261012.00 ops

        Children see throughput for 16 rewriters        =   26802.94 ops/sec
        Parent sees throughput for 16 rewriters         =   26801.51 ops/sec
        Min throughput per thread                       =    1671.70 ops/sec
        Max throughput per thread                       =    1679.40 ops/sec
        Avg throughput per thread                       =    1675.18 ops/sec
        Min xfer                                        =  260942.00 ops

        Children see throughput for 16 readers          =  305525.26 ops/sec
        Parent sees throughput for 16 readers           =  304910.58 ops/sec
        Min throughput per thread                       =   16371.37 ops/sec
        Max throughput per thread                       =   20084.48 ops/sec
        Avg throughput per thread                       =   19095.33 ops/sec
        Min xfer                                        =  213905.00 ops

        Children see throughput for 16 re-readers       =  301510.86 ops/sec
        Parent sees throughput for 16 re-readers        =  301021.85 ops/sec
        Min throughput per thread                       =   16066.28 ops/sec
        Max throughput per thread                       =   19850.40 ops/sec
        Avg throughput per thread                       =   18844.43 ops/sec
        Min xfer                                        =  212289.00 ops

        Children see throughput for 16 reverse readers  =  520691.82 ops/sec
        Parent sees throughput for 16 reverse readers   =  520026.68 ops/sec
        Min throughput per thread                       =   30897.40 ops/sec
        Max throughput per thread                       =   33412.20 ops/sec
        Avg throughput per thread                       =   32543.24 ops/sec
        Min xfer                                        =  242448.00 ops

        Children see throughput for 16 stride readers   =   27067.77 ops/sec
        Parent sees throughput for 16 stride readers    =   27064.74 ops/sec
        Min throughput per thread                       =    1549.09 ops/sec
        Max throughput per thread                       =    3205.10 ops/sec
        Avg throughput per thread                       =    1691.74 ops/sec
        Min xfer                                        =  126699.00 ops

        Children see throughput for 16 random readers   =  215258.98 ops/sec
        Parent sees throughput for 16 random readers    =  214461.71 ops/sec
        Min throughput per thread                       =    2759.80 ops/sec
        Max throughput per thread                       =  169551.89 ops/sec
        Avg throughput per thread                       =   13453.69 ops/sec
        Min xfer                                        =    4281.00 ops

        Children see throughput for 16 mixed workload   =    8673.89 ops/sec
        Parent sees throughput for 16 mixed workload    =    6341.03 ops/sec
        Min throughput per thread                       =     442.73 ops/sec
        Max throughput per thread                       =     641.36 ops/sec
        Avg throughput per thread                       =     542.12 ops/sec
        Min xfer                                        =  180991.00 ops

        Children see throughput for 16 random writers   =    4008.54 ops/sec
        Parent sees throughput for 16 random writers    =    3972.48 ops/sec
        Min throughput per thread                       =     248.54 ops/sec
        Max throughput per thread                       =     252.76 ops/sec
        Avg throughput per thread                       =     250.53 ops/sec
        Min xfer                                        =  257769.00 ops

        Children see throughput for 16 fwriters         =   70222.20 ops/sec
        Parent sees throughput for 16 fwriters          =   65632.32 ops/sec
        Min throughput per thread                       =    4132.12 ops/sec
        Max throughput per thread                       =    4686.85 ops/sec
        Avg throughput per thread                       =    4388.89 ops/sec
        Min xfer                                        =  262144.00 ops




Error in file: Found ?0? Expecting ?7979797979797979? addr 29f6770
Error in file: Found ?0? Expecting ?7979797979797979? addr 29f6770
Error in file: Position 0
Error in file: Position 0
Record # 0 Record size 8 kb
Record # 0 Record size 8 kb
where 29f6770x loop 0
where 29f6770x loop 0

I can delete and create files just fine.
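
For reference, the ops/sec figures above can be read as bandwidth, because iozone ran in OPS mode with 8 kB records. The following is a minimal Python sketch, not part of the original run. It assumes that one op moves one record and that iozone's kB means 1024 bytes; the function names are made up for illustration. It also checks that a 2097152 kB file holds 262144 records, which matches the "Min xfer" reported for the fwriters.

```python
# Figures copied from the iozone output above.
RECORD_KB = 8          # "Record Size 8 kB"
FILE_KB = 2097152      # "File size set to 2097152 kB"

# Aggregate "Children see throughput" values, in ops/sec.
CHILDREN_OPS = {
    "initial writers": 29558.13,
    "rewriters": 26802.94,
    "readers": 305525.26,
    "random readers": 215258.98,
    "random writers": 4008.54,
}


def ops_to_mib_per_sec(ops_per_sec, record_kb=RECORD_KB):
    """Convert ops/sec to MiB/s, assuming one op moves one record
    and 1 kB = 1024 bytes (an assumption about iozone's units)."""
    return ops_per_sec * record_kb / 1024.0


def records_per_file(file_kb=FILE_KB, record_kb=RECORD_KB):
    """Number of whole records that fit in one test file."""
    return file_kb // record_kb


for name, ops in CHILDREN_OPS.items():
    mib = ops_to_mib_per_sec(ops)
    print(f"{name:16s} {ops:10.2f} ops/s  ~{mib:8.1f} MiB/s")
print("records per file:", records_per_file())
```

Read this way, the synchronous 8 kB initial writes come to roughly 231 MiB/s in aggregate across the 16 threads.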

Grrrr.

-brian

> On Dec 9, 2015, at 11:27 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
> 
> 
>> On Dec 9, 2015, at 11:22 AM, Dan McDonald <danmcd at omniti.com> wrote:
>> 
>> 
>>> On Dec 9, 2015, at 11:18 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
>>> 
>>> It's brand new!!
>> 
>> Sometimes you get flaky HW that's new.  I've had to return new spinning-rust disks, for example.
> 
> Bah. :(
> 
>> 
>>> Also, I would expect the other slice to be affected as well?  It's been humming along just fine as SLOG with no errors:
>>> 
>>>      logs
>>>        mirror-3    ONLINE       0     0     0
>>>          c4t1d0s0  ONLINE       0     0     0
>>>          c5t1d0s0  ONLINE       0     0     0
>> 
>> Could just be bad luck your slog hasn't encountered the bad portion of this drive.
> 
> I suppose. You think there is a maybe a good way to test this device before I try to get it RMA-ed?
> 
>> Also, what OmniOS revision are you running? If you're not up to the latest November r151014 update, you may be missing some NVMe fixes.
> 
> Oh right, totally forgot to do that for you:
> 
> wonko at basket1:/var/adm$ head /etc/release ; uname -a
>  OmniOS v11 r151016
>  Copyright 2015 OmniTI Computer Consulting, Inc. All rights reserved.
>  Use is subject to license terms.
> SunOS basket1 5.11 omnios-073d8c0 i86pc i386 i86pc
> 


From danmcd at omniti.com  Thu Dec 17 19:15:11 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 17 Dec 2015 14:15:11 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <A9FE4FCF-F381-4204-AECB-947D15110794@4amlunch.net>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
	<584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
	<4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>
	<7D06CC38-9841-4189-80CD-6341E025B10C@4amlunch.net>
	<A9FE4FCF-F381-4204-AECB-947D15110794@4amlunch.net>
Message-ID: <1A509267-5ADB-451C-A540-5F49367B7C22@omniti.com>


> On Dec 17, 2015, at 2:05 PM, Brian Hechinger <wonko at 4amlunch.net> wrote:
> 
> I can delete and create files just fine.
> 
> Grrrr.

Scrub it now.  Just in case.  A scrub is always a good idea anyway just to make sure bits haven't rotted on the disk.

Dan


From danmcd at omniti.com  Thu Dec 17 19:15:55 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 17 Dec 2015 14:15:55 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <1A509267-5ADB-451C-A540-5F49367B7C22@omniti.com>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
	<584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
	<4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>
	<7D06CC38-9841-4189-80CD-6341E025B10C@4amlunch.net>
	<A9FE4FCF-F381-4204-AECB-947D15110794@4amlunch.net>
	<1A509267-5ADB-451C-A540-5F49367B7C22@omniti.com>
Message-ID: <68FC6DA2-5431-4AD5-8F4F-1023ED9CAA15@omniti.com>


> On Dec 17, 2015, at 2:15 PM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> Scrub it now.  Just in case.  A scrub is always a good idea anyway just to make sure bits haven't rotted on the disk.

Pardon me if I'm being pedantic:

	zpool scrub zoom

Then check it with zpool status.
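
A minimal sketch of that check, assuming zpool status prints its device table in the usual NAME/STATE/READ/WRITE/CKSUM layout. The function name cksum_errors is made up for illustration; it reads status output on stdin and prints every row whose CKSUM column is non-zero:

```shell
# Hypothetical helper: read `zpool status` output on stdin and print
# "<name> <cksum>" for every row of the config table whose CKSUM
# column is a non-zero plain integer (abbreviated counts such as
# "1.2K" are not handled in this sketch).
cksum_errors() {
    awk '
        # The header row marks the start of the device table.
        $1 == "NAME" && $5 == "CKSUM" { in_table = 1; next }
        # A blank line marks the end of the device table.
        in_table && NF == 0 { in_table = 0 }
        # Report rows with a non-zero checksum error count.
        in_table && $5 ~ /^[0-9]+$/ && $5 > 0 { print $1, $5 }
    '
}
```

It would be used as "zpool status zoom | cksum_errors"; no output means no checksum errors were reported.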

Dan


From wonko at 4amlunch.net  Thu Dec 17 19:17:29 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Thu, 17 Dec 2015 14:17:29 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <1A509267-5ADB-451C-A540-5F49367B7C22@omniti.com>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
	<584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
	<4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>
	<7D06CC38-9841-4189-80CD-6341E025B10C@4amlunch.net>
	<A9FE4FCF-F381-4204-AECB-947D15110794@4amlunch.net>
	<1A509267-5ADB-451C-A540-5F49367B7C22@omniti.com>
Message-ID: <F9678155-6710-457B-8FD2-2EBF5BD4B078@4amlunch.net>

Boom.

wonko@basket1:/export/home/wonko$ sudo zpool scrub zoom
Password:
wonko@basket1:/export/home/wonko$ sudo zpool status -v zoom
  pool: zoom
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 226K in 0h0m with 0 errors on Thu Dec 17 14:15:12 2015
config:

        NAME          STATE     READ WRITE CKSUM
        zoom          DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            c4t1d0s1  DEGRADED     0     0    38  too many errors
            c5t1d0s1  DEGRADED     0     0    42  too many errors

errors: No known data errors

-brian

> On Dec 17, 2015, at 2:15 PM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> 
>> On Dec 17, 2015, at 2:05 PM, Brian Hechinger <wonko at 4amlunch.net> wrote:
>> 
>> I can delete and create files just fine.
>> 
>> Grrrr.
> 
> Scrub it now.  Just in case.  A scrub is always a good idea anyway just to make sure bits haven't rotted on the disk.
> 
> Dan
> 


From danmcd at omniti.com  Thu Dec 17 19:18:13 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 17 Dec 2015 14:18:13 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <F9678155-6710-457B-8FD2-2EBF5BD4B078@4amlunch.net>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
	<584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
	<4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>
	<7D06CC38-9841-4189-80CD-6341E025B10C@4amlunch.net>
	<A9FE4FCF-F381-4204-AECB-947D15110794@4amlunch.net>
	<1A509267-5ADB-451C-A540-5F49367B7C22@omniti.com>
	<F9678155-6710-457B-8FD2-2EBF5BD4B078@4amlunch.net>
Message-ID: <7FF77844-B41C-4837-A81F-42F32EF72AAC@omniti.com>


> On Dec 17, 2015, at 2:17 PM, Brian Hechinger <wonko at 4amlunch.net> wrote:
> 
> Boom.
> 
> wonko at basket1:/export/home/wonko$ sudo zpool scrub zoom
> Password:
> wonko at basket1:/export/home/wonko$ sudo zpool status -v zoom
>  pool: zoom
> state: DEGRADED
> status: One or more devices has experienced an unrecoverable error.  An
>        attempt was made to correct the error.  Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>        using 'zpool clear' or replace the device with 'zpool replace'.
>   see: http://illumos.org/msg/ZFS-8000-9P
>  scan: scrub repaired 226K in 0h0m with 0 errors on Thu Dec 17 14:15:12 2015
> config:
> 
>        NAME          STATE     READ WRITE CKSUM
>        zoom          DEGRADED     0     0     0
>          mirror-0    DEGRADED     0     0     0
>            c4t1d0s1  DEGRADED     0     0    38  too many errors
>            c5t1d0s1  DEGRADED     0     0    42  too many errors
> 
> errors: No known data errors

Looks like you got bad drives.

Dan


From wonko at 4amlunch.net  Thu Dec 17 19:20:13 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Thu, 17 Dec 2015 14:20:13 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <7FF77844-B41C-4837-A81F-42F32EF72AAC@omniti.com>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
	<584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
	<4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>
	<7D06CC38-9841-4189-80CD-6341E025B10C@4amlunch.net>
	<A9FE4FCF-F381-4204-AECB-947D15110794@4amlunch.net>
	<1A509267-5ADB-451C-A540-5F49367B7C22@omniti.com>
	<F9678155-6710-457B-8FD2-2EBF5BD4B078@4amlunch.net>
	<7FF77844-B41C-4837-A81F-42F32EF72AAC@omniti.com>
Message-ID: <FCD70D85-8C7C-476A-96F7-E9939D07ADDE@4amlunch.net>

That seems... unlikely to me?

I'll put one of them into a Linux box and see what happens with it.

Is there a way to somehow see if the nvme drivers are being wonky? I get the feeling NVMe 1.1 cards aren't completely supported just yet?

-brian

> On Dec 17, 2015, at 2:18 PM, Dan McDonald <danmcd at omniti.com> wrote:
> 
>> 
>> On Dec 17, 2015, at 2:17 PM, Brian Hechinger <wonko at 4amlunch.net> wrote:
>> 
>> Boom.
>> 
>> wonko at basket1:/export/home/wonko$ sudo zpool scrub zoom
>> Password:
>> wonko at basket1:/export/home/wonko$ sudo zpool status -v zoom
>> pool: zoom
>> state: DEGRADED
>> status: One or more devices has experienced an unrecoverable error.  An
>>       attempt was made to correct the error.  Applications are unaffected.
>> action: Determine if the device needs to be replaced, and clear the errors
>>       using 'zpool clear' or replace the device with 'zpool replace'.
>>  see: http://illumos.org/msg/ZFS-8000-9P
>> scan: scrub repaired 226K in 0h0m with 0 errors on Thu Dec 17 14:15:12 2015
>> config:
>> 
>>       NAME          STATE     READ WRITE CKSUM
>>       zoom          DEGRADED     0     0     0
>>         mirror-0    DEGRADED     0     0     0
>>           c4t1d0s1  DEGRADED     0     0    38  too many errors
>>           c5t1d0s1  DEGRADED     0     0    42  too many errors
>> 
>> errors: No known data errors
> 
> Looks like you got bad drives.
> 
> Dan


From danmcd at omniti.com  Thu Dec 17 19:21:33 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 17 Dec 2015 14:21:33 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <FCD70D85-8C7C-476A-96F7-E9939D07ADDE@4amlunch.net>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
	<584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
	<4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>
	<7D06CC38-9841-4189-80CD-6341E025B10C@4amlunch.net>
	<A9FE4FCF-F381-4204-AECB-947D15110794@4amlunch.net>
	<1A509267-5ADB-451C-A540-5F49367B7C22@omniti.com>
	<F9678155-6710-457B-8FD2-2EBF5BD4B078@4amlunch.net>
	<7FF77844-B41C-4837-A81F-42F32EF72AAC@omniti.com>
	<FCD70D85-8C7C-476A-96F7-E9939D07ADDE@4amlunch.net>
Message-ID: <4C5FB8BB-861E-4425-9F96-17D8E337A5AA@omniti.com>


> On Dec 17, 2015, at 2:20 PM, Brian Hechinger <wonko at 4amlunch.net> wrote:
> 
> That seems... unlikely to me?
> 
> I'll put one of them into a Linux box and see what happens with it.
> 
> Is there a way to somehow see if the nvme drivers are being wonky? I get the feeling NVMe 1.1 cards aren't completely supported just yet?

OH SHOOT!  I forgot these are NVMe.

Did you see my mail announcing the update?  Did you see it has two NVME fixes in it?

Dan


From wonko at 4amlunch.net  Thu Dec 17 19:23:26 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Thu, 17 Dec 2015 14:23:26 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <4C5FB8BB-861E-4425-9F96-17D8E337A5AA@omniti.com>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
	<584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
	<4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>
	<7D06CC38-9841-4189-80CD-6341E025B10C@4amlunch.net>
	<A9FE4FCF-F381-4204-AECB-947D15110794@4amlunch.net>
	<1A509267-5ADB-451C-A540-5F49367B7C22@omniti.com>
	<F9678155-6710-457B-8FD2-2EBF5BD4B078@4amlunch.net>
	<7FF77844-B41C-4837-A81F-42F32EF72AAC@omniti.com>
	<FCD70D85-8C7C-476A-96F7-E9939D07ADDE@4amlunch.net>
	<4C5FB8BB-861E-4425-9F96-17D8E337A5AA@omniti.com>
Message-ID: <04185D4F-4394-4517-B395-89012DB9BE66@4amlunch.net>

Yeah, I think I already had one of them (the init()-related one that Hans gave me), but I wonder if the other one is somehow related?

I've installed the updates.

I'll re-create the pool and re-run iozone.

-brian

> On Dec 17, 2015, at 2:21 PM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> 
>> On Dec 17, 2015, at 2:20 PM, Brian Hechinger <wonko at 4amlunch.net> wrote:
>> 
>> That seems... unlikely to me?
>> 
>> I'll put one of them into a Linux box and see what happens with it.
>> 
>> Is there a way to somehow see if the nvme drivers are being wonky? I get the feeling NVMe 1.1 cards aren't completely supported just yet?
> 
> OH SHOOT!  I forgot these are NVMe.
> 
> Did you see my mail announcing the update?  Did you see it has two NVME fixes in it?
> 
> Dan
> 


From wonko at 4amlunch.net  Thu Dec 17 19:38:59 2015
From: wonko at 4amlunch.net (Brian Hechinger)
Date: Thu, 17 Dec 2015 14:38:59 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <04185D4F-4394-4517-B395-89012DB9BE66@4amlunch.net>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
	<584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
	<4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>
	<7D06CC38-9841-4189-80CD-6341E025B10C@4amlunch.net>
	<A9FE4FCF-F381-4204-AECB-947D15110794@4amlunch.net>
	<1A509267-5ADB-451C-A540-5F49367B7C22@omniti.com>
	<F9678155-6710-457B-8FD2-2EBF5BD4B078@4amlunch.net>
	<7FF77844-B41C-4837-A81F-42F32EF72AAC@omniti.com>
	<FCD70D85-8C7C-476A-96F7-E9939D07ADDE@4amlunch.net>
	<4C5FB8BB-861E-4425-9F96-17D8E337A5AA@omniti.com>
	<04185D4F-4394-4517-B395-89012DB9BE66@4amlunch.net>
Message-ID: <31E95CC0-CFE7-46BE-ABF2-F099ADBB7527@4amlunch.net>

And...

  pool: zoom
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        zoom          DEGRADED     0     0    25
          mirror-0    DEGRADED     0     0   150
            c4t1d0s1  DEGRADED     0     0   150  too many errors
            c5t1d0s1  DEGRADED     0     0   154  too many errors

So those patches didn't help. :(

-brian

> On Dec 17, 2015, at 2:23 PM, Brian Hechinger <wonko at 4amlunch.net> wrote:
> 
> Yeah, I think I already had one of them (the init()-related one that Hans gave me), but I wonder if the other one is somehow related?
> 
> I've installed the updates.
> 
> I'll re-create the pool and re-run iozone.
> 
> -brian
> 
>> On Dec 17, 2015, at 2:21 PM, Dan McDonald <danmcd at omniti.com> wrote:
>> 
>> 
>>> On Dec 17, 2015, at 2:20 PM, Brian Hechinger <wonko at 4amlunch.net> wrote:
>>> 
>>> That seems... unlikely to me?
>>> 
>>> I'll put one of them into a Linux box and see what happens with it.
>>> 
>>> Is there a way to somehow see if the nvme drivers are being wonky? I get the feeling NVMe 1.1 cards aren't completely supported just yet?
>> 
>> OH SHOOT!  I forgot these are NVMe.
>> 
>> Did you see my mail announcing the update?  Did you see it has two NVME fixes in it?
>> 
>> Dan
>> 
> 


From danmcd at omniti.com  Thu Dec 17 19:44:28 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 17 Dec 2015 14:44:28 -0500
Subject: [OmniOS-discuss] Hung ZFS Pool
In-Reply-To: <31E95CC0-CFE7-46BE-ABF2-F099ADBB7527@4amlunch.net>
References: <25D313E7-C974-43C0-817E-3A96514BCC16@4amlunch.net>
	<A1BD9243-CD1A-47D9-BC9B-C486DEB19330@omniti.com>
	<3FF750E3-A2C5-467C-A0D2-BDCC8C48C9CA@4amlunch.net>
	<7F5D451E-6467-4A3D-8785-AE069524452A@omniti.com>
	<A55AA698-B8D7-4041-AB97-F37055DBCDB9@4amlunch.net>
	<4B858828-C823-4251-84A9-417028B01B3C@omniti.com>
	<584980F4-502A-4700-A58F-E720CB398BF0@4amlunch.net>
	<4B0CFB00-2181-4E38-B0E1-8AAAA3E6136C@omniti.com>
	<7D06CC38-9841-4189-80CD-6341E025B10C@4amlunch.net>
	<A9FE4FCF-F381-4204-AECB-947D15110794@4amlunch.net>
	<1A509267-5ADB-451C-A540-5F49367B7C22@omniti.com>
	<F9678155-6710-457B-8FD2-2EBF5BD4B078@4amlunch.net>
	<7FF77844-B41C-4837-A81F-42F32EF72AAC@omniti.com>
	<FCD70D85-8C7C-476A-96F7-E9939D07ADDE@4amlunch.net>
	<4C5FB8BB-861E-4425-9F96-17D8E337A5AA@omniti.com>
	<04185D4F-4394-4517-B395-89012DB9BE66@4amlunch.net>
	<31E95CC0-CFE7-46BE-ABF2-F099ADBB7527@4amlunch.net>
Message-ID: <440ADF5A-0FFB-40E3-B4FB-17015DAA0C91@omniti.com>


> On Dec 17, 2015, at 2:38 PM, Brian Hechinger <wonko at 4amlunch.net> wrote:
> 
> So those patches didn't help. :(

Hmm.  Try one on your Linux box, and I do know that we really only support NVMe 1.0 currently, not any higher revisions.

You may also need to kick this out to the illumos list, where the NVMe developer can see it.

Dan


From stephan.budach at JVM.DE  Sun Dec 20 21:16:22 2015
From: stephan.budach at JVM.DE (Stephan Budach)
Date: Sun, 20 Dec 2015 22:16:22 +0100
Subject: [OmniOS-discuss] OmniOS r151016 crashed and rebootet
Message-ID: <56771AA6.90501@jvm.de>

Hi all,

a couple of hours ago one of my OmniOS boxes crashed and rebooted. I'd
like to determine why that happened, and could use some advice on how
to do that. There is a vmdump.0 available, but I lack the knowledge of
what to do with it.

Could anyone fill me in on that?

Thanks,
Stephan

From stephan.budach at JVM.DE  Sun Dec 20 21:54:09 2015
From: stephan.budach at JVM.DE (Stephan Budach)
Date: Sun, 20 Dec 2015 22:54:09 +0100
Subject: [OmniOS-discuss] OmniOS r151016 crashed and rebootet
In-Reply-To: <56771AA6.90501@jvm.de>
References: <56771AA6.90501@jvm.de>
Message-ID: <56772381.8040606@jvm.de>

A little addendum...

Am 20.12.15 um 22:16 schrieb Stephan Budach:
> Hi all,
>
> a couple of hours ago one of my OmniOS boxes crashed and rebooted. I'd
> like to determine why that happened, and could use some advice on how
> to do that. There is a vmdump.0 available, but I lack the knowledge of
> what to do with it.
>
> Could anyone fill me in on that?
>
> Thanks,
> Stephan 

I found this while digging through fmdump's log:

root@nfsvmpool07:/root# fmdump -Vp -u 1e24474c-0077-cad1-e684-8f3b0f950af6
TIME                           UUID                                 SUNW-MSG-ID
Dez 20 2015 20:26:34.062346000 1e24474c-0077-cad1-e684-8f3b0f950af6 SUNOS-8000-KL

   TIME                 CLASS                                 ENA
   Dez 20 20:26:33.8727 ireport.os.sunos.panic.dump_available 0x0000000000000000
   Dez 20 20:23:14.5491 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000

nvlist version: 0
         version = 0x0
         class = list.suspect
         uuid = 1e24474c-0077-cad1-e684-8f3b0f950af6
         code = SUNOS-8000-KL
         diag-time = 1450639593 895469
         de = fmd:///module/software-diagnosis
         fault-list-sz = 0x1
         fault-list = (array of embedded nvlists)
         (start fault-list[0])
         nvlist version: 0
                 version = 0x0
                 class = defect.sunos.kernel.panic
                 certainty = 0x64
                 asru = sw:///:path=/var/crash/unknown/.1e24474c-0077-cad1-e684-8f3b0f950af6
                 resource = sw:///:path=/var/crash/unknown/.1e24474c-0077-cad1-e684-8f3b0f950af6
                 savecore-succcess = 1
                 dump-dir = /var/crash/unknown
                 dump-files = vmdump.0
                 os-instance-uuid = 1e24474c-0077-cad1-e684-8f3b0f950af6
                 panicstr = kernel heap corruption detected
                 panicstack = fffffffffba4e8d4 () | genunix:kmem_slab_free+c1 () | genunix:kmem_magazine_destroy+6e () | genunix:kmem_depot_ws_reap+5d () | genunix:kmem_cache_magazine_purge+110 () | genunix:kmem_cache_magazine_resize+40 () | genunix:taskq_thread+2d0 () | unix:thread_start+8 () |
                 crashtime = 1450638112
                 panic-time = Sun Dec 20 20:01:52 2015 CET
         (end fault-list[0])

         fault-status = 0x1
         severity = Major
         __ttl = 0x1
         __tod = 0x567700ea 0x3b75310

Cheers,
Stephan

From danmcd at omniti.com  Sun Dec 20 23:36:37 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Sun, 20 Dec 2015 18:36:37 -0500
Subject: [OmniOS-discuss] OmniOS r151016 crashed and rebootet
In-Reply-To: <56771AA6.90501@jvm.de>
References: <56771AA6.90501@jvm.de>
Message-ID: <213D3342-ABEA-4AA1-A8B5-773940300451@omniti.com>

Place it somewhere I can download it unless it has customer information.  If it does, mail me offline for instructions.

Dan

Sent from my iPhone (typos, autocorrect, and all)

> On Dec 20, 2015, at 4:16 PM, Stephan Budach <stephan.budach at JVM.DE> wrote:
> 
> Hi all,
> 
> a couple of hours ago one of my OmniOS boxes crashed and rebooted. I'd like to determine why that happened, and could use some advice on how to do that. There is a vmdump.0 available, but I lack the knowledge of what to do with it.
> 
> Could anyone fill me in on that?
> 
> Thanks,
> Stephan
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

From bfriesen at simple.dallas.tx.us  Mon Dec 21 21:29:48 2015
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Mon, 21 Dec 2015 15:29:48 -0600 (CST)
Subject: [OmniOS-discuss] rsync 3.1.2 & security fix
Message-ID: <alpine.GSO.2.01.1512211527430.28454@freddy.simplesystems.org>

Rsync 3.1.2 is out and contains a security fix.  OmniOS seems to be 
using 3.1.1.  See 
"http://rsync.samba.org/ftp/rsync/src/rsync-3.1.2-NEWS".

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From bfriesen at simple.dallas.tx.us  Mon Dec 21 23:26:02 2015
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Mon, 21 Dec 2015 17:26:02 -0600 (CST)
Subject: [OmniOS-discuss] OmniOS r151016 zone has difficulties shutting
 down
In-Reply-To: <536501D2-EA96-4F6B-8CB2-39A0F9698267@omniti.com>
References: <alpine.GSO.2.01.1512061727590.1673@freddy.simplesystems.org>
	<536501D2-EA96-4F6B-8CB2-39A0F9698267@omniti.com>
Message-ID: <alpine.GSO.2.01.1512211720120.28454@freddy.simplesystems.org>

>
>> Have others encountered this issue?  What can be done to fix it?
>
> This message is printed by zoneadmd.  If you or anyone else encounters this hang again, please do the following:
>
> 1.) While zoneadm is hung, check the console for the above message, you'll see a pid for zoneadmd (Bob's example was 17388).
>
> 2.) See if you can get the stack(s) of zoneadmd that reported the console master error:    pstack <PID>
>
> 3.) Grab a corefile of the zoneadmd:  gcore <PID>
>
> 4.) Share the corefile somehow.
>
> The pstack and core of the running/hung zoneadm(1M) command would also be useful, I think.

I captured some data (as described above) and have made it available 
for anonymous ftp at 
"ftp://ftp.simplesystems.org/pub/outgoing/omnios/zoneadmd/".  I did 
this prior to updating the system due to suspecting that the problem 
would be cured by rebooting the system.

As suspected, the problem was cured by rebooting the system.  Perhaps 
the parent zoneadmd is confused about the state after the new zone has 
been added and this confusion carries over to the child zoneadmd.

The creation of the zone follows the example from the OmniOS Wiki 
except for the addition of a lofs mount.

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From danmcd at omniti.com  Tue Dec 22 01:02:59 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 21 Dec 2015 20:02:59 -0500
Subject: [OmniOS-discuss] OmniOS r151016 zone has difficulties shutting
	down
In-Reply-To: <alpine.GSO.2.01.1512211720120.28454@freddy.simplesystems.org>
References: <alpine.GSO.2.01.1512061727590.1673@freddy.simplesystems.org>
	<536501D2-EA96-4F6B-8CB2-39A0F9698267@omniti.com>
	<alpine.GSO.2.01.1512211720120.28454@freddy.simplesystems.org>
Message-ID: <C328ACF6-18BC-49AB-BFA8-5E47963D8C81@omniti.com>

I forgot to mention "zoneadm list -cv".  That would've shown the zone's state.

The pstack for zoneadmd showed this function:

http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/zoneadmd/zoneadmd.c#1089

is waiting for something, and only exits its loop after a long wait.

I'm curious which of these is the case:  zone_get_state() failing, or the "zstate == ZONE_STATE_INSTALLED" check failing?  That's why the "list -cv" would've been nice.

If this ever happens again, some more useful captures:

	ptree `pgrep -z <zonename>`

	pargs `pgrep -z <zonename>`

	pstack `pgrep -z <zonename>`

That'll be a LOT of output, BUT it may provide more clues.
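
The three captures above could be wrapped up so the output lands in files that are easy to share. This is only a sketch: the function name zone_capture and the one-file-per-tool layout are assumptions, while ptree, pargs, pstack and pgrep -z are the commands already mentioned.

```shell
# Hypothetical wrapper around the three captures listed above.
# usage: zone_capture <zonename> <output-dir>
zone_capture() {
    zone=$1
    outdir=$2
    mkdir -p "$outdir" || return 1

    # pgrep -z lists the PIDs of the processes running in the zone;
    # it exits non-zero when nothing matches.
    pids=$(pgrep -z "$zone") || {
        echo "no processes found in zone $zone" >&2
        return 1
    }

    for tool in ptree pargs pstack; do
        # $pids is left unquoted on purpose so that each PID becomes
        # its own argument.
        $tool $pids > "$outdir/$tool.out" 2>&1
    done
}
```

Something like "zone_capture myzone /var/tmp/myzone-hang" would then leave ptree.out, pargs.out and pstack.out in that directory (the zone name and path here are placeholders).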

Thanks for this, and sorry I don't have more immediately useful data.

Dan


From danmcd at omniti.com  Tue Dec 22 01:17:27 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 21 Dec 2015 20:17:27 -0500
Subject: [OmniOS-discuss] rsync 3.1.2 & security fix
In-Reply-To: <alpine.GSO.2.01.1512211527430.28454@freddy.simplesystems.org>
References: <alpine.GSO.2.01.1512211527430.28454@freddy.simplesystems.org>
Message-ID: <06447436-873B-436B-9686-5B632F9D5C9A@omniti.com>


> On Dec 21, 2015, at 4:29 PM, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:
> 
> Rsync 3.1.2 is out and contains a security fix.  OmniOS seems to be using 3.1.1.  See "http://rsync.samba.org/ftp/rsync/src/rsync-3.1.2-NEWS".

So much for a full vacation day...

Watch this space for an update later this evening.

Thank you for pointing this out, security is important!!!
Dan


From danmcd at omniti.com  Tue Dec 22 01:29:12 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 21 Dec 2015 20:29:12 -0500
Subject: [OmniOS-discuss] SECURITY UPDATE rsync to 3.1.2
Message-ID: <C716581A-7FC6-4DFF-B772-89E3EC92F933@omniti.com>

Hello!

Thanks go out to Bob Friesenhahn for reminding me about today's rsync security update.  It is now pushed out for r151014 (LTS) and r151016 (Stable).

Happy updating!
Dan

p.s. I'm officially on vacation the rest of this year, so pardon any latency increases until 2016.


From bfriesen at simple.dallas.tx.us  Tue Dec 22 02:05:54 2015
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Mon, 21 Dec 2015 20:05:54 -0600 (CST)
Subject: [OmniOS-discuss] rsync 3.1.2 & security fix
In-Reply-To: <06447436-873B-436B-9686-5B632F9D5C9A@omniti.com>
References: <alpine.GSO.2.01.1512211527430.28454@freddy.simplesystems.org>
	<06447436-873B-436B-9686-5B632F9D5C9A@omniti.com>
Message-ID: <alpine.GSO.2.01.1512212004430.28454@freddy.simplesystems.org>

On Mon, 21 Dec 2015, Dan McDonald wrote:

>
>> On Dec 21, 2015, at 4:29 PM, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:
>>
>> Rsync 3.1.2 is out and contains a security fix.  OmniOS seems to be using 3.1.1.  See "http://rsync.samba.org/ftp/rsync/src/rsync-3.1.2-NEWS".
>
> So much for a full vacation day...
>
> Watch this space for an update later this evening.
>
> Thank you for pointing this out, security is important!!!

This is yet another case where the sending side can send something bad 
to cause harm to the recipient.  It is only a problem if you don't 
trust the sending side.

I agree that security is important.

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From hasslerd at gmx.li  Tue Dec 22 10:50:22 2015
From: hasslerd at gmx.li (Dominik Hassler)
Date: Tue, 22 Dec 2015 11:50:22 +0100
Subject: [OmniOS-discuss] OmniOS r151016 zone has difficulties shutting
 down
In-Reply-To: <C328ACF6-18BC-49AB-BFA8-5E47963D8C81@omniti.com>
References: <alpine.GSO.2.01.1512061727590.1673@freddy.simplesystems.org>
	<536501D2-EA96-4F6B-8CB2-39A0F9698267@omniti.com>
	<alpine.GSO.2.01.1512211720120.28454@freddy.simplesystems.org>
	<C328ACF6-18BC-49AB-BFA8-5E47963D8C81@omniti.com>
Message-ID: <56792AEE.10807@gmx.li>

Dan,

I remember that in my cases when a zone shutdown got stuck, "zoneadm 
list -cv" showed the state of the hung zone as: shutting_down


On 12/22/2015 02:02 AM, Dan McDonald wrote:
> I forgot to mention "zoneadm list -cv".  That would've shown the zone's state.
>
> The pstack for zoneadmd showed this function:
>
> http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/zoneadmd/zoneadmd.c#1089
>
> is waiting for something to only exit a loop after a long wait.
>
> I'm curious which of:  zone_get_state() failing or "zstate == ZONE_STATE_INSTALLED" fails?  That's why the "list -cv" would've been nice.
>
> If this happens ever again, some more useful captures:
>
> 	ptree `pgrep -z <zonename>`
>
> 	pargs `pgrep -z <zonename>`
>
> 	pstack `pgrep -z <zonename>`
>
> That'll be a LOT of output, BUT it may provide more clues.
>
> Thanks for this, and sorry I don't have more immediately useful data.
>
> Dan
>

From jz+omni at neexistuje.sk  Tue Dec 22 16:15:39 2015
From: jz+omni at neexistuje.sk (Juraj Ziegler)
Date: Tue, 22 Dec 2015 17:15:39 +0100
Subject: [OmniOS-discuss] SECURITY UPDATE rsync to 3.1.2
In-Reply-To: <C716581A-7FC6-4DFF-B772-89E3EC92F933@omniti.com>
References: <C716581A-7FC6-4DFF-B772-89E3EC92F933@omniti.com>
Message-ID: <E3E8662F-9D31-427B-9E7B-458904C5A735@neexistuje.sk>


> On 22.12.2015, at 2:29, Dan McDonald <danmcd at omniti.com> wrote:
> 
> Hello!
> 
> Thanks go out to Bob Friesenhahn for reminding me about today's rsync security update.  It is now pushed out for r151014 (LTS) and r151016 (Stable).
> 
> Happy updating!
> Dan
> 
> p.s. I'm officially on vacation the rest of this year, so pardon any latency increases until 2016.

Am I doing something wrong, or is something else wrong?
rsync is not updating for me.

As shown below, "pkg update -nv" says there's nothing to update.
pkg is subscribed to the r151016 publisher.
rsync is 3.1.1.

(Personally, I don't mind the vacation latency, but other users might be affected by this as well.)
 

root@box:/root# rsync --version
rsync  version 3.1.1  protocol version 31
Copyright (C) 1996-2014 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 32-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
    append, ACLs, xattrs, iconv, symtimes, no prealloc

rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details.

root@box:/root# uname -a
SunOS box 5.11 omnios-b5093df i86pc i386 i86pc

root@box:/root# cat /etc/release
  OmniOS v11 r151016
  Copyright 2015 OmniTI Computer Consulting, Inc. All rights reserved.
  Use is subject to license terms.

root@box:/root# pkg publisher
PUBLISHER                   TYPE     STATUS P LOCATION
omnios                      origin   online F http://pkg.omniti.com/omnios/r151016/
ms.omniti.com               origin   online F http://pkg.omniti.com/omniti-ms/
niksula.hut.fi              origin   online F http://pkg.niksula.hut.fi/
omnios.blackdot.be          origin   online F http://omnios.blackdot.be/
uulm.mawi                   origin   online F http://scott.mathematik.uni-ulm.de/release/

root@box:/root# pkg update -nv
No updates available for this image.


j.


From danmcd at omniti.com  Tue Dec 22 16:47:35 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 22 Dec 2015 11:47:35 -0500
Subject: [OmniOS-discuss] SECURITY UPDATE rsync to 3.1.2
In-Reply-To: <E3E8662F-9D31-427B-9E7B-458904C5A735@neexistuje.sk>
References: <C716581A-7FC6-4DFF-B772-89E3EC92F933@omniti.com>
	<E3E8662F-9D31-427B-9E7B-458904C5A735@neexistuje.sk>
Message-ID: <206EEEE6-E838-4BB6-8008-5C99E3D38E7F@omniti.com>


> On Dec 22, 2015, at 11:15 AM, Juraj Ziegler <jz+omni at neexistuje.sk> wrote:
> 
> Am I doing something wrong, or is something else wrong?
> rsync is not updating for me.
> 
> As shown below, "pkg update -nv" says there's nothing to update.
> pkg is subscribed to r151016 publisher.
> rsync is 3.1.1.
> 
> (Personally, I don't mind the vacation latency, but other users might be affected by this as well).

Where is your rsync coming from?  Utter this:

	pkg list rsync

and see if you have output.  Maybe you have your own version somewhere?  Maybe it's from one of the other publishers you mention:

> root at box:/root# pkg publisher
> PUBLISHER                   TYPE     STATUS P LOCATION
> omnios                      origin   online F http://pkg.omniti.com/omnios/r151016/
> ms.omniti.com               origin   online F http://pkg.omniti.com/omniti-ms/
> niksula.hut.fi              origin   online F http://pkg.niksula.hut.fi/
> omnios.blackdot.be              origin   online F http://omnios.blackdot.be/
> uulm.mawi                   origin   online F http://scott.mathematik.uni-ulm.de/release/
> 
> root at box:/root# pkg update -nv
> No updates available for this image.

It's most certainly available:

	pkg list -avf -g http://pkg.omniti.com/omniti-ms/r151016 rsync

You should see two versions, 3.1.1 and 3.1.2.

I think you're getting your rsync from one of the other publishers mentioned on your list.

Dan


From davide.poletto at gmail.com  Tue Dec 22 21:37:49 2015
From: davide.poletto at gmail.com (Davide Poletto)
Date: Tue, 22 Dec 2015 22:37:49 +0100
Subject: [OmniOS-discuss] SECURITY UPDATE rsync to 3.1.2
In-Reply-To: <206EEEE6-E838-4BB6-8008-5C99E3D38E7F@omniti.com>
References: <C716581A-7FC6-4DFF-B772-89E3EC92F933@omniti.com>
	<E3E8662F-9D31-427B-9E7B-458904C5A735@neexistuje.sk>
	<206EEEE6-E838-4BB6-8008-5C99E3D38E7F@omniti.com>
Message-ID: <CANKMAMa3wTanZPOapTWVH2-Px=zY1tQw0m2zoGQFYDMatuq+6g@mail.gmail.com>

Just for information: on an r151014 system (on which rsync was not installed) the
command:

pkg list -avf -g http://pkg.omniti.com/omniti-ms/r151014 rsync

provides:

Errors were encountered while attempting to retrieve package or file data
for
the requested operation.
Details follow:

http protocol error: code: 400 reason: Bad Request
URL: 'http://pkg.omniti.com/omniti-ms/r151014/versions/0/'

The same result happens using the r151016 string instead of r151014.

Instead running:

pkg list -avf -g http://pkg.omniti.com/omnios/r151014 rsync

provides what is expected:

FMRI
IFO
pkg://omnios/network/rsync at 3.1.2-0.151014:20151222T011609Z
---
pkg://omnios/network/rsync at 3.1.1-0.151014:20150402T174523Z
---

So rsync could be provided by omnios publisher.
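
For anyone who wants to compare these listings in a script, here is a minimal Python sketch that splits an FMRI of the shape shown above into its fields and picks the newest version. The regular expression is my own approximation of that shape (with an optional ",build" part), not the parser that pkg itself uses, and the names parse_fmri and version_key are made up for illustration. It also accepts " at " in place of "@", since some list archives rewrite that character.

```python
import re
from datetime import datetime, timezone

# Approximate shape: pkg://<publisher>/<name>@<version>[,<build>]-<branch>:<timestamp>
FMRI_RE = re.compile(
    r"^pkg://(?P<publisher>[^/]+)/(?P<name>[^@]+)@"
    r"(?P<version>[^,:-]+)(?:,(?P<build>[^:-]+))?-(?P<branch>[^:]+)"
    r":(?P<timestamp>\d{8}T\d{6}Z)$"
)


def parse_fmri(fmri):
    """Split an FMRI string into a dict of fields; raise ValueError on mismatch."""
    # Some mail archives rewrite "@" as " at "; undo that before matching.
    text = fmri.strip().replace(" at ", "@")
    match = FMRI_RE.match(text)
    if match is None:
        raise ValueError("unrecognised FMRI: %r" % fmri)
    fields = match.groupdict()
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    return fields


def version_key(fields):
    """Sort key for purely numeric dotted versions such as 3.1.2."""
    return tuple(int(part) for part in fields["version"].split("."))


listing = [
    "pkg://omnios/network/rsync@3.1.2-0.151014:20151222T011609Z",
    "pkg://omnios/network/rsync@3.1.1-0.151014:20150402T174523Z",
]
newest = max((parse_fmri(f) for f in listing), key=version_key)
print(newest["publisher"], newest["name"], newest["version"], newest["branch"])
```

The sort key assumes numeric dotted versions only; anything else would need a richer comparison.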

Looking at r151016:

pkg list -avf -g http://pkg.omniti.com/omnios/r151016 rsync

gives:

FMRI                                                         IFO
pkg://omnios/network/rsync@3.1.2-0.151016:20151222T011220Z   ---
pkg://omnios/network/rsync@3.1.1-0.151016:20151102T185945Z   ---

Once rsync is installed (from the omnios publisher), invoking:

pkg list -avf -g http://pkg.omniti.com/omnios/r151014 rsync

returns:

FMRI                                                         IFO
pkg://omnios/network/rsync@3.1.2-0.151014:20151222T011609Z   i--
pkg://omnios/network/rsync@3.1.1-0.151014:20150402T174523Z   ---


From danmcd at omniti.com  Tue Dec 22 22:04:28 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 22 Dec 2015 17:04:28 -0500
Subject: [OmniOS-discuss] SECURITY UPDATE rsync to 3.1.2
In-Reply-To: <CANKMAMa3wTanZPOapTWVH2-Px=zY1tQw0m2zoGQFYDMatuq+6g@mail.gmail.com>
References: <C716581A-7FC6-4DFF-B772-89E3EC92F933@omniti.com>
	<E3E8662F-9D31-427B-9E7B-458904C5A735@neexistuje.sk>
	<206EEEE6-E838-4BB6-8008-5C99E3D38E7F@omniti.com>
	<CANKMAMa3wTanZPOapTWVH2-Px=zY1tQw0m2zoGQFYDMatuq+6g@mail.gmail.com>
Message-ID: <A3D9F7C1-1AE3-4284-AFDE-A67BDECDEAD5@omniti.com>

I misspelled the URL.  Your URLs are correct.

Dan


From jz+omni at neexistuje.sk  Wed Dec 23 00:12:17 2015
From: jz+omni at neexistuje.sk (Juraj Ziegler)
Date: Wed, 23 Dec 2015 01:12:17 +0100
Subject: [OmniOS-discuss] SECURITY UPDATE rsync to 3.1.2
In-Reply-To: <206EEEE6-E838-4BB6-8008-5C99E3D38E7F@omniti.com>
References: <C716581A-7FC6-4DFF-B772-89E3EC92F933@omniti.com>
	<E3E8662F-9D31-427B-9E7B-458904C5A735@neexistuje.sk>
	<206EEEE6-E838-4BB6-8008-5C99E3D38E7F@omniti.com>
Message-ID: <C2B6E60D-6660-4980-BF41-49B0A2D834D2@neexistuje.sk>


> On 22.12.2015, at 17:47, Dan McDonald <danmcd at omniti.com> wrote:
> 
> 
>> On Dec 22, 2015, at 11:15 AM, Juraj Ziegler <jz+omni at neexistuje.sk> wrote:
>> 
>> Am I doing something wrong, or is something else wrong?
>> rsync is not updating for me.
>> 
>> As shown below, 'pkg update -nv' says there's nothing to update.
>> pkg is subscribed to r151016 publisher.
>> rsync is 3.1.1.
>> 
>> (Personally, I don't mind the vacation latency, but other users might be affected by this as well).
> 
> Where is your rsync coming from?  Utter this:

[...]

> I think you're getting your rsync from one of the other publishers mentioned on your list.

Right you are:

root@box:/root# which rsync
/opt/local/bin/rsync

root@box:/root# pkgin ls | grep rsync
rsync-3.1.1          Network file distribution/synchronisation utility

I had it from pkgsrc.


j.



From gary at genashor.com  Wed Dec 23 00:20:56 2015
From: gary at genashor.com (Gary Gendel)
Date: Tue, 22 Dec 2015 19:20:56 -0500
Subject: [OmniOS-discuss] SECURITY UPDATE rsync to 3.1.2
In-Reply-To: <C2B6E60D-6660-4980-BF41-49B0A2D834D2@neexistuje.sk>
References: <C716581A-7FC6-4DFF-B772-89E3EC92F933@omniti.com>
	<E3E8662F-9D31-427B-9E7B-458904C5A735@neexistuje.sk>
	<206EEEE6-E838-4BB6-8008-5C99E3D38E7F@omniti.com>
	<C2B6E60D-6660-4980-BF41-49B0A2D834D2@neexistuje.sk>
Message-ID: <5679E8E8.3010303@genashor.com>

I had the same situation.  Unfortunately, pkgsrc dirvish has pkgsrc 
rsync as a dependency.  Anyone know how to fool pkgsrc into 
understanding that rsync is already installed?

I manually removed pkgsrc rsync for the time being.

Gary




From doug at will.to  Wed Dec 23 00:39:49 2015
From: doug at will.to (Doug Hughes)
Date: Tue, 22 Dec 2015 19:39:49 -0500
Subject: [OmniOS-discuss] reverse engineering from broken ips
Message-ID: <5679ED55.4080008@will.to>

I had a local IPS server. It went SNAFU and is totally lost.

Has anybody recovered an IPS repo from the locally installed packages that
came from that repo, in order to generate a new one?

/var/pkg/publisher/<repo>/pkg/...

It looks like the complete catalog is in there and could be used to
rebuild a repo.

Ideas?



From jerry1209 at cht.com.tw  Wed Dec 23 00:44:03 2015
From: jerry1209 at cht.com.tw (=?big5?B?sWmubaZ0?=)
Date: Wed, 23 Dec 2015 00:44:03 +0000
Subject: [OmniOS-discuss] How to get NFS read & write latency in OmniOS
	r151016
Message-ID: <58A78BB477E10F419783CE1E1E5185C301227187B7@mbs5.app.corp.cht.com.tw>

Hi all,
According to the release notes for OmniOS r151016, we can now get "IOPS, bandwidth, and latency kstats for NFS server".

The kstat command prints a lot of information; what I want is the NFS read and write latency for the NFS server.

Q1: Do "nfs:0:rfsprocio_v4_write:wtime" and "nfs:0:rfsprocio_v4_read:wtime" represent write and read latency?
Q2: I mounted the NFS share and wrote a lot of files to it, but "nfs:0:rfsprocio_v4_write:wtime" and "nfs:0:rfsprocio_v4_read:wtime" are still zero. Why?

# kstat -p -m nfs -n rfsprocio_v4_write
nfs:0:rfsprocio_v4_write:class        rfsprocio_v4
nfs:0:rfsprocio_v4_write:crtime       50.833043074
nfs:0:rfsprocio_v4_write:nread        3932160
nfs:0:rfsprocio_v4_write:nwritten     5374607360
nfs:0:rfsprocio_v4_write:rcnt         0
nfs:0:rfsprocio_v4_write:reads        163840
nfs:0:rfsprocio_v4_write:rlastupdate  12048225488385
nfs:0:rfsprocio_v4_write:rlentime     33429565743
nfs:0:rfsprocio_v4_write:rtime        23992279289
nfs:0:rfsprocio_v4_write:snaptime     269635.483575440
nfs:0:rfsprocio_v4_write:wcnt         0
nfs:0:rfsprocio_v4_write:wlastupdate  0
nfs:0:rfsprocio_v4_write:wlentime     0
nfs:0:rfsprocio_v4_write:writes       163840            / number of writes /
nfs:0:rfsprocio_v4_write:wtime        0                 / wait queue - time spent waiting /

# kstat -p -m nfs -n rfsprocio_v4_read
nfs:0:rfsprocio_v4_read:class         rfsprocio_v4
nfs:0:rfsprocio_v4_read:crtime        50.833003263
nfs:0:rfsprocio_v4_read:nread         0
nfs:0:rfsprocio_v4_read:nwritten      0
nfs:0:rfsprocio_v4_read:rcnt          0
nfs:0:rfsprocio_v4_read:reads         0
nfs:0:rfsprocio_v4_read:rlastupdate   0
nfs:0:rfsprocio_v4_read:rlentime      0
nfs:0:rfsprocio_v4_read:rtime         0
nfs:0:rfsprocio_v4_read:snaptime      269635.483080962
nfs:0:rfsprocio_v4_read:wcnt          0
nfs:0:rfsprocio_v4_read:wlastupdate   0
nfs:0:rfsprocio_v4_read:wlentime      0
nfs:0:rfsprocio_v4_read:writes        0
nfs:0:rfsprocio_v4_read:wtime         0



Best regards,
---------------------------------------------
???
??????????????
TEL: 03-4245663



From richard.elling at richardelling.com  Wed Dec 23 03:49:05 2015
From: richard.elling at richardelling.com (Richard Elling)
Date: Tue, 22 Dec 2015 19:49:05 -0800
Subject: [OmniOS-discuss] How to get NFS read & write latency in OmniOS
	r151016
In-Reply-To: <58A78BB477E10F419783CE1E1E5185C301227187B7@mbs5.app.corp.cht.com.tw>
References: <58A78BB477E10F419783CE1E1E5185C301227187B7@mbs5.app.corp.cht.com.tw>
Message-ID: <7BC64C53-06FC-465F-BB1E-B242CE7809AD@richardelling.com>


> On Dec 22, 2015, at 4:44 PM, ??? <jerry1209 at cht.com.tw> wrote:
> 
> Hi all,
> According to the release notes for OmniOS r151016, we can now get "IOPS, bandwidth, and latency kstats for NFS server".
>
> The kstat command prints a lot of information; what I want is the NFS read and write latency for the NFS server.
>
> Q1: Do "nfs:0:rfsprocio_v4_write:wtime" and "nfs:0:rfsprocio_v4_read:wtime" represent write and read latency?

No, wtime is the wait queue occupancy (%w in iostat -x).
A good reference is the man page for kstat(3kstat):
	man -s 3kstat kstat

Hopefully, the information there will answer your Q2.
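
As a rough illustration of what can be derived from these cumulative counters, here is a minimal Python sketch that parses `kstat -p` output and computes the mean time per operation in the run queue (rlentime divided by the operation count) plus the run and wait queue occupancy since the kstat was created. The sample values are copied from the write kstat in the question. Using the `writes` counter as the operation count is an assumption: in the sample reads equals writes, which suggests each request is counted once in each, so summing them would halve the result. The helper names parse_kstat and summarize are made up for this sketch.

```python
SAMPLE = """\
nfs:0:rfsprocio_v4_write:class        rfsprocio_v4
nfs:0:rfsprocio_v4_write:crtime       50.833043074
nfs:0:rfsprocio_v4_write:reads        163840
nfs:0:rfsprocio_v4_write:rlentime     33429565743
nfs:0:rfsprocio_v4_write:rtime        23992279289
nfs:0:rfsprocio_v4_write:snaptime     269635.483575440
nfs:0:rfsprocio_v4_write:wlentime     0
nfs:0:rfsprocio_v4_write:writes       163840
nfs:0:rfsprocio_v4_write:wtime        0
"""


def parse_kstat(text):
    """Parse `kstat -p` lines of the form module:instance:name:statistic value."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].count(":") < 3:
            continue  # skip blank lines and anything that is not a statistic
        statistic = fields[0].rsplit(":", 1)[1]
        try:
            stats[statistic] = float(fields[1])
        except ValueError:
            stats[statistic] = fields[1]  # non-numeric, e.g. the class name
    return stats


def summarize(stats, ops):
    """Derive averages from cumulative counters (times in ns, crtime/snaptime in s)."""
    elapsed_ns = (stats["snaptime"] - stats["crtime"]) * 1e9
    return {
        # Sum of per-operation run-queue residence times divided by the op count.
        "mean_run_us": stats["rlentime"] / ops / 1e3,
        # Share of elapsed time during which each queue was non-empty.
        "run_busy_pct": 100.0 * stats["rtime"] / elapsed_ns,
        "wait_busy_pct": 100.0 * stats["wtime"] / elapsed_ns,
    }


stats = parse_kstat(SAMPLE)
# Assumption: one increment of `writes` per request (see the note above).
summary = summarize(stats, ops=stats["writes"])
for name, value in summary.items():
    print("%-14s %.4f" % (name, value))
```

These are averages since crtime; for a live view you would take two snapshots and work on the differences.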
 -- richard


From dan at syneto.eu  Wed Dec 23 08:58:37 2015
From: dan at syneto.eu (Dan Vatca)
Date: Wed, 23 Dec 2015 10:58:37 +0200
Subject: [OmniOS-discuss] How to get NFS read & write latency in OmniOS
	r151016
In-Reply-To: <58A78BB477E10F419783CE1E1E5185C301227187B7@mbs5.app.corp.cht.com.tw>
References: <58A78BB477E10F419783CE1E1E5185C301227187B7@mbs5.app.corp.cht.com.tw>
Message-ID: <CAELW+Ff7z=6LuadSnAudnqi7gZCgXCm3DArrAhGWaurMi3YHOA@mail.gmail.com>

If you need latency, you will most likely need a latency distribution
histogram, and not an average latency.
With averages you will lose latency outliers that are very important.
Here's a good read with lots of references on this topic:
https://www.vividcortex.com/blog/why-percentiles-dont-work-the-way-you-think
To currently do this on OmniOS, you need to use dtrace to aggregate
(quantize) time differences between nfsv3:::op-read-start
and nfsv3:::op-read-done (same for write).
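
To make the point about averages concrete, here is a small Python sketch of power-of-two bucketing in the style of DTrace's quantize() aggregation. It is not DTrace itself, and the latency values are invented for illustration: 99 fast operations and one slow outlier. The mean lands at a value that no operation actually experienced, while the histogram shows both groups.

```python
from collections import Counter


def quantize(values_ns):
    """Count values into power-of-two buckets keyed by each bucket's lower bound."""
    buckets = Counter()
    for value in values_ns:
        # Largest power of two <= value; values below 1 share bucket 0 in this sketch.
        lower = 0 if value < 1 else 1 << (int(value).bit_length() - 1)
        buckets[lower] += 1
    return dict(sorted(buckets.items()))


# Hypothetical latencies in nanoseconds: 99 operations at 0.2 ms, one at 500 ms.
latencies = [200_000] * 99 + [500_000_000]

mean_ns = sum(latencies) / len(latencies)
histogram = quantize(latencies)

print("mean: %.3f ms" % (mean_ns / 1e6))
for lower, count in histogram.items():
    print("%12d ns  %s %d" % (lower, "@" * min(count, 40), count))
```

In a real measurement the values would come from the start/done probe time differences rather than from a made-up list.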


Dan Vatca

CTO at Syneto

Tel: +40723604357, Skype: dan_vatca

<http://www.syneto.net/>


From richard.elling at richardelling.com  Thu Dec 24 22:44:35 2015
From: richard.elling at richardelling.com (Richard Elling)
Date: Thu, 24 Dec 2015 14:44:35 -0800
Subject: [OmniOS-discuss] How to get NFS read & write latency in OmniOS
	r151016
In-Reply-To: <CAELW+Ff7z=6LuadSnAudnqi7gZCgXCm3DArrAhGWaurMi3YHOA@mail.gmail.com>
References: <58A78BB477E10F419783CE1E1E5185C301227187B7@mbs5.app.corp.cht.com.tw>
	<CAELW+Ff7z=6LuadSnAudnqi7gZCgXCm3DArrAhGWaurMi3YHOA@mail.gmail.com>
Message-ID: <0DB958F5-09BC-48D9-924E-4F54A53F4D3A@RichardElling.com>


> On Dec 23, 2015, at 12:58 AM, Dan Vatca <dan at syneto.eu> wrote:
> 
> If you need latency, you will most likely need a latency distribution histogram, and not an average latency.
> With averages you will lose latency outliers that are very important. Here's a good read with lots of references on this topic: https://www.vividcortex.com/blog/why-percentiles-dont-work-the-way-you-think
> To currently do this on OmniOS, you need to use dtrace to aggregate (quantize) time differences between nfsv3:::op-read-start and nfsv3:::op-read-done (same for write).

Indeed, distributions are much more enlightening than averages.

Unfortunately, the new kstats added for NFS server operations on a per-mountpoint basis
are implemented using Riemann sums (KSTAT_TYPE_IO), and it is not possible to obtain the
per-operation information needed for min/max or a distribution. These are the same type of
kstat used by the iostat command.
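
A toy Python model of the run-queue accounting shows why. This is a sketch of the idea only, not the kernel code: the class IoKstat and the helper replay_sequential are invented for illustration, and `ops` stands in for the reads/writes counters. Each queue transition adds (queue length x elapsed time) to rlentime, so two very different workloads with the same total time leave identical counters behind.

```python
class IoKstat:
    """Toy model of the run-queue counters kept by a KSTAT_TYPE_IO kstat."""

    def __init__(self):
        self.rcnt = 0          # operations currently in the run queue
        self.rtime = 0         # cumulative time the queue was non-empty
        self.rlentime = 0      # cumulative (queue length * time): the Riemann sum
        self.rlastupdate = 0   # time of the last transition
        self.ops = 0           # completed operations (hypothetical stand-in)

    def _update(self, now):
        delta = now - self.rlastupdate
        if self.rcnt > 0:
            self.rlentime += delta * self.rcnt
            self.rtime += delta
        self.rlastupdate = now

    def enter(self, now):
        self._update(now)
        self.rcnt += 1

    def exit(self, now):
        self._update(now)
        self.rcnt -= 1
        self.ops += 1


def replay_sequential(durations):
    """Run operations back to back and return the resulting counters."""
    kstat, now = IoKstat(), 0
    for duration in durations:
        kstat.enter(now)
        now += duration
        kstat.exit(now)
    return kstat


steady = replay_sequential([10, 10, 10, 10])   # four equal operations
spiky = replay_sequential([1, 1, 1, 37])       # three fast, one slow outlier

for label, k in (("steady", steady), ("spiky", spiky)):
    print(label, k.ops, k.rtime, k.rlentime, k.rlentime / k.ops)
```

Both runs end with the same ops, rtime and rlentime, so only the mean (10 time units here) can be recovered; the 37-unit outlier is invisible.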

Shameless plug: nfssvrtop has proven to be useful in watching NFS traffic and uses the 
op-read-start/op-read-done method.
https://github.com/richardelling/tools

 -- richard


--

Richard.Elling at RichardElling.com
+1-760-896-4422




From bfriesen at simple.dallas.tx.us  Thu Dec 24 23:36:46 2015
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Thu, 24 Dec 2015 17:36:46 -0600 (CST)
Subject: [OmniOS-discuss] OmniOS and OpenMP
Message-ID: <alpine.GSO.2.01.1512241736200.28454@freddy.simplesystems.org>

GCC compiled programs making use of OpenMP require libgomp in order to run. 
Currently this library is provided as part of the GCC packages. It is necessary 
to install all of GCC in order for dependent programs to be able to run, and 
linker run-path (e.g. -R/opt/gcc-5.1.0/lib) also needs to be specified when 
linking the program.  There was a similar problem for libgcc_s.so but this was 
provided via a runtime package:

% pkg contents system/library/gcc-5-runtime
PATH
usr/lib/amd64
usr/lib/amd64/libgcc_s.so
usr/lib/amd64/libgcc_s.so.1
usr/lib/libgcc_s.so
usr/lib/libgcc_s.so.1

Can a similar runtime package be provided for libgomp (e.g. 
gcc-5-gomp-runtime)?  It would make sense for gomp to be included in 
gcc-5-runtime except that since it was not included from the start, adding it 
now might cause problems.

Thanks,

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From gate03 at landcroft.co.uk  Sun Dec 27 03:15:47 2015
From: gate03 at landcroft.co.uk (Michael Mounteney)
Date: Sun, 27 Dec 2015 13:15:47 +1000
Subject: [OmniOS-discuss] networking from a zone
Message-ID: <20151227131547.1ac15e50@punda-mlia>

Hello, I tried to do this a while ago and Jim Klimov (4 Jan 2015) was
kind enough to reply but I was unable to solve the problem with his
advice.

The problem is that DNS does not work from a non-global zone
(hereunder referred-to as a child zone or CZ) whereas it does
from the global zone (GZ).

My IPFilter rule set is at https://pastebin.com/JYeYDPAb and it is
the problem:  with 'svcadm disable ipfilter' I CAN do DNS from the CZ
and with 'svcadm enable ipfilter' I CANNOT.

Interface e1000g0 is connected to my cable modem (192.168.0.1) and the
interwebs, and e1000g1 is connected to my switch and house network.

The interfaces in the GZ and CZ:

GZ# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              192.168.0.1          UG        3    1517370
127.0.0.1            127.0.0.1            UH        2        236 lo0
192.168.0.0          192.168.0.9          U         3         12 e1000g0
192.168.1.0          192.168.1.1          U        10   60219886 e1000g1

(IPv6 stuff omitted for brevity)

CZ# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              192.168.0.1          UG        3    1517442
127.0.0.1            127.0.0.1            UH        2         24 lo0
192.168.0.0          192.168.0.3          U         3          3 e1000g0
192.168.1.0          192.168.1.3          U         5          0 e1000g1

so the only difference is the IP addresses.

Now with ipfilter disabled:

CZ# nslookup www.gentoo.org
Server:         198.142.235.14
Address:        198.142.235.14#53

Non-authoritative answer:
www.gentoo.org  canonical name = www-bytemark-v4v6.gentoo.org.
Name:   www-bytemark-v4v6.gentoo.org
Address: 89.16.167.134

But with it ENabled:

CZ# nslookup www.gentoo.org
;; connection timed out; no servers could be reached

CZ# ping 89.16.167.134
89.16.167.134 is alive

So pinging works but DNS doesn't.

Obviously, as nslookup in the CZ works with ipfilter disabled, DNS is
configured correctly:

CZ# grep '^hosts:' /etc/nsswitch.conf
hosts:      files dns mdns

CZ# cat /etc/resolv.conf
nameserver 198.142.235.14
nameserver 211.29.132.12
nameserver 198.142.0.51

Picking bits from Jim's responses (4 Jan 2015):

<< For debugging, you can 'snoop' in the zone owning the interface
(GZ for shared, LZ for dedicated VNICs) to check what requests go
out and what does or does not come back in. >>

I tried this but couldn't snoop in the CZ/LZ
("snoop: cannot open "e1000g0": DLPI link does not exist"), and a GZ
snoop didn't show any DNS traffic.

<< rules for e1000g0 in/out comms. name the dynamic address for the
interface as 'e1000g0/32' which may limit to the GZ address. See if
replacing this by the subnet /24 fixes the issue? >>
I did this but no difference.

<< Does the external LZ have a fixed IP address >> Yes

<< you can then pluck in specific rules for its network access then? >>
Now that e1000g0 rules in ipf.conf are all /24 this should not matter.

<< you start with
  block in quick on e1000g0 from 192.168.0.0/16 to any
which may preclude access to your router >>
I tried removing this but no difference.
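
For reference, the address matching behind Jim's two suggestions can be checked with a short Python sketch using the standard ipaddress module. The addresses are the ones from the netstat output above. That 'e1000g0/32' resolves to the GZ address 192.168.0.9 is an assumption taken from Jim's "may limit to the GZ address", and the helper name matches is made up. This only models CIDR matching, not how ipfilter evaluates a rule set.

```python
import ipaddress

GZ_ADDR = ipaddress.ip_address("192.168.0.9")   # global zone address on e1000g0
CZ_ADDR = ipaddress.ip_address("192.168.0.3")   # child zone address on e1000g0
ROUTER = ipaddress.ip_address("192.168.0.1")    # cable modem


def matches(addr, cidr):
    """True if addr falls inside cidr; host bits in cidr are masked off."""
    return addr in ipaddress.ip_network(cidr, strict=False)


# Assumed expansion of 'e1000g0/32' and 'e1000g0/24' from the GZ address.
print("CZ in 192.168.0.9/32:", matches(CZ_ADDR, "192.168.0.9/32"))
print("CZ in 192.168.0.9/24:", matches(CZ_ADDR, "192.168.0.9/24"))

# The 'block in quick on e1000g0 from 192.168.0.0/16 to any' source range.
print("router in 192.168.0.0/16:", matches(ROUTER, "192.168.0.0/16"))
print("CZ in 192.168.0.0/16:", matches(CZ_ADDR, "192.168.0.0/16"))
```

So a /32 rule would cover only the GZ address, a /24 rule covers both zones, and the /16 block source range covers the router as well as both zone addresses.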

<< Also [...] 'ipfstat -hion' [...] 'ipmon | grep -w b' >>

Tried those but couldn't see anything relevant in the output.

The nub of the matter is that something in the ipf.conf is treating the
LZ e1000g0 interface differently from the GZ's e1000g0 but I cannot see
what.

Any assistance would be appreciated.

-- 
______________
Michael Mounteney

From gate03 at landcroft.co.uk  Sun Dec 27 03:18:40 2015
From: gate03 at landcroft.co.uk (Michael Mounteney)
Date: Sun, 27 Dec 2015 13:18:40 +1000
Subject: [OmniOS-discuss] networking from a zone
Message-ID: <20151227131840.4e5467ed@punda-mlia>

Hello, I tried to do this a while ago and Jim Klimov (4 Jan 2015) was
kind enough to reply but I was unable to solve the problem with his advice.

The problem is that DNS does not work from a non-global zone
(hereunder referred-to as a child zone or CZ) whereas it does
from the global zone (GZ).

My IPFilter rule set is at https://pastebin.com/JYeYDPAb and it is
the problem:  with 'svcadm disable ipfilter' I CAN do DNS from the CZ
and with 'svcadm enable ipfilter' I CANNOT.

Interface e1000g0 is connected to my cable modem (192.168.0.1) and the
interwebs, and e1000g1 is connected to my switch and house network.

The interfaces in the GZ and CZ:

GZ# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface 
-------------------- -------------------- ----- ----- ---------- --------- 
default              192.168.0.1          UG        3    1517370           
127.0.0.1            127.0.0.1            UH        2        236 lo0       
192.168.0.0          192.168.0.9          U         3         12 e1000g0   
192.168.1.0          192.168.1.1          U        10   60219886 e1000g1 

(IPv6 stuff omitted for brevity)

CZ# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface 
-------------------- -------------------- ----- ----- ---------- --------- 
default              192.168.0.1          UG        3    1517442           
127.0.0.1            127.0.0.1            UH        2         24 lo0       
192.168.0.0          192.168.0.3          U         3          3 e1000g0   
192.168.1.0          192.168.1.3          U         5          0 e1000g1

so the only difference is the IP addresses.

Now with ipfilter disabled:

CZ# nslookup www.gentoo.org
Server:         198.142.235.14
Address:        198.142.235.14#53

Non-authoritative answer:
www.gentoo.org  canonical name = www-bytemark-v4v6.gentoo.org.
Name:   www-bytemark-v4v6.gentoo.org
Address: 89.16.167.134

But with it ENabled:

CZ# nslookup www.gentoo.org
;; connection timed out; no servers could be reached

CZ# ping 89.16.167.134
89.16.167.134 is alive

So pinging works but DNS doesn't.

Obviously, as nslookup in the CZ works with ipfilter disabled, DNS is
configured correctly:

CZ# grep '^hosts:' /etc/nsswitch.conf
hosts:      files dns mdns

CZ# cat /etc/resolv.conf
nameserver 198.142.235.14
nameserver 211.29.132.12
nameserver 198.142.0.51

Picking bits from Jim's responses (4 Jan 2015):

<< For debugging, you can 'snoop' in the zone owning the interface
(GZ for shared, LZ for dedicated VNICs) to check what requests go
out and what does or does not come back in. >>

I tried this but couldn't snoop in the CZ/LZ
("snoop: cannot open "e1000g0": DLPI link does not exist") and a GZ snoop
didn't show any DNS.

<< rules for e1000g0 in/out comms. name the dynamic address for the
interface as 'e1000g0/32' which may limit to the GZ address. See if
replacing this by the subnet /24 fixes the issue? >>
I did this but no difference.

<< Does the external LZ have a fixed IP address >> Yes

<< you can then pluck in specific rules for its network access then? >>
Now that e1000g0 rules in ipf.conf are all /24 this should not matter.

<< you start with
  block in quick on e1000g0 from 192.168.0.0/16 to any
which may preclude access to your router >>
I tried removing this but no difference.

<< Also [...] 'ipfstat -hion' [...] 'ipmon | grep -w b' >>

Tried those but couldn't see anything relevant in the output.

The nub of the matter is that something in the ipf.conf is treating the LZ e1000g0
interface differently from the GZ's e1000g0 but I cannot see what.

Any assistance would be appreciated.

-- 
______________
Michael Mounteney

From gate03 at landcroft.co.uk  Sun Dec 27 04:49:49 2015
From: gate03 at landcroft.co.uk (Michael Mounteney)
Date: Sun, 27 Dec 2015 14:49:49 +1000
Subject: [OmniOS-discuss] OmniOS stops acting as a DHCP client
In-Reply-To: <5639EAF9.4010605@genashor.com>
References: <20151104194943.3faa0777@coomera> <5639EAF9.4010605@genashor.com>
Message-ID: <20151227144949.6a51cb27@punda-mlia>

On Wed, 4 Nov 2015 06:24:41 -0500 Gary Gendel <gary at genashor.com> wrote:

> Try snooping the nic to see if you get the appropriate DHCP messages 
> flowing in and out of the box.  Make sure you don't have an ipfilter 
> rule blocking this traffic.  You might try to shut down ipfilter just
> to see if it got in the way.

Gary, you were right.  At first I dismissed your solution because I
reasoned that I had not altered the ipfilter rule set so why would it
block the DHCP request when it never did before?  But I had run the
initial DHCP request early during configuration **before configuring
ipfilter**.

The actual problem is the lack of a port=68 rule to let the
lease-response through.
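
A rule of that kind might look like the sketch below. The exact rule is
not quoted in this message, so this is an assumption: the interface name
e1000g0 is borrowed from the zone networking thread, and the ports are the
standard DHCP ones (server 67, client 68, both UDP).

```
# Sketch only, not the rule actually used: let DHCP server replies
# (UDP source port 67) reach the DHCP client (UDP port 68).
# e1000g0 is an assumed interface name.
pass in quick on e1000g0 proto udp from any port = 67 to any port = 68
```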

______________
Michael Mounteney

From jimklimov at cos.ru  Sun Dec 27 10:18:30 2015
From: jimklimov at cos.ru (Jim Klimov)
Date: Sun, 27 Dec 2015 11:18:30 +0100
Subject: [OmniOS-discuss] networking from a zone
In-Reply-To: <20151227131840.4e5467ed@punda-mlia>
References: <20151227131840.4e5467ed@punda-mlia>
Message-ID: <FCBCEC0B-023C-4745-9076-96BF3F5DD7A5@cos.ru>

On 27 December 2015 4:18:40 CET, Michael Mounteney <gate03 at landcroft.co.uk> wrote:
>Hello, I tried to do this a while ago and Jim Klimov (4 Jan 2015) was
>kind enough to reply but I was unable to solve the problem with his
>advice.
>
>The problem is that DNS does not work from a non-global zone
>(hereunder referred-to as a child zone or CZ) whereas it does
>from the global zone (GZ).
>
>My IPFilter rule set is at https://pastebin.com/JYeYDPAb and it is
>the problem:  with 'svcadm disable ipfilter' I CAN do DNS from the CZ
>and with 'svcadm enable ipfilter' I CANNOT.
>
>Interface e1000g0 is connected to my cable modem (192.168.0.1) and the
>interwebs, and e1000g1 is connected to my switch and house network.
>
>The interfaces in the GZ and CZ:
>
>GZ# netstat -rn
>Routing Table: IPv4
>  Destination           Gateway           Flags  Ref     Use     Interface
>-------------------- -------------------- ----- ----- ---------- ---------
>default              192.168.0.1          UG        3    1517370
>127.0.0.1            127.0.0.1            UH        2        236 lo0
>192.168.0.0          192.168.0.9          U         3         12 e1000g0
>192.168.1.0          192.168.1.1          U        10   60219886 e1000g1
>
>(IPv6 stuff omitted for brevity)
>
>CZ# netstat -rn
>Routing Table: IPv4
>  Destination           Gateway           Flags  Ref     Use     Interface
>-------------------- -------------------- ----- ----- ---------- ---------
>default              192.168.0.1          UG        3    1517442
>127.0.0.1            127.0.0.1            UH        2         24 lo0
>192.168.0.0          192.168.0.3          U         3          3 e1000g0
>192.168.1.0          192.168.1.3          U         5          0 e1000g1
>
>so the only difference is the IP addresses.
>
>Now with ipfilter disabled:
>
>CZ# nslookup www.gentoo.org
>Server:         198.142.235.14
>Address:        198.142.235.14#53
>
>Non-authoritative answer:
>www.gentoo.org  canonical name = www-bytemark-v4v6.gentoo.org.
>Name:   www-bytemark-v4v6.gentoo.org
>Address: 89.16.167.134
>
>But with it ENabled:
>
>CZ# nslookup www.gentoo.org
>;; connection timed out; no servers could be reached
>
>CZ# ping 89.16.167.134
>89.16.167.134 is alive
>
>So pinging works but DNS doesn't.
>
>Obviously, as nslookup in the CZ works with ipfilter disabled, DNS is
>configured correctly:
>
>CZ# grep '^hosts:' /etc/nsswitch.conf
>hosts:      files dns mdns
>
>CZ# cat /etc/resolv.conf
>nameserver 198.142.235.14
>nameserver 211.29.132.12
>nameserver 198.142.0.51
>
>Picking bits from Jim's responses (4 Jan 2015):
>
><< For debugging, you can 'snoop' in the zone owning the interface
>(GZ for shared, LZ for dedicated VNICs) to check what requests go
>out and what does or does not come back in. >>
>
>I tried this but couldn't snoop in the CZ/LZ
>("snoop: cannot open "e1000g0": DLPI link does not exist") and a GZ
>snoop didn't show any DNS.
>
><< rules for e1000g0 in/out comms. name the dynamic address for the
>interface as 'e1000g0/32' which may limit to the GZ address. See if
>replacing this by the subnet /24 fixes the issue? >>
>I did this but no difference.
>
><< Does the external LZ have a fixed IP address >> Yes
>
><< you can then pluck in specific rules for its network access then? >>
>Now that e1000g0 rules in ipf.conf are all /24 this should not matter.
>
><< you start with
>  block in quick on e1000g0 from 192.168.0.0/16 to any
>which may preclude access to your router >>
>I tried removing this but no difference.
>
><< Also [...] 'ipfstat -hion' [...] 'ipmon | grep -w b' >>
>
>Tried those but couldn't see anything relevant in the output.
>
>The nub of the matter is that something in the ipf.conf is treating the
>LZ e1000g0 interface differently from the GZ's e1000g0 but I cannot see
>what.
>
>Any assistance would be appreciated.

Hello again ;)

Looking at your pastebin rules, I am a bit concerned about lines 34, 42 and such with 'e1000g0/24'. This may be limiting ipfilter to only the GZ addresses, or to those bound to the GZ at the time of ipfilter startup, or to those wholly owned by the GZ. At least I'm wary of that bit... And from the routing table output, I infer that the local zone is currently on a shared stack, so its interfaces are aliased and set up from the GZ.

If you boot up the local zone and then restart ipfilter in the GZ - does it still misbehave?

See if allowing requests from the subnet by number explicitly would help?
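
As a rough sketch of what "by number" could mean here: the subnet
192.168.0.0/24 is inferred from the netstat output quoted above (GZ at
192.168.0.9, CZ at 192.168.0.3). These rules are illustrative assumptions
and are not taken from the pastebin rule set.

```
# Sketch only: name the subnet explicitly instead of 'e1000g0/24',
# so the rule does not depend on which addresses ipfilter associates
# with the interface name.
pass out quick on e1000g0 proto udp from 192.168.0.0/24 to any port = 53 keep state
pass out quick on e1000g0 proto tcp from 192.168.0.0/24 to any port = 53 flags S keep state
```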

Also, your rules could be a bit optimized by using 'head' and 'group' to separate the int/ext interfaces in/out directions so ipfilter does not have to process the whole ruleset when you know in advance that a rule is not applicable to each and every packet ;)
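
A minimal sketch of the head/group idea, assuming e1000g0 is the external
interface. The group numbers 100 and 150 are arbitrary, and the member
rules are examples only, not rules from the pastebin:

```
# Packets on e1000g0 are handed to a per-direction group; packets on
# other interfaces never evaluate the rules inside these groups.
block in  on e1000g0 all head 100
block out on e1000g0 all head 150

# Member rules are only consulted for packets that matched the head.
pass in  quick proto udp from any port = 67 to any port = 68 group 100
pass out quick proto udp from 192.168.0.0/24 to any port = 53 keep state group 150
```

Anything on e1000g0 that no member rule passes falls back to the 'block'
action of its group head.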

As for snoop and/or libpcap clients not finding interfaces - 'truss' the program to see what it tries to access. Maybe it wants e.g. /dev/e1000g0, so you'd have to go and make symlinks:

cd /dev && ln -s ./net/* .

Some (older/vanilla) sniffer versions could also look for the base device like 'e1000' - I'm not sure how to help that...

Hope this helps,
Jim
--
Typos courtesy of K-9 Mail on my Samsung Android

From gary at genashor.com  Sun Dec 27 13:45:46 2015
From: gary at genashor.com (Gary Gendel)
Date: Sun, 27 Dec 2015 08:45:46 -0500
Subject: [OmniOS-discuss] OmniOS stops acting as a DHCP client
In-Reply-To: <20151227144949.6a51cb27@punda-mlia>
References: <20151104194943.3faa0777@coomera> <5639EAF9.4010605@genashor.com>
	<20151227144949.6a51cb27@punda-mlia>
Message-ID: <567FEB8A.8060900@genashor.com>

On 12/26/2015 11:49 PM, Michael Mounteney wrote:
> On Wed, 4 Nov 2015 06:24:41 -0500 Gary Gendel <gary at genashor.com> wrote:
>
>> Try snooping the nic to see if you get the appropriate DHCP messages
>> flowing in and out of the box.  Make sure you don't have an ipfilter
>> rule blocking this traffic.  You might try to shut down ipfilter just
>> to see if it got in the way.
> Gary, you were right.  At first I dismissed your solution because I
> reasoned that I had not altered the ipfilter rule set so why would it
> block the DHCP request when it never did before?  But I had run the
> initial DHCP request early during configuration **before configuring
> ipfilter**.
>
> The actual problem is the lack of a port=68 rule to let the
> lease-response through.
>
> ______________
> Michael Mounteney
Michael,

It always helps to not assume anything and test everything. I have a 
modern smartphone that gets an OS update quarterly.  After each update, I get 
the same symptom... Every ~20 hours all applications send me 
notifications that I have to log in again.  However, when I check I am 
logged in.  The only cure I found was to wipe the phone clean and 
install each application again manually.  If I restore from backup the 
nonsense starts all over again.  It goes against reason but it happens 
reliably after each OS update.  Software is funny that way.

Gary



From danmcd at omniti.com  Mon Dec 28 21:45:25 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 28 Dec 2015 16:45:25 -0500
Subject: [OmniOS-discuss] OmniOS and OpenMP
In-Reply-To: <alpine.GSO.2.01.1512241736200.28454@freddy.simplesystems.org>
References: <alpine.GSO.2.01.1512241736200.28454@freddy.simplesystems.org>
Message-ID: <C715F70F-F670-4535-A7F2-CB075B00783F@omniti.com>


> On Dec 24, 2015, at 6:36 PM, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:
> 
> GCC compiled programs making use of OpenMP require libgomp in order to run. Currently this library is provided as part of the GCC packages. It is necessary to install all of GCC in order for dependent programs to be able to run, and linker run-path (e.g. -R/opt/gcc-5.1.0/lib) also needs to be specified when linking the program.  There was a similar problem for libgcc_s.so but this was provided via a runtime package:
> 
> % pkg contents system/library/gcc-5-runtime
> PATH
> usr/lib/amd64
> usr/lib/amd64/libgcc_s.so
> usr/lib/amd64/libgcc_s.so.1
> usr/lib/libgcc_s.so
> usr/lib/libgcc_s.so.1
> 
> Can a similar runtime package be provided for libgomp (e.g. gcc-5-gomp-runtime)?  It would make sense for gomp to be included in gcc-5-runtime except that since it was not included from the start, adding it now might cause problems.

It's possible, but I'd have to think about how best to package it up (including dependencies, etc. etc.).  I don't have any objections to including gomp in gcc-5-runtime, and modulo the must-have-the-latest-version problem, it might not be so bad.  It's something to consider for bloody & r151018.

Dan


From danmcd at omniti.com  Mon Dec 28 21:53:25 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 28 Dec 2015 16:53:25 -0500
Subject: [OmniOS-discuss] OmniOS stops acting as a DHCP client
In-Reply-To: <20151227144949.6a51cb27@punda-mlia>
References: <20151104194943.3faa0777@coomera> <5639EAF9.4010605@genashor.com>
	<20151227144949.6a51cb27@punda-mlia>
Message-ID: <23D7ADAF-99A8-4277-A0D0-18D1106CAEA5@omniti.com>


> On Dec 26, 2015, at 11:49 PM, Michael Mounteney <gate03 at landcroft.co.uk> wrote:
> 
> The actual problem is the lack of a port=68 rule to let the
> lease-response through.

Did that clear up this problem?  Also, did your other networking problem clear up with some ipfilter rule fixing?

Dan


From bfriesen at simple.dallas.tx.us  Mon Dec 28 22:50:23 2015
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Mon, 28 Dec 2015 16:50:23 -0600 (CST)
Subject: [OmniOS-discuss] OmniOS and OpenMP
In-Reply-To: <C715F70F-F670-4535-A7F2-CB075B00783F@omniti.com>
References: <alpine.GSO.2.01.1512241736200.28454@freddy.simplesystems.org>
	<C715F70F-F670-4535-A7F2-CB075B00783F@omniti.com>
Message-ID: <alpine.GSO.2.01.1512281645220.28454@freddy.simplesystems.org>

On Mon, 28 Dec 2015, Dan McDonald wrote:

>
>> On Dec 24, 2015, at 6:36 PM, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:
>>
>> GCC compiled programs making use of OpenMP require libgomp in order to run. Currently this library is provided as part of the GCC packages. It is necessary to install all of GCC in order for dependent programs to be able to run, and linker run-path (e.g. -R/opt/gcc-5.1.0/lib) also needs to be specified when linking the program.  There was a similar problem for libgcc_s.so but this was provided via a runtime package:
>>
>> % pkg contents system/library/gcc-5-runtime
>> PATH
>> usr/lib/amd64
>> usr/lib/amd64/libgcc_s.so
>> usr/lib/amd64/libgcc_s.so.1
>> usr/lib/libgcc_s.so
>> usr/lib/libgcc_s.so.1
>>
>> Can a similar runtime package be provided for libgomp (e.g. gcc-5-gomp-runtime)?  It would make sense for gomp to be included in gcc-5-runtime except that since it was not included from the start, adding it now might cause problems.
>
> It's possible, but I'd have to think about how best to package it up (including dependencies, etc. etc.).  I don't have any objections to including gomp in gcc-5-runtime, and modulo the must-have-the-latest-version problem, it might not be so bad.  It's something to consider for bloody & r151018.

Due to the existing problem, I don't think that there are existing 
dependencies to worry about.  It is clear that for existing release 
branches, gcc-5-runtime can not add libraries without the risk of an 
application not running because the gcc-5-runtime vintage is too old. 
This is perhaps not so much of a problem since OmniOS systems should 
be updating regularly.

The gomp library is a bit large so some users might be happier if it 
was optional via its own package and not part of the OmniOS baseline.

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From gate03 at landcroft.co.uk  Tue Dec 29 00:16:58 2015
From: gate03 at landcroft.co.uk (Michael Mounteney)
Date: Tue, 29 Dec 2015 10:16:58 +1000
Subject: [OmniOS-discuss] OmniOS stops acting as a DHCP client
In-Reply-To: <23D7ADAF-99A8-4277-A0D0-18D1106CAEA5@omniti.com>
References: <20151104194943.3faa0777@coomera> <5639EAF9.4010605@genashor.com>
	<20151227144949.6a51cb27@punda-mlia>
	<23D7ADAF-99A8-4277-A0D0-18D1106CAEA5@omniti.com>
Message-ID: <20151229101658.0bbb436e@punda-mlia>

On Mon, 28 Dec 2015 16:53:25 -0500
Dan McDonald <danmcd at omniti.com> wrote:

> > On Dec 26, 2015, at 11:49 PM, Michael Mounteney <gate03 at landcroft.co.uk> wrote:
> > 
> > The actual problem is the lack of a port=68 rule to let the
> > lease-response through.  
> 
> Did that clear up this problem?  Also, did your other networking
> problem clear up with some ipfilter rule fixing?

Hello Dan;  the port=68 rule did clear up the DHCP problem but the
other problem, i.e., DNS from a non-global zone, is still
present.  I haven't had a chance yet to implement all
Jim's suggestions but will ask again on this list when
I've done so.  Thanks for caring.  ;-)

______________ 
Michael Mounteney

From ryan at zinascii.com  Wed Dec 30 01:28:50 2015
From: ryan at zinascii.com (Ryan Zezeski)
Date: Tue, 29 Dec 2015 20:28:50 -0500
Subject: [OmniOS-discuss] Panic, BAD TRAP, r151014, VMWare Fusion 8.1.0
Message-ID: <m28u4c912l.fsf@zinascii.com>


While running a nightly build of illumos-gate the kernel panicked with
"BAD TRAP". Running OmniOS r151014 on VMWare Fusion.

This is not urgent. I am posting in case I have stumbled onto a bug.


VMWare Fusion Version 8.1.0 (3272237)

# cat /etc/release
  OmniOS v11 r151014
  Copyright 2015 OmniTI Computer Consulting, Inc. All rights reserved.
  Use is subject to license terms.

# uname -v
omnios-d08e0e5


You can find the full crash dump here:

http://zinascii.com/pub/illumos/cores/bad-trap-12-29-15/


crash dump info
---------------

> ::status
debugging crash dump /var/crash/unknown/vmcore.0 (64-bit) from omnislash
operating system: 5.11 omnios-d08e0e5 (i86pc)
image uuid: a43f56bd-f5a4-6643-92e6-84262b07c26a
panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff001010c010 addr=fffffffffb8484b0
dump content: kernel pages only

> ::panicinfo
             cpu                2
          thread ffffff02e40fb160
         message BAD TRAP: type=e (#pf Page fault) rp=ffffff001010c010 addr=fffffffffb8484b0
             rdi ffffff001010c110
             rsi fffffffffb8484b0
             rdx                2
             rcx                0
              r8 fffffffffbc723a0
              r9               78
             rax fffffffffbc72420
             rbx         fee19000
             rbp ffffff001010c110
             r10 fffffffffbcf8cb0
             r11 ffffff02e40fb160
             r12                0
             r13                0
             r14 ffffff02fb5e1c00
             r15 fffffffffb8484b0
          fsbase                0
          gsbase ffffff02dbcb4080
              ds               4b
              es               4b
              fs                0
              gs              1c3
          trapno                e
             err               10
             rip fffffffffb8484b0
              cs               30
          rflags            10202
             rsp ffffff001010c108
              ss                0
          gdt_hi                0
          gdt_lo         b00001ef
          idt_hi                0
          idt_lo         a0000fff
             ldt                0
            task               70
             cr0         8005003b
             cr2 ffffff02d90f0ff8
             cr3        234af6000
             cr4            406b8
> ::msgbuf
MESSAGE
pcieb16 is /pci at 0,0/pci15ad,7a0 at 17
PCI Express-device: pci15ad,7a0 at 17,1, pcieb17
pcieb17 is /pci at 0,0/pci15ad,7a0 at 17,1
PCI Express-device: pci15ad,7a0 at 17,2, pcieb18
pcieb18 is /pci at 0,0/pci15ad,7a0 at 17,2
PCI Express-device: pci15ad,7a0 at 17,3, pcieb19
pcieb19 is /pci at 0,0/pci15ad,7a0 at 17,3
PCI Express-device: pci15ad,7a0 at 17,4, pcieb20
pcieb20 is /pci at 0,0/pci15ad,7a0 at 17,4
PCI Express-device: pci15ad,7a0 at 17,5, pcieb21
pcieb21 is /pci at 0,0/pci15ad,7a0 at 17,5
PCI Express-device: pci15ad,7a0 at 17,6, pcieb22
pcieb22 is /pci at 0,0/pci15ad,7a0 at 17,6
PCI Express-device: pci15ad,7a0 at 17,7, pcieb23
pcieb23 is /pci at 0,0/pci15ad,7a0 at 17,7
PCI Express-device: pci15ad,7a0 at 18, pcieb24
pcieb24 is /pci at 0,0/pci15ad,7a0 at 18
PCI Express-device: pci15ad,7a0 at 18,1, pcieb25
pcieb25 is /pci at 0,0/pci15ad,7a0 at 18,1
PCI Express-device: pci15ad,7a0 at 18,2, pcieb26
pcieb26 is /pci at 0,0/pci15ad,7a0 at 18,2
PCI Express-device: pci15ad,7a0 at 18,3, pcieb27
pcieb27 is /pci at 0,0/pci15ad,7a0 at 18,3
PCI Express-device: pci15ad,7a0 at 18,4, pcieb28
pcieb28 is /pci at 0,0/pci15ad,7a0 at 18,4
PCI Express-device: pci15ad,7a0 at 18,5, pcieb29
pcieb29 is /pci at 0,0/pci15ad,7a0 at 18,5
PCI Express-device: pci15ad,7a0 at 18,6, pcieb30
pcieb30 is /pci at 0,0/pci15ad,7a0 at 18,6
PCI Express-device: pci15ad,7a0 at 18,7, pcieb31
pcieb31 is /pci at 0,0/pci15ad,7a0 at 18,7
pseudo-device: stmf_sbd0
stmf_sbd0 is /pseudo/stmf_sbd at 0
PCI Express-device: pci15ad,790 at 11, pci_pci1
pci_pci1 is /pci at 0,0/pci15ad,790 at 11
NOTICE: e1000g0 registered
NOTICE: e1000g0 link up, 1000 Mbps, full duplex
pseudo-device: devinfo0
devinfo0 is /pseudo/devinfo at 0
pseudo-device: zfs0
zfs0 is /pseudo/zfs at 0
WARNING: drmach_init: number of logical CPUs (3) in physical processor is not power of 2.
This Solaris instance has UUID a43f56bd-f5a4-6643-92e6-84262b07c26a
dump on /dev/zvol/dsk/rpool/dump size 4096 MB
pseudo-device: pm0
pm0 is /pseudo/pm at 0
pseudo-device: power0
power0 is /pseudo/power at 0
pseudo-device: srn0
srn0 is /pseudo/srn at 0
iscsi0 at root
iscsi0 is /iscsi
ISA-device: fdc0
fd0 at fdc0
fd0 is /pci at 0,0/isa at 7/fdc at 1,3f0/fd at 0,0
audioens#0: AC'97 codec id Cirrus Logic 0x43525913 (43525913, 2 channels, caps 0)
PCI-device: pci1274,1371 at 1, audioens0
audioens0 is /pci at 0,0/pci15ad,790 at 11/pci1274,1371 at 1
        ATAPI device at targ 0, lun 0 lastlun 0x0
        model VMware Virtual IDE CDROM Drive
        ATA/ATAPI-4 supported, majver 0x1e minver 0x17
PCI Express-device: ide at 1, ata1
ata1 is /pci at 0,0/pci-ide at 7,1/ide at 1
        UltraDMA mode 2 selected
sd1 at ata1: target 0 lun 0
sd1 is /pci at 0,0/pci-ide at 7,1/ide at 1/sd at 0,0
device pciclass,030000 at f(display#0) keeps up device sd at 0,0(sd#1), but the former is not power managed
pseudo-device: pool0
pool0 is /pseudo/pool at 0
pseudo-device: dtrace0
dtrace0 is /pseudo/dtrace at 0
pseudo-device: devinfo0
devinfo0 is /pseudo/devinfo at 0

panic[cpu2]/thread=ffffff02e40fb160:
BAD TRAP: type=e (#pf Page fault) rp=ffffff001010c010 addr=fffffffffb8484b0


make:
#pf Page fault
Bad kernel fault at addr=0xfffffffffb8484b0
pid=19583, pc=0xfffffffffb8484b0, sp=0xffffff001010c108, eflags=0x10202
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 406b8<osxsav,xmme,fxsr,pge,pae,pse,de>
cr2: ffffff02d90f0ff8
cr3: 234af6000
cr8: 0

        rdi: ffffff001010c110 rsi: fffffffffb8484b0 rdx:                2
        rcx:                0  r8: fffffffffbc723a0  r9:               78
        rax: fffffffffbc72420 rbx:         fee19000 rbp: ffffff001010c110
        r10: fffffffffbcf8cb0 r11: ffffff02e40fb160 r12:                0
        r13:                0 r14: ffffff02fb5e1c00 r15: fffffffffb8484b0
        fsb:                0 gsb: ffffff02dbcb4080  ds:               4b
         es:               4b  fs:                0  gs:              1c3
        trp:                e err:               10 rip: fffffffffb8484b0
         cs:               30 rfl:            10202 rsp: ffffff001010c108
         ss:                0

ffffff001010bef0 unix:real_mode_stop_cpu_stage2_end+9e43 ()
ffffff001010c000 unix:trap+db3 ()
ffffff001010c010 unix:cmntrap+e6 ()
ffffff001010c110 unix:trap+0 ()
ffffff001010c210 unix:trap+0 ()
ffffff001010c310 unix:trap+0 ()
ffffff001010c410 unix:trap+0 ()
ffffff001010c510 unix:trap+0 ()
ffffff001010c610 unix:trap+0 ()
ffffff001010c710 unix:trap+0 ()
ffffff001010c810 unix:trap+0 ()
ffffff001010c910 unix:trap+0 ()
ffffff001010ca10 unix:trap+0 ()
ffffff001010cb10 unix:trap+0 ()
ffffff001010cc10 unix:trap+0 ()
ffffff001010cd10 unix:trap+0 ()
ffffff001010ce10 unix:trap+0 ()
ffffff001010cf10 unix:trap+0 ()
ffffff001010d010 unix:trap+0 ()
ffffff001010d110 unix:trap+0 ()
ffffff001010d210 unix:trap+0 ()
ffffff001010d310 unix:trap+0 ()
ffffff001010d410 unix:trap+0 ()
ffffff001010d510 unix:trap+0 ()
ffffff001010d610 unix:trap+0 ()
ffffff001010d710 unix:trap+0 ()
ffffff001010d810 unix:trap+0 ()
ffffff001010d910 unix:trap+0 ()
ffffff001010da10 unix:trap+0 ()
ffffff001010db10 unix:trap+0 ()
ffffff001010dc10 unix:trap+0 ()
ffffff001010dd10 unix:trap+0 ()
ffffff001010de10 unix:trap+0 ()
ffffff001010df10 unix:trap+0 ()

syncing file systems...
 done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel

> fffffffffbc3b540::cpuinfo -v
 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  2 fffffffffbc3b540  1b    1    0  19   no    no t-0    ffffff02e40fb160 make
                       |    |
            RUNNING <--+    +-->  PRI THREAD           PROC
              READY                 0 ffffff02fbb5dae0 gcc
             EXISTS
             ENABLE




             -Ryan

From josh at sysmgr.org  Wed Dec 30 01:51:13 2015
From: josh at sysmgr.org (Joshua M. Clulow)
Date: Tue, 29 Dec 2015 17:51:13 -0800
Subject: [OmniOS-discuss] Panic, BAD TRAP, r151014, VMWare Fusion 8.1.0
In-Reply-To: <m28u4c912l.fsf@zinascii.com>
References: <m28u4c912l.fsf@zinascii.com>
Message-ID: <CAEwA5nJKfSGVb3xkpVfUkOHz0R8qjO04onXMFynP7SnjKfNkjg@mail.gmail.com>

On 29 December 2015 at 17:28, Ryan Zezeski <ryan at zinascii.com> wrote:
> While running a nightly build of illumos-gate the kernel panicked with
> "BAD TRAP". Running OmniOS r151014 on VMWare Fusion.
>
> VMWare Fusion Version 8.1.0 (3272237)

What model of Mac is this, and what model of CPU is in it?  We have
experienced some issues with SmartOS running in VMware Fusion on some
models of Intel CPU.  I believe there is an erratum about spurious
page faults when running a hypervisor that makes use of EPT.


Cheers.

-- 
Joshua M. Clulow
UNIX Admin/Developer
http://blog.sysmgr.org

From ryan at zinascii.com  Wed Dec 30 02:11:34 2015
From: ryan at zinascii.com (Ryan Zezeski)
Date: Tue, 29 Dec 2015 21:11:34 -0500
Subject: [OmniOS-discuss] Panic, BAD TRAP, r151014, VMWare Fusion 8.1.0
In-Reply-To: <CAEwA5nJKfSGVb3xkpVfUkOHz0R8qjO04onXMFynP7SnjKfNkjg@mail.gmail.com>
References: <m28u4c912l.fsf@zinascii.com>
	<CAEwA5nJKfSGVb3xkpVfUkOHz0R8qjO04onXMFynP7SnjKfNkjg@mail.gmail.com>
Message-ID: <m27fjw8z3d.fsf@zinascii.com>


Joshua M. Clulow writes:

> On 29 December 2015 at 17:28, Ryan Zezeski <ryan at zinascii.com> wrote:
>> While running a nightly build of illumos-gate the kernel panicked with
>> "BAD TRAP". Running OmniOS r151014 on VMWare Fusion.
>>
>> VMWare Fusion Version 8.1.0 (3272237)
>
> What model of Mac is this, and what model of CPU is in it?  We have
> experienced some issues with SmartOS running in VMware Fusion on some
> models of Intel CPU.  I believe there is an erratum about spurious
> page faults when running a hypervisor that makes use of EPT.

MacBook Pro Retina 15" Late 2013
Model Identifier: MacBookPro11,3
System Version:	OS X 10.11.2 (15C50)

$ sysctl -a machdep.cpu
machdep.cpu.max_basic: 13
machdep.cpu.max_ext: 2147483656
machdep.cpu.vendor: GenuineIntel
machdep.cpu.brand_string: Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz
machdep.cpu.family: 6
machdep.cpu.model: 70
machdep.cpu.extmodel: 4
machdep.cpu.extfamily: 0
machdep.cpu.stepping: 1
machdep.cpu.feature_bits: 9221960262849657855
machdep.cpu.leaf7_feature_bits: 12219
machdep.cpu.extfeature_bits: 142473169152
machdep.cpu.signature: 263777
machdep.cpu.brand: 0
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_features: SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM FPU_CSDS
machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT RDTSCP TSCI
machdep.cpu.logical_per_package: 16
machdep.cpu.cores_per_package: 8
machdep.cpu.microcode_version: 15
machdep.cpu.processor_flag: 5
machdep.cpu.mwait.linesize_min: 64
machdep.cpu.mwait.linesize_max: 64
machdep.cpu.mwait.extensions: 3
machdep.cpu.mwait.sub_Cstates: 270624
machdep.cpu.thermal.sensor: 1
machdep.cpu.thermal.dynamic_acceleration: 1
machdep.cpu.thermal.invariant_APIC_timer: 1
machdep.cpu.thermal.thresholds: 2
machdep.cpu.thermal.ACNT_MCNT: 1
machdep.cpu.thermal.core_power_limits: 1
machdep.cpu.thermal.fine_grain_clock_mod: 1
machdep.cpu.thermal.package_thermal_intr: 1
machdep.cpu.thermal.hardware_feedback: 0
machdep.cpu.thermal.energy_policy: 1
machdep.cpu.xsave.extended_state: 7 832 832 0
machdep.cpu.xsave.extended_state1: 1 0 0 0
machdep.cpu.arch_perf.version: 3
machdep.cpu.arch_perf.number: 4
machdep.cpu.arch_perf.width: 48
machdep.cpu.arch_perf.events_number: 7
machdep.cpu.arch_perf.events: 0
machdep.cpu.arch_perf.fixed_number: 3
machdep.cpu.arch_perf.fixed_width: 48
machdep.cpu.cache.linesize: 64
machdep.cpu.cache.L2_associativity: 8
machdep.cpu.cache.size: 256
machdep.cpu.tlb.inst.large: 8
machdep.cpu.tlb.data.small: 64
machdep.cpu.tlb.data.small_level1: 64
machdep.cpu.tlb.shared: 1024
machdep.cpu.address_bits.physical: 39
machdep.cpu.address_bits.virtual: 48
machdep.cpu.core_count: 4
machdep.cpu.thread_count: 8
machdep.cpu.tsc_ccc.numerator: 0
machdep.cpu.tsc_ccc.denominator: 0


        -Ryan

From danmcd at omniti.com  Wed Dec 30 04:21:57 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 29 Dec 2015 23:21:57 -0500
Subject: [OmniOS-discuss] Panic, BAD TRAP, r151014, VMWare Fusion 8.1.0
In-Reply-To: <m27fjw8z3d.fsf@zinascii.com>
References: <m28u4c912l.fsf@zinascii.com>
	<CAEwA5nJKfSGVb3xkpVfUkOHz0R8qjO04onXMFynP7SnjKfNkjg@mail.gmail.com>
	<m27fjw8z3d.fsf@zinascii.com>
Message-ID: <9E905831-B7CE-40A9-BD29-196079023D64@omniti.com>

So is r151014 panicking?  Or is your nightly build panicking?  If the latter, it's not strictly an OmniOS problem.  Also, if your uname -a is correct, you need to update your r151014.

Dan

Sent from my iPhone (typos, autocorrect, and all)

> On Dec 29, 2015, at 9:11 PM, Ryan Zezeski <ryan at zinascii.com> wrote:
> 
> 
> Joshua M. Clulow writes:
> 
>>> On 29 December 2015 at 17:28, Ryan Zezeski <ryan at zinascii.com> wrote:
>>> While running a nightly build of illumos-gate the kernel panicked with
>>> "BAD TRAP". Running OmniOS r151014 on VMWare Fusion.
>>> 
>>> VMWare Fusion Version 8.1.0 (3272237)
>> 
>> What model of Mac is this, and what model of CPU is in it?  We have
>> experienced some issues with SmartOS running in VMware Fusion on some
>> models of Intel CPU.  I believe there is an erratum about spurious
>> page faults when running a hypervisor that makes use of EPT.
> 
> MacBook Pro Retina 15" Late 2013
> Model Identifier: MacBookPro11,3
> System Version:    OS X 10.11.2 (15C50)
> 
> 
> 
>       -Ryan

From ryan at zinascii.com  Wed Dec 30 04:36:52 2015
From: ryan at zinascii.com (Ryan Zezeski)
Date: Tue, 29 Dec 2015 23:36:52 -0500
Subject: [OmniOS-discuss] Panic, BAD TRAP, r151014, VMWare Fusion 8.1.0
In-Reply-To: <9E905831-B7CE-40A9-BD29-196079023D64@omniti.com>
References: <m28u4c912l.fsf@zinascii.com>
	<CAEwA5nJKfSGVb3xkpVfUkOHz0R8qjO04onXMFynP7SnjKfNkjg@mail.gmail.com>
	<m27fjw8z3d.fsf@zinascii.com>
	<9E905831-B7CE-40A9-BD29-196079023D64@omniti.com>
Message-ID: <m24mf08sd7.fsf@zinascii.com>


Dan McDonald writes:

> So is r151014 panicking? Or is your nightly build panicking? If the
> latter, it's not strictly an OmniOS problem.

r151014 panicked, not my nightly build. I.e., this was not an ONU'd
system that panicked.

>   Also, if your uname -a is correct, you need to update your r151014.

Yes, I need to update. I haven't touched this system in a while. I
needed to run a nightly build and didn't want to wait to upgrade first.

       -Ryan

From bfriesen at simple.dallas.tx.us  Thu Dec 31 15:45:48 2015
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Thu, 31 Dec 2015 09:45:48 -0600 (CST)
Subject: [OmniOS-discuss] OmniOS r151016 zone has difficulties shutting
 down
In-Reply-To: <C328ACF6-18BC-49AB-BFA8-5E47963D8C81@omniti.com>
References: <alpine.GSO.2.01.1512061727590.1673@freddy.simplesystems.org>
	<536501D2-EA96-4F6B-8CB2-39A0F9698267@omniti.com>
	<alpine.GSO.2.01.1512211720120.28454@freddy.simplesystems.org>
	<C328ACF6-18BC-49AB-BFA8-5E47963D8C81@omniti.com>
Message-ID: <alpine.GSO.2.01.1512310938030.28454@freddy.simplesystems.org>

I would like to share an idea regarding how the newly created zone 
may have difficulties shutting down.

I am accessing the system via ssh, which uses '~.' to terminate the ssh 
session.  This is also the default disconnect sequence for 'zlogin -C'. 
The idea is that after doing some initial administration using 'zlogin 
-C', the "~." sequence was used to quit it.  This terminated the ssh 
session rather than the zone console login.  While a subsequent 'zlogin 
-C' works, it may be that the unclean/violent termination of the zone 
console login left behind residue which prevents the zone from 
shutting down.

Yesterday I created a new zone, which had no problems shutting down. 
I did not use 'zlogin -C' (only ordinary 'zlogin') with this new zone.

It may be that 'zlogin -e ! -C' and then using '!.' to terminate the 
zlogin would avoid the problem.

Regardless, this would still be a bug.

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From Kevin.Swab at ColoState.EDU  Thu Dec 31 16:19:20 2015
From: Kevin.Swab at ColoState.EDU (Kevin Swab)
Date: Thu, 31 Dec 2015 09:19:20 -0700
Subject: [OmniOS-discuss] OmniOS r151016 zone has difficulties shutting
 down
In-Reply-To: <alpine.GSO.2.01.1512310938030.28454@freddy.simplesystems.org>
References: <alpine.GSO.2.01.1512061727590.1673@freddy.simplesystems.org>
	<536501D2-EA96-4F6B-8CB2-39A0F9698267@omniti.com>
	<alpine.GSO.2.01.1512211720120.28454@freddy.simplesystems.org>
	<C328ACF6-18BC-49AB-BFA8-5E47963D8C81@omniti.com>
	<alpine.GSO.2.01.1512310938030.28454@freddy.simplesystems.org>
Message-ID: <56855588.7040308@ColoState.EDU>

You can work round this by escaping the '~' character.  Try typing '~~.'
to exit from 'zlogin -C'.  Here's the relevant section from the ssh man
page:

>  A single tilde character can be sent as ~~, or by following the tilde
> with a character other than those described above. The escape character
> must always follow a newline to be interpreted as special. The escape
> character can be changed in configuration files or on the command line.
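
To make the layering concrete, here is a rough sketch in Python of how
the two escape handlers stack up.  It is only a simplified model, not
how ssh or zlogin are actually implemented: the names filter_escapes
and who_disconnects are made up for illustration, only the "<esc>." and
"<esc><esc>" sequences are modelled, and the start of the session is
assumed to count as the start of a line.

```python
from typing import Tuple


def filter_escapes(typed: str, esc: str = "~") -> Tuple[str, bool]:
    """Simplified model of one layer (ssh or zlogin -C) scanning typed input.

    Returns (text forwarded to the next layer, True if this layer disconnected).
    """
    forwarded = []
    at_line_start = True  # assumption: session start counts as a new line
    i = 0
    while i < len(typed):
        ch = typed[i]
        if at_line_start and ch == esc and i + 1 < len(typed):
            nxt = typed[i + 1]
            if nxt == ".":
                # <esc>. at the start of a line: this layer drops the connection.
                return "".join(forwarded), True
            if nxt == esc:
                # <esc><esc>: forward one literal escape character.
                forwarded.append(esc)
            else:
                # Not a sequence this model knows: forward both characters.
                forwarded.append(ch + nxt)
            at_line_start = nxt in "\r\n"
            i += 2
            continue
        forwarded.append(ch)
        at_line_start = ch in "\r\n"
        i += 1
    return "".join(forwarded), False


def who_disconnects(typed: str, zlogin_esc: str = "~") -> str:
    """Pass typed input through ssh first, then through zlogin -C."""
    to_zlogin, ssh_closed = filter_escapes(typed, "~")
    if ssh_closed:
        return "ssh"
    _, zlogin_closed = filter_escapes(to_zlogin, zlogin_esc)
    return "zlogin" if zlogin_closed else "nobody"


print(who_disconnects("\n~."))       # plain ~. typed over ssh
print(who_disconnects("\n~~."))      # escaped tilde, then a dot
print(who_disconnects("\n!.", "!"))  # zlogin -e ! -C, then !.
```

In this model the three calls print ssh, zlogin and zlogin: a plain
'~.' never gets past ssh, '~~.' arrives at zlogin as '~.', and with a
different zlogin escape character ssh passes '!.' through untouched.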

HTH, Kevin


On 12/31/2015 08:45 AM, Bob Friesenhahn wrote:
> I would like to share an idea regarding how the newly created zone may
> have difficulties shutting down.
> 
> I am accessing the system via ssh, which uses '~.' to terminate the ssh
> session.  This is also the default disconnect sequence for 'zlogin -C'.
> The idea is that after doing some initial administration using 'zlogin
> -C', the "~." sequence was used to quit it.  This terminated the ssh
> session rather than the zone console login.  While a subsequent 'zlogin
> -C' works, it may be that the unclean/violent termination of the zone
> console login left behind residue which prevents the zone from
> shutting down.
> 
> Yesterday I created a new zone, which had no problems shutting down. I
> did not use 'zlogin -C' (only ordinary 'zlogin') with this new zone.
> 
> It may be that 'zlogin -e ! -C' and then using '!.' to terminate the
> zlogin would avoid the problem.
> 
> Regardless, this would still be a bug.
> 
> Bob

-- 
-------------------------------------------------------------------
Kevin Swab                          UNIX Systems Administrator
ACNS                                Colorado State University
Phone: (970)491-6572                Email: Kevin.Swab at ColoState.EDU
GPG Fingerprint: 7026 3F66 A970 67BD 6F17  8EB8 8A7D 142F 2392 791C