From yavoritomov at gmail.com  Fri May  1 16:42:29 2015
From: yavoritomov at gmail.com (Yavor Tomov)
Date: Fri, 1 May 2015 11:42:29 -0500
Subject: [OmniOS-discuss] ZFS ACL Solaris CIFS and Windows client
In-Reply-To: <5541D215.2020603@gmx.net>
References: <mailman.405.1430330829.89990.omnios-discuss@lists.omniti.com>
	<5541D215.2020603@gmx.net>
Message-ID: <CAJ5wh+5x-Gx2r7UN5BTBCs5MwbsVkU4wLUdHcaf80s8J65m=pg@mail.gmail.com>

This is an old guide I made a long time ago; it should help you connect and set
permissions.

On Thu, Apr 30, 2015 at 1:56 AM, Sebastian Gabler <sequoiamobil at gmx.net>
wrote:

> On 29.04.2015 at 20:07, omnios-discuss-request at lists.omniti.com wrote:
>
>> Message: 3
>> Date: Tue, 28 Apr 2015 19:22:34 +0200
>> From: Günther Alka <alka at hfg-gmuend.de>
>> To: omnios-discuss <omnios-discuss at lists.omniti.com>
>> Subject: Re: [OmniOS-discuss] ZFS ACL Solaris CIFS and Windows client
>> Message-ID: <9D064AA0-0C34-444F-9FF0-900F32EFF5B9 at hfg-gmuend.de>
>> Content-Type: text/plain; charset=utf-8
>>
>> Let's begin with ZFS properties
>> - aclinherit: passthrough
>>
> Thanks. It was on "restricted". I applied the change, but that makes no
> difference to my original problem.
>
>> - aclmode: does not matter for CIFS
>>
> Thanks. Do you have any sources on that for further study?
>
>>
>> Next, set idmappings
>> - in Workgroup mode: do not set any user mappings (only group mappings)
>> - in Domain mode: set domainadmins => root
>>
> That's already the case. While we're at it: how would one delegate operator
> permissions for ACL assignment to other users? That is, what if I want
> certain Domain Users who are not members of the domain admin group to change
> ACLs, permissions, and privileges on shares of the illumos machine?
>
>>
>> Next: join AD Domain (for domain mode)
>>
>> Next: SMB connect
>> - use root (requires a passwd root to generate an SMB password) or
>> - use a Domain Admin account (requires the idmapping to root)
>>
> I am using the domain admin account. Note: what specifically is not
> working is to set ownership on behalf of a different domain user.
>
>>
>> Windows version:
>> - you need Windows Pro or Windows server (no home edition)
>>
> Known.
>
>>
>> Now you should be able to set ownership and ACL on files and folders.
>>
>> If you want to set ACL on shares, you must
>> - SMB connect as a user that is a member of the Administrators group
>> - use Computer Management on Windows and connect OmniOS
>>
> Trying the latter ends up in "access denied".
> Maybe there is something broken with the user mapping. That is, the domain
> admin => root mapping was done, but how do I check whether it is in effect,
> and how do I check whether root (who, in my understanding, provides the
> permissions to the domain admin, right?) has the required privileges?
>
>>
>>
>> Gea
>>
>>
>>  On 28.04.2015 at 14:09, Sebastian Gabler <sequoiamobil at gmx.net> wrote:
>>>
>>> Hi,
>>>
>>> I am a bit stuck getting my ACL management straight for the CIFS
>>> shares I run. What I would like to do is set all the ACLs from Windows.
>>> What does not work right now is assigning ownership of a sharepoint or an
>>> object below it to a different user, i.e. setting ownership, as the Domain
>>> Administrator, to a specific user. I get an error message that a "Restore"
>>> privilege is missing, but the message does not say whether that
>>> applies to the current context (Domain Administrator) or the prospective
>>> owner. I can set full control for that user, however.
>>> Specifically,
>>> 1. How do I get, from my illumos machine, the privileges applicable
>>> to an object for a certain user?
>>> 2. What is required to take or assign ownership, specifically of a
>>> sharepoint, from Windows (ACLs, idmap, ZFS ACL modes and inheritance
>>> modes, etc.), and in what hierarchy do things apply?
>>> I am aware that this may be a FAQ, but I didn't find comprehensive
>>> documentation on the matter. The Oracle docs focus on explaining how
>>> things work from the Solaris side; most HowTos that include the Windows
>>> side are not deep enough.
>>>
>>> Thanks for any hints.
>>>
>>> With best regards,
>>>
>>> Sebastian
>>> _______________________________________________
>>> OmniOS-discuss mailing list
>>> OmniOS-discuss at lists.omniti.com
>>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>>>
>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenIndiana Windows 2008 R2 AD.pdf
Type: application/pdf
Size: 526095 bytes
Desc: not available
URL: <https://omniosce.org/ml-archive/attachments/20150501/6c157a1f/attachment-0001.pdf>

From alka at hfg-gmuend.de  Fri May  1 17:32:49 2015
From: alka at hfg-gmuend.de (=?utf-8?Q?G=C3=BCnther_Alka?=)
Date: Fri, 1 May 2015 19:32:49 +0200
Subject: [OmniOS-discuss] ZFS ACL Solaris CIFS and Windows client
In-Reply-To: <5541D215.2020603@gmx.net>
References: <mailman.405.1430330829.89990.omnios-discuss@lists.omniti.com>
	<5541D215.2020603@gmx.net>
Message-ID: <725A130D-701C-4DAE-95A0-928089C1CA56@hfg-gmuend.de>

For the ZFS properties, see the Oracle docs, e.g.
http://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/gbaaz/index.html

If you want full permissions on files on an SMB share, you must connect either as user root
or as an AD user that is idmapped to Unix root.

Adding a user to the SMB group administrators is needed for some administration
tasks (e.g. remote computer management), but root permission is the key to
any file permission problem.


Gea



From doug at will.to  Fri May  1 21:40:04 2015
From: doug at will.to (Doug Hughes)
Date: Fri, 1 May 2015 17:40:04 -0400
Subject: [OmniOS-discuss] r151012 fresh install to r151014
Message-ID: <CAOpmc6xZ9ViDe1YnWTjOy=+VT-wiuHJYwuRgvxrfKLp+==G2uw@mail.gmail.com>

I haven't had time to upgrade all my kayak stuff to r151014 yet, so I'm installing
r151012 and then doing the pkg upgrade with an alternate BE, but I'm running into a
consistent problem.

After the install, and after making sure I have all of the latest r151012 updates, I
unset the publisher, set the new publisher, and then upgrade (as normal, and
as described on the web page) and get an error. It's very persistent and the
same every time. I know it's not an actual problem connecting to the
website because I could update r151012 just fine (several times).

# pkg unset-publisher omnios
# /usr/bin/pkg set-publisher -P --set-property signature-policy=require-signatures -g http://pkg.omniti.com/omnios/r151014/ omnios
# /usr/bin/pkg update --be-name=omnios-r151014 entire@11,5.11-0.151014
           Packages to install:   4
            Packages to update: 391
           Mediators to change:   1
       Create boot environment: Yes
Create backup boot environment:  No

DOWNLOAD                                  PKGS       FILES    XFER (MB)
library/python-2/lxml-26                14/395    85/12717    2.7/260.4


Errors were encountered while attempting to retrieve package or file data for
the requested operation.
Details follow:

Framework error: code: 56 reason: Recv failure: Connection reset by peer
URL: 'http://pkg.omniti.com/omnios/r151014/omnios/file/1/e11e44f204ee81611903399694ce0ed20d6ade9c'.
(happened 4 times)

From jdg117 at elvis.arl.psu.edu  Fri May  1 21:55:57 2015
From: jdg117 at elvis.arl.psu.edu (John D Groenveld)
Date: Fri, 01 May 2015 17:55:57 -0400
Subject: [OmniOS-discuss] r151012 fresh install to r151014
In-Reply-To: Your message of "Fri, 01 May 2015 17:40:04 EDT."
	<CAOpmc6xZ9ViDe1YnWTjOy=+VT-wiuHJYwuRgvxrfKLp+==G2uw@mail.gmail.com> 
References: <CAOpmc6xZ9ViDe1YnWTjOy=+VT-wiuHJYwuRgvxrfKLp+==G2uw@mail.gmail.com>
Message-ID: <201505012155.t41LtvFF028687@elvis.arl.psu.edu>

In message <CAOpmc6xZ9ViDe1YnWTjOy=+VT-wiuHJYwuRgvxrfKLp+==G2uw at mail.gmail.com>
, Doug Hughes writes:
>DOWNLOAD                                  PKGS       FILES    XFER (MB)
>library/python-2/lxml-26                14/395    85/12717    2.7/260.4
>
>
>Errors were encountered while attempting to retrieve package or file data
>for
>the requested operation.
>Details follow:
>
>Framework error: code: 56 reason: Recv failure: Connection reset by peer
>URL: 'http://pkg.omniti.com/omnios/r151014/omnios/file/1/e11e44f204ee81611903399694ce0ed20d6ade9c'.
>(happened 4 times)

Between your host and pkg.omniti.com exists a transparent web
proxy with the hash of that file in its malware signatures
database.

Sneakernet that file and drop it in /var/pkg [subdirectory
I can't remember but you'll easily find(1) -name e1]
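
Before dropping a hand-carried copy into the package cache, it may be worth
checking that it really is the file pkg asked for. A minimal Python sketch
follows. It assumes (this is not stated in the thread) that the name in the
URL is the SHA-1 of the uncompressed payload and that the depot may hand out
the payload gzip-compressed. The function name is made up for illustration.

```python
import gzip
import hashlib
import zlib


def payload_matches(path, expected_sha1):
    """Return True if the file's content hashes to expected_sha1.

    Assumption: the name is the SHA-1 of the *uncompressed* payload and
    the copy on disk may be gzip-compressed, so try gunzip first and
    fall back to hashing the raw bytes.
    """
    with open(path, "rb") as fh:
        raw = fh.read()
    try:
        data = gzip.decompress(raw)
    except (OSError, EOFError, zlib.error):
        data = raw  # not valid gzip data; hash the bytes as they are
    return hashlib.sha1(data).hexdigest() == expected_sha1.lower()
```

Call it with the path of the copied file and the 40-character name from the
failing URL; a False result means the copy is damaged or one of the
assumptions above does not hold.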

John
groenveld at acm.org

From eric.sproul at circonus.com  Fri May  1 22:05:42 2015
From: eric.sproul at circonus.com (Eric Sproul)
Date: Fri, 1 May 2015 18:05:42 -0400
Subject: [OmniOS-discuss] r151012 fresh install to r151014
In-Reply-To: <201505012155.t41LtvFF028687@elvis.arl.psu.edu>
References: <CAOpmc6xZ9ViDe1YnWTjOy=+VT-wiuHJYwuRgvxrfKLp+==G2uw@mail.gmail.com>
	<201505012155.t41LtvFF028687@elvis.arl.psu.edu>
Message-ID: <CAO8hXRDFoc+TUdCTShPiZiNP-Qo7_FY_D4YNHvzmHS--GCktAw@mail.gmail.com>

On Fri, May 1, 2015 at 5:55 PM, John D Groenveld
<jdg117 at elvis.arl.psu.edu> wrote:
> In message <CAOpmc6xZ9ViDe1YnWTjOy=+VT-wiuHJYwuRgvxrfKLp+==G2uw at mail.gmail.com>
> , Doug Hughes writes:
>>DOWNLOAD                                  PKGS       FILES    XFER (MB)
>>library/python-2/lxml-26                14/395    85/12717    2.7/260.4
>>
>>
>>Errors were encountered while attempting to retrieve package or file data
>>for
>>the requested operation.
>>Details follow:
>>
>>Framework error: code: 56 reason: Recv failure: Connection reset by peer
>>URL: 'http://pkg.omniti.com/omnios/r151014/omnios/file/1/e11e44f204ee81611903399694ce0ed20d6ade9c'.
>>(happened 4 times)
>
> Between your host and pkg.omniti.com exists a transparent web
> proxy with the hash of that file in its malware signatures
> database.

Wow.  That's, um, fun.  For those playing along at home, the
"offending" file is:

/usr/lib/python2.6/vendor-packages/lxml/html/clean.py

http://lxml.de/api/lxml.html.clean.Cleaner-class.html

I'm guessing that's because it is often bundled with malware?  Talk
about collateral damage.

From doug at will.to  Mon May  4 03:39:54 2015
From: doug at will.to (Doug Hughes)
Date: Sun, 03 May 2015 23:39:54 -0400
Subject: [OmniOS-discuss] r151012 fresh install to r151014
In-Reply-To: <201505012155.t41LtvFF028687@elvis.arl.psu.edu>
References: <CAOpmc6xZ9ViDe1YnWTjOy=+VT-wiuHJYwuRgvxrfKLp+==G2uw@mail.gmail.com>
	<201505012155.t41LtvFF028687@elvis.arl.psu.edu>
Message-ID: <5546EA0A.2030308@will.to>

Good catch! I take it you've run into this before. Luckily, I also admin
the firewall, so I added an exception for the outbound threat trigger.





From davide.poletto at gmail.com  Mon May  4 17:10:13 2015
From: davide.poletto at gmail.com (Davide Poletto)
Date: Mon, 4 May 2015 19:10:13 +0200
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and uname
Message-ID: <CANKMAMYdUGS4WzSbzNDEc3zyy_bfrf_642jCXNo7FjRfqjxZdw@mail.gmail.com>

Just to say I've noticed that uname -v on an OmniOS 151012 system, which used
to report "omnios-10b9c79", reports "illumos-omnios" after I updated it today
(packages released on 17.04.2015 in the official repository):

OmniOS 5.11     omnios-10b9c79  September 2014
root@nas:/root#

OmniOS 5.11     illumos-omnios  April 2015
root@nas:/root#

Is that OK/by Design?

Just for reference on OmniOS 151014, after the same big set of updates
(released the same day, 17.04.2015), the uname -v changed from
"omnios-a708424" (from its ISO install) to "omnios-170cea2".

Regards, Davide.

From danmcd at omniti.com  Mon May  4 17:43:18 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 4 May 2015 13:43:18 -0400
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and
	uname
In-Reply-To: <CANKMAMYdUGS4WzSbzNDEc3zyy_bfrf_642jCXNo7FjRfqjxZdw@mail.gmail.com>
References: <CANKMAMYdUGS4WzSbzNDEc3zyy_bfrf_642jCXNo7FjRfqjxZdw@mail.gmail.com>
Message-ID: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>


> On May 4, 2015, at 1:10 PM, Davide Poletto <davide.poletto at gmail.com> wrote:
> 
> Just to say I've noticed that uname -v reports "illumos-omnios" on a
> OmniOS 151012 which was "omnios-10b9c79" after I updated it today
> (packages released on 17.04.2015 at official repository):
> 
> OmniOS 5.11     omnios-10b9c79  September 2014
> root@nas:/root#
> 
> OmniOS 5.11     illumos-omnios  April 2015
> root@nas:/root#
> 
> Is that OK/by Design?

That was my fault during the kernel build.  I had the wrong variable set in my .env file.

> Just for reference on OmniOS 151014, after the same big set of updates
> (released the same day, 17.04.2015), the uname -v changed from
> "omnios-a708424" (from its ISO install) to "omnios-170cea2".

Yes, I believe only r151012 was affected poorly by this.  Since 012 is in its last 6 months of support life, I'm not particularly concerned.

Dan


From davide.poletto at gmail.com  Mon May  4 19:35:46 2015
From: davide.poletto at gmail.com (Davide Poletto)
Date: Mon, 4 May 2015 21:35:46 +0200
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and
	uname
In-Reply-To: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
References: <CANKMAMYdUGS4WzSbzNDEc3zyy_bfrf_642jCXNo7FjRfqjxZdw@mail.gmail.com>
	<4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
Message-ID: <CANKMAMb87yMC8jNe4GKWJKeDAdEA7CfAn5+Sk0jAmMFfYhhOgg@mail.gmail.com>

Yeah, me too now! Thanks Dan.


From cks at cs.toronto.edu  Mon May  4 21:45:27 2015
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Mon, 04 May 2015 17:45:27 -0400
Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained high
	write loads
Message-ID: <20150504214527.583867A061E@apps0.cs.toronto.edu>

 We now have a reproducible setup with OmniOS r151014 where an OmniOS
NFS fileserver will experience memory exhaustion and then hang in the
kernel if it receives sustained NFS write traffic from multiple clients
at a rate faster than its local disks can sustain. The machine will run
okay for a while but with mdb -k's ::memstat showing steadily increasing
'Kernel' memory usage; after a while it tips over the edge, the ZFS ARC
starts shrinking, free RAM reported by 'vmstat' goes basically to nothing
(eg 182 MB), and the system locks hard.

(We have not at this point tried to make a crash dump, but past attempts
to do so in similar situations have been failures.)

 A fairly reliable signal that the system is about to lock up very
soon is that '::svc_pool nfs' will report a steadily increasing and often
very large number of 'Pending requests' (as well as all configured threads
being active). Our most recent lockup reported over 270,000 pending
requests. Our working hypothesis is that something in the NFS server code
is accepting (too many) incoming requests and filling all memory with them,
which then leads to the hard lock.

(It's possible that lower levels are also involved, eg TCP socket
receive buffers.)

 Our current simplified test setup: the OmniOS machine has 64 GB RAM
with 2x 1G Ethernet for incoming NFS writes, writing to a single pool of
a mirrored pair of 2 TB WD SE SATA drives. There are six client machines
on one network, 25 on the other, and all client machines are running
multiple processes that are writing files of various sizes (from 50 MB
through several GB); all client machines are Ubuntu Linux. We believe
(but have not tested) that multiple clients and possibly multiple
processes are required to provoke this behavior. All NFS traffic is
NFS v3 over TCP.

 Has anyone seen or heard of anything like this before?

 Is there any way to limit the number of pending NFS requests that the
system will accept? Allowing 270,000 strikes me as kind of absurd.

(I don't suppose anyone with a test environment wants to take a shot
at reproducing this. For us, this happens within an hour or three of
running at this load, and generally happens faster with a smaller number
of NFS server threads.)

	- cks

From doug at will.to  Tue May  5 00:50:23 2015
From: doug at will.to (Doug Hughes)
Date: Mon, 04 May 2015 20:50:23 -0400
Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained
 high write loads
In-Reply-To: <20150504214527.583867A061E@apps0.cs.toronto.edu>
References: <20150504214527.583867A061E@apps0.cs.toronto.edu>
Message-ID: <554813CF.9070800@will.to>

Yes, absolutely. We've run into this same problem, exactly as you
describe, in Solaris 10 (all versions).
You can catch it with a kernel dump, but you have to be wary and quick.

Keep a vmstat 3 open (or similar), and when free mem drops below 5GB or
so, be ready. As soon as you start seeing PO or DE, that's when to take
your crash dump.
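
The watch-and-wait step can be scripted. Below is a rough Python sketch that
reads vmstat output line by line and flags the two conditions described above.
It assumes the usual Solaris/illumos vmstat layout, where a column header line
names "free", "po" and "de" and free memory is reported in KB. The function
name and the level labels are made up for illustration.

```python
FREE_WARN_KB = 5 * 1024 * 1024  # "below 5GB or so"; assumes free is in KB


def watch(lines, free_warn_kb=FREE_WARN_KB):
    """Yield (level, free_kb, po, de) for each vmstat sample worth noting.

    level is "ready" when free memory is under the threshold and
    "dump-now" when po or de is also non-zero.
    """
    cols = {}
    for line in lines:
        fields = line.split()
        if "free" in fields and "po" in fields and "de" in fields:
            # Column header line: remember where each field lives.
            cols = {}
            for i, name in enumerate(fields):
                cols.setdefault(name, i)
            continue
        if not cols or len(fields) <= cols["de"] or not fields[0].isdigit():
            continue  # group header ("kthr memory ...") or anything else
        free_kb = int(fields[cols["free"]])
        po = int(fields[cols["po"]])
        de = int(fields[cols["de"]])
        if free_kb < free_warn_kb:
            level = "dump-now" if (po or de) else "ready"
            yield level, free_kb, po, de
```

Feeding it the lines of a running "vmstat 3" (for example by iterating over
sys.stdin in a small wrapper) gives one tuple per sample that crosses the
threshold. The 5GB figure is only the rule of thumb from the text.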

Basically, what happens (from my understanding previously talking with 
an Oracle kernel engineer) is that the kernel just allocates tons of NFS 
buffers that keep building up and building up and there's no mechanism 
for getting rid of them in sufficient time. There really ought to be a 
RED or some sort of back pressure, but it doesn't seem to be there.

You can make this problem less likely to occur by decreasing the client-side
rsize and wsize. Linux CentOS/RHEL 6 (and similar 2.6+ kernels)
exacerbates the problem by using 1MB rsize and wsize, which makes the
server burn through big NFS buffers, but if you force the clients to 32k
or perhaps even smaller, you can push off the problem a bit.
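
To see which Linux clients are still using large transfer sizes, something
like the following Python sketch could be run on each client. It assumes the
mounts are listed in /proc/mounts format (device, mount point, type, options).
The function name and the 32k default limit are illustrative choices taken
from the suggestion above.

```python
def oversized_nfs_mounts(mount_lines, limit=32768):
    """Return (mountpoint, rsize, wsize) for NFS mounts above `limit` bytes."""
    found = []
    for line in mount_lines:
        parts = line.split()
        if len(parts) < 4 or parts[2] not in ("nfs", "nfs4"):
            continue  # not an NFS client mount
        # Options look like "rw,vers=3,rsize=1048576,wsize=1048576,...".
        opts = dict(
            opt.split("=", 1) for opt in parts[3].split(",") if "=" in opt
        )
        rsize = int(opts.get("rsize", 0))
        wsize = int(opts.get("wsize", 0))
        if rsize > limit or wsize > limit:
            found.append((parts[1], rsize, wsize))
    return found
```

On a client, pass it the open /proc/mounts file. Mounts that show up can then
be remounted with the rsize=32768,wsize=32768 mount options.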

Do you have a synthetic load test to reproduce it?



From danmcd at omniti.com  Tue May  5 01:03:32 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 4 May 2015 21:03:32 -0400
Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained
	high write loads
In-Reply-To: <20150504214527.583867A061E@apps0.cs.toronto.edu>
References: <20150504214527.583867A061E@apps0.cs.toronto.edu>
Message-ID: <C0084B79-B540-44B4-8485-7EF86C43B845@omniti.com>


> On May 4, 2015, at 5:45 PM, Chris Siebenmann <cks at cs.toronto.edu> wrote:
> 
> 
> Is there any way to limit the number of pending NFS requests that the
> system will accept? Allowing 270,000 strikes me as kind of absurd.

I swear I've seen someone try to address this before.  Maybe it's from my Nexenta days.  I will be querying the illumos developer's list (as I suspect this affects the other distros as well if they haven't fixed it in their local illumos children).

Thanks,
Dan


From matej at zunaj.si  Tue May  5 07:46:01 2015
From: matej at zunaj.si (Matej Zerovnik)
Date: Tue, 05 May 2015 09:46:01 +0200
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then
	resumes
Message-ID: <55487539.6030408@zunaj.si>

Hello!

Back again with a follow-up to 'iSCSI target hang, no way to restart
but server reboot', where we had trouble with the iSCSI target randomly
freezing and only a reboot helped.

Once we had had enough, we switched to new gear and software:
- new server - IBM xServer 3550 M4 with 265GB memory and SAS HBA LSI
Logic SAS2308 controller
- installed the latest OmniOS LTS (r151014)
- updated the firmware on the LSI controller to version P19.

We still kept our SATA hard drives in the Supermicro JBOD with a SAS expander.

After the upgrade, things ran smoothly for about a week with no errors
in the logs.

After a week, some clients reported that their iSCSI drive failed and 
remounted as read-only. Weirdly, Nagios on our end did not report any 
anomaly. I looked at OmniOS logs, and there was nothing connected with 
iscsi in them at all. After a while, all clients connected back, so 
iscsi target did not crash like it used to.

Looking at the clients logs, it seems like there was a connection error:
Apr 29 10:33:53 317 kernel: connection1:0: detected conn error (1021)
Apr 29 10:33:54 317 iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
Apr 29 10:33:56 317 iscsid: connection1:0 is operational after recovery (1 attempts)
Apr 29 10:36:37 317 kernel: connection1:0: detected conn error (1021)
Apr 29 10:36:37 317 iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
Apr 29 10:36:40 317 iscsid: connection1:0 is operational after recovery (1 attempts)
Apr 29 10:36:50 317 kernel: sd 3:0:0:0: Device offlined - not ready after error recovery

As a test, I set up a ping from my workstation to the client's server and to our
iSCSI target, to see whether there is a network problem when iSCSI drops. A
week later it happened again. The pings were
going through without a problem and the Nagios check on the iSCSI port was also
working, yet our traffic graph shows a 100% drop:
http://i59.tinypic.com/59vl10.png

I failed to catch the server in 'down' state to investigate.

Searching the internet for the error the client gets, it looks
like too many commands may have been sent and iSCSI timed out.
Our pool is made of about 40 drives in one RAIDZ vdev, so we can't do
many IOPS. I suspect clients send too many I/O requests, the server takes
too long to respond, and iSCSI crashes. Does that sound like a
plausible explanation?
Is there a way to measure how many iSCSI commands are sent to the drives, to
see if there is a peak when it crashes?
Is there a way to measure how busy the disks are and whether they really can't
return data that fast?
What else should/can I check/monitor to find out what our problem is?

Matej









From dwq at xmweixun.com  Tue May  5 07:54:18 2015
From: dwq at xmweixun.com (dwq at xmweixun.com)
Date: Tue, 5 May 2015 15:54:18 +0800
Subject: [OmniOS-discuss]  Writeback Cache Auto disabled
Message-ID: <001901d08708$bff6b910$3fe42b30$@xmweixun.com>

Hi All,

When I present an LU to HP-UX or AIX, the LU's writeback cache is
automatically disabled. Why?

LU Name: 600144F00000000000005548DC360005
    Operational Status: Online
    Provider Name     : sbd
    Alias             : /dev/zvol/rdsk/wxnas/hpuxtest03
    View Entry Count  : 1
    Data File         : /dev/zvol/rdsk/wxnas/hpuxtest03
    Meta File         : not set
    Size              : 21474836480
    Block Size        : 512
    Management URL    : not set
    Vendor ID         : SUN
    Product ID        : COMSTAR
    Serial Num        : not set
    Write Protect     : Disabled
    Writeback Cache   : Disabled
    Access State      : Active

Thanks.

Version:
SunOS wxos1 5.11 omnios-b281e50 i86pc i386 i86pc

Deng


From mir at miras.org  Tue May  5 09:21:11 2015
From: mir at miras.org (mir at miras.org)
Date: Tue, 05 May 2015 11:21:11 +0200
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and
 then resumes
In-Reply-To: <55487539.6030408@zunaj.si>
References: <55487539.6030408@zunaj.si>
Message-ID: <a1e7408b7a614dc4c3e96a85459bad62@miras.org>

On 2015-05-05 09:46, Matej Zerovnik wrote:
> 
> We still kept our SATA hard drives in Supermicro JBOD with SAS
> expander and SATA drives.
> 
Your problem boils down to using SATA disks behind a SAS expander. Search
the OmniOS user list and you will find numerous reports that SATA disks
behind a SAS expander cause weird behavior and instability.

The fact is that SATA disks are unsupported behind a SAS expander because
the SAS and SATA command sets are incompatible. As an example, SATA NCQ is
not passed through the SAS expander, which could be the cause of the
strange iSCSI disconnects you see on the client side.



From narayan.desai at gmail.com  Tue May  5 14:32:19 2015
From: narayan.desai at gmail.com (Narayan Desai)
Date: Tue, 5 May 2015 09:32:19 -0500
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and
	then resumes
In-Reply-To: <a1e7408b7a614dc4c3e96a85459bad62@miras.org>
References: <55487539.6030408@zunaj.si>
	<a1e7408b7a614dc4c3e96a85459bad62@miras.org>
Message-ID: <CABweQmLrOgD_utd_8HV4u5KzTXOH2xTjpG4KhgiNQUt+R2_LvA@mail.gmail.com>

And, if you don't have the luxury of discarding the hardware and replacing it
with a supported configuration, you might look for marginal drives, either
via the error counters displayed by iostat -En, or by finding drives with
really high service times (in iostat -xnz output). We found (on a similar
setup) that being really aggressive about drive replacement helped a lot.
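
A minimal sketch of the service-time check described above, assuming the
illumos "iostat -xn" column order (r/s w/s kr/s kw/s wait actv wsvc_t asvc_t
%w %b device). The function name slow_disks and the 100 ms default threshold
are illustrative assumptions, not something taken from this thread; pick a
threshold that fits your drives.

```shell
#!/bin/sh
# Sketch: flag disks whose active service time (asvc_t) exceeds a threshold.
# Reads "iostat -xn" style output on stdin and assumes this column order:
#   r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
# slow_disks and the 100 ms default are hypothetical choices for illustration.
slow_disks() {
    threshold_ms="${1:-100}"
    awk -v t="$threshold_ms" '
        # Data rows have 11 fields and start with a number; header rows do not.
        NF == 11 && $1 ~ /^[0-9.]+$/ && $8 + 0 > t + 0 {
            printf "%s asvc_t=%s ms %%b=%s\n", $11, $8, $10
        }'
}

# Possible usage on a live system:
#   iostat -xnz 10 2 | slow_disks 200
```

Note that the first report iostat prints is the average since boot, so a
drive that only misbehaves occasionally is easier to spot in the later
interval samples than in that first block.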

If you have desktop SATA drives, then the drive firmware is part of the
problem. Desktop drives retry for quite a long time when they encounter
errors, which produces really inconsistent performance profiles. When you
aggregate drives into a RAID set (including in ZFS), tail latencies really
start to matter for performance, and the pool just goes out to lunch for a
long time. If you can identify and replace the drive that is causing the
problem (even if it isn't reporting any hard errors), pool performance
goes back to normal.
 -nld

On Tue, May 5, 2015 at 4:21 AM, <mir at miras.org> wrote:

> On 2015-05-05 09:46, Matej Zerovnik wrote:
>
>> We still kept our SATA hard drives in Supermicro JBOD with SAS
>> expander and SATA drives.
>
> Your problem boils down to using SATA disks behind a SAS expander. Search
> the OmniOS user list and you will find numerous reports that SATA disks
> behind a SAS expander cause weird behavior and instability.
>
> The fact is that SATA disks are unsupported behind a SAS expander because
> the SAS and SATA command sets are incompatible. As an example, SATA NCQ is
> not passed through the SAS expander, which could be the cause of the
> strange iSCSI disconnects you see on the client side.
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>

From richard.elling at richardelling.com  Tue May  5 15:17:02 2015
From: richard.elling at richardelling.com (Richard Elling)
Date: Tue, 5 May 2015 08:17:02 -0700
Subject: [OmniOS-discuss] Writeback Cache Auto disabled
In-Reply-To: <001901d08708$bff6b910$3fe42b30$@xmweixun.com>
References: <001901d08708$bff6b910$3fe42b30$@xmweixun.com>
Message-ID: <70165454-5855-455D-BE88-8AB444934C45@RichardElling.com>


> On May 5, 2015, at 12:54 AM, <dwq at xmweixun.com> <dwq at xmweixun.com> wrote:
> 
> Hi All,
>          When I present lu to hpux or aix, lu writeback cache auto disabled,why?

In SCSI, initiators can change the write cache policy.
 -- richard

--

Richard.Elling at RichardElling.com
+1-760-896-4422




From ci4 at outlook.com  Tue May  5 15:20:18 2015
From: ci4 at outlook.com (Chavdar Ivanov)
Date: Tue, 5 May 2015 15:20:18 +0000
Subject: [OmniOS-discuss] Can't update bloody
Message-ID: <DUB405-EAS281DE3E6934796A0231BEC1E2D10@phx.gbl>

Hi,

I tried updating one of my VMs running OmniOS bloody. The full refresh goes
through, but the update fails because it can't find the manifest for
package/pkg (confirmed via the web view).

Is there currently a problem with the bloody repo?

Chavdar Ivanov

From vab at bb-c.de  Tue May  5 15:35:48 2015
From: vab at bb-c.de (Volker A. Brandt)
Date: Tue, 5 May 2015 17:35:48 +0200
Subject: [OmniOS-discuss] Can't update bloody
In-Reply-To: <DUB405-EAS281DE3E6934796A0231BEC1E2D10@phx.gbl>
References: <DUB405-EAS281DE3E6934796A0231BEC1E2D10@phx.gbl>
Message-ID: <21832.58196.941714.304987@glaurung.bb-c.de>

> I tried updating one of my VMs running omnios bloody. Full refresh
> goes through, update fails because it can't find the manifest for
> package/pkg - confirmed via the web view.

FWIW, I update my local copy of the bloody repo each morning (the
dead of night in the US :-), and here is what I have been seeing
for a few days now:

  Processing packages for publisher omnios ...
  Retrieving and evaluating 2035 package(s)...
  Download Manifests (1087/2035) \pkgrecv: http protocol error: code: 404 reason: Not Found
  URL: 'http://pkg.omniti.com/omnios/bloody/omnios/manifest/0/package%2Fpkg@0.5.11%2C5.11-0.151015%3A20150422T144502Z' (happened 4 times)

So I can confirm that the manifest file for package/pkg is
physically missing.


Regards -- Volker
-- 
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From danmcd at omniti.com  Tue May  5 15:41:53 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 5 May 2015 11:41:53 -0400
Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained
	high write loads
In-Reply-To: <C0084B79-B540-44B4-8485-7EF86C43B845@omniti.com>
References: <20150504214527.583867A061E@apps0.cs.toronto.edu>
	<C0084B79-B540-44B4-8485-7EF86C43B845@omniti.com>
Message-ID: <704EC1DD-4C8D-4B17-A535-03BF737DB12F@omniti.com>


> On May 4, 2015, at 9:03 PM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> I swear I've seen someone try to address this before.  Maybe it's from my Nexenta days.  I will be querying the illumos developer's list (as I suspect this affects the other distros as well if they haven't fixed it in their local illumos children).

Folks who aren't Chris, see here:

	http://www.listbox.com/member/archive/182179/2015/05/sort/time_rev/page/1/entry/4:58/20150505065805:98F5E7D6-F315-11E4-87B6-F3BD9F3176C1/

The hard part will be testing this.  I'm not sure I have the HW in-house to do it.  I may need illumos community help.

FYI,
Dan


From cks at cs.toronto.edu  Tue May  5 15:48:05 2015
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Tue, 05 May 2015 11:48:05 -0400
Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained
	high write loads
In-Reply-To: danmcd's message of Tue, 05 May 2015 11:41:53 -0400.
	<704EC1DD-4C8D-4B17-A535-03BF737DB12F@omniti.com>
Message-ID: <20150505154805.BC7367A05A8@apps0.cs.toronto.edu>

> > On May 4, 2015, at 9:03 PM, Dan McDonald <danmcd at omniti.com> wrote:
> > I swear I've seen someone try to address this before.  Maybe it's from
> > my Nexenta days.  I will be querying the illumos developer's list (as I
> > suspect this affects the other distros as well if they haven't fixed it
> > in their local illumos children).
> 
> Folks who aren't Chris, see here:
> 
> http://www.listbox.com/member/archive/182179/2015/05/sort/time_rev/page/1/entry/4:58/20150505065805:98F5E7D6-F315-11E4-87B6-F3BD9F3176C1/
> 
> The hard part will be testing this.  I'm not sure I have the HW in-house
> to do it.  I may need illumos community help.

 Since we have a test environment where we can reproduce this and a high
interest in seeing it fixed, we can test new kernel packages and so on.

(If given specific howto instructions we can probably build test kernels
from source, but we've never tried to do any OmniOS source building
before so it may take us some time to get up to speed on that. It'd be
much easier to take a prebuilt test kernel, drop it in, and go.)

	- cks

From danmcd at omniti.com  Tue May  5 15:55:31 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 5 May 2015 11:55:31 -0400
Subject: [OmniOS-discuss] Small update - new lint libraries for some
	userland packages
Message-ID: <47BB3308-83A6-4742-8B03-9E59DA2B8D72@omniti.com>

I've just updated the openssl, zlib, trousers, and libxml2 packages to include lint libraries.  This is a non-reboot update, but you may need to restart services requiring any of the aforementioned packages that are non-system-related.

I'm making this change because later today, I'll be pushing back changes in illumos-gate that allow people to build stock illumos-gate on OmniOS r151014 or later.  Technically, you can do it on 012 as well, but with a ton of lint.  014 and later will be able to build stock illumos-gate, which will make OmniOS a more attractive platform to illumos developers. This list will be Cc:ed on some of those illumos announcements.

Thank you OmniOS community!
Dan


From danmcd at omniti.com  Tue May  5 15:58:12 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 5 May 2015 11:58:12 -0400
Subject: [OmniOS-discuss] Can't update bloody
In-Reply-To: <21832.58196.941714.304987@glaurung.bb-c.de>
References: <DUB405-EAS281DE3E6934796A0231BEC1E2D10@phx.gbl>
	<21832.58196.941714.304987@glaurung.bb-c.de>
Message-ID: <5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com>

Hmmm.

I'll be updating the whole wad of bloody later this week.  Can y'all wait a couple of days?  I want to include some illumos updates that I'm about to push this afternoon.

Thanks,
Dan


From danmcd at omniti.com  Tue May  5 16:02:54 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 5 May 2015 12:02:54 -0400
Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained
	high write loads
In-Reply-To: <20150505154805.BC7367A05A8@apps0.cs.toronto.edu>
References: <20150505154805.BC7367A05A8@apps0.cs.toronto.edu>
Message-ID: <F30C25D3-9DA3-4BE4-8D24-B8E57A620141@omniti.com>


> On May 5, 2015, at 11:48 AM, Chris Siebenmann <cks at cs.toronto.edu> wrote:
> 
>>> On May 4, 2015, at 9:03 PM, Dan McDonald <danmcd at omniti.com> wrote:
>>> I swear I've seen someone try to address this before.  Maybe it's from
>>> my Nexenta days.  I will be querying the illumos developer's list (as I
>>> suspect this affects the other distros as well if they haven't fixed it
>>> in their local illumos children).
>> 
>> Folks who aren't Chris, see here:
>> 
>> http://www.listbox.com/member/archive/182179/2015/05/sort/time_rev/page/1/entry/4:58/20150505065805:98F5E7D6-F315-11E4-87B6-F3BD9F3176C1/
>> 
>> The hard part will be testing this.  I'm not sure I have the HW in-house
>> to do it.  I may need illumos community help.
> 
> Since we have a test environment where we can reproduce this and a high
> interest in seeing it fixed, we can test new kernel packages and so on.
> 
> (If given specific howto instructions we can probably build test kernels
> from source, but we've never tried to do any OmniOS source building
> before so it may take us some time to get up to speed on that. It'd be
> much easier to take a prebuilt test kernel, drop it in, and go.)

I can turn around the whole world in an hour or less and provide ONU images if you're on 012 or 014.  What revision are you running currently?  I can also help you get a build-illumos-omnios up and running.  Pick your favorite.

I won't be able to do this until later this afternoon, however.  I've some pressing illumos things first.

Thanks,
Dan


From cks at cs.toronto.edu  Tue May  5 16:15:54 2015
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Tue, 05 May 2015 12:15:54 -0400
Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained
	high write loads
In-Reply-To: danmcd's message of Tue, 05 May 2015 12:02:54 -0400.
	<F30C25D3-9DA3-4BE4-8D24-B8E57A620141@omniti.com>
Message-ID: <20150505161554.CFA547A05A8@apps0.cs.toronto.edu>

> >> The hard part will be testing this. I'm not sure I have the HW
> >> in-house to do it.  I may need illumos community help.
> >
> > Since we have a test environment where we can reproduce this and a
> > high interest in seeing it fixed, we can test new kernel packages
> > and so on.
> >
> > (If given specific howto instructions we can probably build test
> > kernels from source, but we've never tried to do any OmniOS source
> > building before so it may take us some time to get up to speed on
> > that. It'd be much easier to take a prebuilt test kernel, drop it
> > in, and go.)
>
> I can turn around the whole world in an hour or less and provide
> ONU images if you're on 012 or 014. What revision are you running
> currently? I can also help you get a build-illumos-omnios up and
> running as well. Pick your favorite.

 For now, the simplest thing is installable kernel images (I assume
that's ONU images) for r151014, which is what our test environment
is using now and what we'd wind up on with all of our production
fileservers[*]. I won't be able to start any testing with the images
until this afternoon at the earliest, so I don't think it's urgent to
build them right away.

 Thanks for all of this!

	- cks
[*: our production fileservers are currently at r151010 but we're
    already looking at an r151014 upgrade. Having this fix as part
    of r151014 would make that upgrade definite, and there are other
    things in 14 that we want, e.g. >16 group support over NFS.
]

From vab at bb-c.de  Tue May  5 16:35:36 2015
From: vab at bb-c.de (Volker A. Brandt)
Date: Tue, 5 May 2015 18:35:36 +0200
Subject: [OmniOS-discuss] Can't update bloody
In-Reply-To: <5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com>
References: <DUB405-EAS281DE3E6934796A0231BEC1E2D10@phx.gbl>
	<21832.58196.941714.304987@glaurung.bb-c.de>
	<5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com>
Message-ID: <21832.61784.40290.77774@glaurung.bb-c.de>

> I'll be updating the whole wad of bloody later this week.  Can y'all
> wait a couple of days?  I want to include some illumos updates that
> I'm about to push this afternoon.

Sure.  Thanks for all your good work!!


Regards -- Volker
-- 
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgr??e: 46
Gesch?ftsf?hrer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From matej at zunaj.si  Tue May  5 16:48:28 2015
From: matej at zunaj.si (Matej Zerovnik)
Date: Tue, 5 May 2015 18:48:28 +0200
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and
 then resumes
In-Reply-To: <CABweQmLrOgD_utd_8HV4u5KzTXOH2xTjpG4KhgiNQUt+R2_LvA@mail.gmail.com>
References: <55487539.6030408@zunaj.si>
	<a1e7408b7a614dc4c3e96a85459bad62@miras.org>
	<CABweQmLrOgD_utd_8HV4u5KzTXOH2xTjpG4KhgiNQUt+R2_LvA@mail.gmail.com>
Message-ID: <201505051648.t45GmpA4025308@lists-il.int.omniti.net>

I will replace the hardware with all-SAS drives in about 4 months, but I would love to have a working setup in the meantime as well ;)

I looked at the SMART stats and there don't seem to be any errors. Also, no hard/soft/transport errors are reported by any drive. I will take a look at service times tomorrow, and maybe send the drive stats to Graphite to look at them over a longer period.

I looked at iostat -x output today, and the stats for the pool itself showed 100% busy most of the time, 98-100% wait, 500-1300 transactions in the queue, around 500 active, ... The first line, which is the average since boot, says the average service time is around 1600 ms, which seems like a lot. Could that be due to a really big queue?
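
A back-of-the-envelope check of these figures, offered only as a rough sketch: Little's law says throughput is roughly the number of requests in flight divided by the time each one takes. The helper name est_iops is hypothetical, and pairing the ~500 active requests with the ~1.6 s since-boot average is an assumption, since the two numbers come from different sampling windows.

```shell
#!/bin/sh
# Little's law: throughput = requests in flight / time each request takes.
# est_iops is a hypothetical helper; the inputs are the rough figures above.
est_iops() {
    # $1 = average number of outstanding I/Os
    # $2 = average service time in seconds
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f\n", n / t }'
}

est_iops 500 1.6    # ~500 active I/Os at ~1.6 s each
```

If the estimate is in the right range, a few hundred I/Os per second with hundreds of requests outstanding would suggest that most of the 1.6 s is spent waiting in a queue rather than being serviced, which fits the idea that a deep queue inflates the reported service time.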

Would it help to create five 10-drive raidz pools instead of one pool with 50 drives?

Matej



From narayan.desai at gmail.com  Tue May  5 17:24:05 2015
From: narayan.desai at gmail.com (Narayan Desai)
Date: Tue, 5 May 2015 12:24:05 -0500
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and
	then resumes
In-Reply-To: <5548f474.24cab40a.12a2.7329SMTPIN_ADDED_MISSING@mx.google.com>
References: <55487539.6030408@zunaj.si>
	<a1e7408b7a614dc4c3e96a85459bad62@miras.org>
	<CABweQmLrOgD_utd_8HV4u5KzTXOH2xTjpG4KhgiNQUt+R2_LvA@mail.gmail.com>
	<5548f474.24cab40a.12a2.7329SMTPIN_ADDED_MISSING@mx.google.com>
Message-ID: <CABweQmLrranJ1A6O=JE08OFLqVSjieH7WP5niA0UR_7K2uFevg@mail.gmail.com>

If the theory is that you have a small number of drives causing trouble,
then smaller raid sets would probably help, depending on the number of
marginal devices you have.

I bet that you see a few drives pegged when you start looking at device
level service times.
 -nld


From danmcd at omniti.com  Tue May  5 17:32:17 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 5 May 2015 13:32:17 -0400
Subject: [OmniOS-discuss] FLAG DAY - 4719 affects nightly, package, and poold
Message-ID: <1BAFD21A-EF91-4C6E-8A2A-4D2AB691574E@omniti.com>

Illumos #4719 introduces a flag day for people who build illumos-gate.
Starting now, you will need a Java Developers Kit (JDK) 7 or later.
OpenIndiana 151a9 does NOT have this by default.  Builders must either set
JAVA_ROOT to a source of JDK7, or must have /usr/java populated with JDK7.
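
A small sketch for checking which JDK a given JAVA_ROOT actually provides
before kicking off a build. It parses the first line of "java -version"
output, which uses the old "1.N" numbering for JDK 8 and earlier; jdk_major
is a hypothetical helper name, not part of illumos-gate.

```shell
#!/bin/sh
# Sketch: print the major JDK version from "java -version" output on stdin.
#   java version "1.7.0_80"   -> 7   (old 1.N numbering)
#   openjdk version "11.0.2"  -> 11
# jdk_major is a hypothetical helper, not part of the illumos build tools.
jdk_major() {
    awk -F'"' 'NR == 1 {
        split($2, v, ".")
        m = (v[1] == 1) ? v[2] : v[1]
        print m + 0
    }'
}

# Possible usage ("java -version" writes to stderr, hence the redirect):
#   "$JAVA_ROOT/bin/java" -version 2>&1 | jdk_major
```

A result below 7 means the packaging step can be expected to fail as
described in this message.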

Users still on JDK6 will see build errors in the packaging portions, such
as:

==== package build errors (non-DEBUG) ====

dmake: Warning: Command failed for target `packages.i386/developer-dtrace.dep'
dmake: Warning: Command failed for target `packages.i386/service-network-dns-mdns.dep'
dmake: Warning: Target `install' not remade because of errors

.  .  .


These are due to javadoc changes between 6 and 7.  The dtrace and mdns
packages generate javadoc, so their packaging manifests are updated to the 7
versions.


ALSO, because poold defines JAVA_ROOT in its binaries, you must set JAVA_ROOT
when building poold to match the runtime java on your ONU or
otherwise-packaged target.  For example, on my OI 151a9 test builder, I
untarred an openjdk7 in /usr/jdk/instances/openjdk7/ and set
JAVA_ROOT=/usr/jdk/instances/openjdk7/.


IMPORTANT --> If you are an OI 151a9 user, and wish to use poold, installing
openjdk7 in instances is not sufficient.  You will need to set /usr/java to
point to the openjdk7 instance as well.  Illumos bug 5851 tracks this.


This change is the last of several steps that will allow other platforms
(like OmniOS, e.g.) to build stock illumos-gate.  I will post a separate note
on building illumos-gate on OmniOS.


Thanks to Richard PALO for creating these diffs in the first place.

Thanks!
Dan


From danmcd at omniti.com  Tue May  5 17:32:21 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 5 May 2015 13:32:21 -0400
Subject: [OmniOS-discuss] HEADS UP -- illumos-gate can now be built on
	OmniOS r151014 or later
Message-ID: <D8F62323-340B-4334-8BBC-9DE7198ADAC9@omniti.com>

With the pushes of 4719, 5878, and 5879, one may now build stock illumos-gate
on OmniOS, revisions r151014 or later.

An OmniOS .env file will need certain variables set.  I'm attaching a sample
one I use, but I will go over the critical variables here.

To build on OmniOS, you must:

1.) Use "gcc only" build.

# GCC-only, REQUIRED for building on OmniOS.
__GNUC="";           export __GNUC
CW_NO_SHADOW=1;     export CW_NO_SHADOW


2.) Use ONLY_LINT_DEFS along with the sunstudio12.1 version of lint you can
get as a binary with OmniOS:

# Lint if you have the OmniOS-supplied usable-for-linting-only sunstudio12.1.
ONLY_LINT_DEFS=-I${SPRO_ROOT}/sunstudio12.1/prod/include/lint; export ONLY_LINT_DEFS


3.) Change the GCC_ROOT to OmniOS's.  You have to do this for illumos-omnios
as well, so this shouldn't be shocking:

GCC_ROOT=/opt/gcc-4.4.4/;           export GCC_ROOT


4.) Set the PERL_* variables to cope with OmniOS using perl 5.16.1:

# These are required for building on OmniOS.
export PERL_VERSION=5.16.1
export PERL_PKGVERS=-5161
export PERL_ARCH=i86pc-solaris-thread-multi-64int


5.) Like with illumos-omnios, set ONNV_BUILDNUM to THE SAME release as you
wish to ONU from.  So if you're building mid-2015's bloody, set it to 151015,
if you're ONUing the current stable, use 151014:

# SET ONNV_BUILDNUM appropriately - to ONU r151014, set this to 151014.
export ONNV_BUILDNUM=151014


Please note that if you build illumos-gate on OmniOS, you cannot ONU a
non-OmniOS machine with the generated packages.  You CAN ONU an OmniOS
machine, however (just make sure ONNV_BUILDNUM matches the release you wish
to ONU from).
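
One way to guard against a missed setting is a small pre-flight check run
after sourcing the env file. This is only a sketch: check_omnios_env is a
hypothetical helper, not part of illumos-gate or the attached env file, and
it only checks that the variables named in steps 1-5 are set, not that their
values are right.

```shell
#!/bin/sh
# Sketch: confirm the variables from steps 1-5 are set before building.
# check_omnios_env is a hypothetical helper, not part of illumos-gate.
check_omnios_env() {
    missing=0
    # __GNUC is deliberately set to the empty string, so test for "set"
    # rather than "non-empty".
    if [ -z "${__GNUC+set}" ]; then
        echo "missing: __GNUC" >&2
        missing=1
    fi
    for name in CW_NO_SHADOW ONLY_LINT_DEFS GCC_ROOT \
                PERL_VERSION PERL_PKGVERS PERL_ARCH ONNV_BUILDNUM; do
        eval "value=\${$name-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible usage, after sourcing your env file in the same shell:
#   check_omnios_env || echo "fix the env file before building"
```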

Thanks!
Dan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: illumos-gate-omnios.env.sh
Type: application/octet-stream
Size: 8379 bytes
Desc: not available
URL: <https://omniosce.org/ml-archive/attachments/20150505/0b0cdd64/attachment-0001.obj>

From doug at will.to  Tue May  5 20:28:27 2015
From: doug at will.to (Doug Hughes)
Date: Tue, 5 May 2015 16:28:27 -0400
Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained
 high write loads
In-Reply-To: <20150505161554.CFA547A05A8@apps0.cs.toronto.edu>
References: <F30C25D3-9DA3-4BE4-8D24-B8E57A620141@omniti.com>
	<20150505161554.CFA547A05A8@apps0.cs.toronto.edu>
Message-ID: <CAOpmc6xd+ReZcdRsa6KaHdZUqQME6nambM-BvCHH6LvEpfGbKA@mail.gmail.com>

I managed to get my system into a state, with a dd test across a bunch of
client nodes (4k writes, many nodes in parallel, all to the same file -- by
mistake, I meant to do many files), where all of the ttys except for
/dev/console are stuck. It was showing signs of desperation swapping a few
times, but it seems to have recovered from that. I have killed all of the
write-intensive I/O and the host is mostly fine. Load has fallen, there is
no residual I/O to disks, but the ttys that are not console are still stuck.

I had quite a few pauses in my vmstat output while the memory exhaustion
from write load took place. In contrast, I just can't bring the machine down
with read load, as you might expect. The ARC does an admirable job with the
72GB of RAM and can totally fill up the 10G pipes outbound.

It didn't lock up completely, but it came close, and there's some residual
damage lingering with respect to the ttys.

(config = 2x quad-core Intel Sandy Bridge CPUs in a Sun X4275 with 72GB RAM
and 12x4TB disks)



On Tue, May 5, 2015 at 12:15 PM, Chris Siebenmann <cks at cs.toronto.edu>
wrote:

> > >> The hard part will be testing this. I'm not sure I have the HW
> > >> in-house to do it.  I may need illumos community help.
> > >
> > > Since we have a test environment where we can reproduce this and a
> > > high interest in seeing it fixed, we can test new kernel packages
> > > and so on.
> > >
> > > (If given specific howto instructions we can probably build test
> > > kernels from source, but we've never tried to do any OmniOS source
> > > building before so it may take us some time to get up to speed on
> > > that. It'd be much easier to take a prebuilt test kernel, drop it
> > > in, and go.)
> >
> > I can turn around the whole world in an hour or less and provide
> > ONU images if your'e on 012 or 014. What revision are you running
> > currently? I can also help you get a build-illumos-omnios up and
> > running as well. Pick your favorite.
>
>  For now, the simplest thing is installable kernel images (I assume
> that's ONU images) for r151014, which is what our test environment
> is using now and what we'd wind up on with all of our production
> fileservers[*]. I won't be able to start any testing with the images
> until this afternoon at the earliest, so I don't think it's urgent to
> build them right away.
>
>  Thanks for all of this!
>
>         - cks
> [*: our production fileservers are currently at r151010 but we're
>     already looking at an r151014 upgrade. having this fix as part
>     of r151014 would make that upgrade definite, and there's other
>     things in 14 that we want, eg >16 group support over NFS.
> ]
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>

From dwq at xmweixun.com  Wed May  6 07:49:48 2015
From: dwq at xmweixun.com (dwq at xmweixun.com)
Date: Wed, 6 May 2015 15:49:48 +0800
Subject: [OmniOS-discuss] Re: Writeback Cache Auto disabled
In-Reply-To: <70165454-5855-455D-BE88-8AB444934C45@RichardElling.com>
References: <001901d08708$bff6b910$3fe42b30$@xmweixun.com>
	<70165454-5855-455D-BE88-8AB444934C45@RichardElling.com>
Message-ID: <002801d087d1$3a8454d0$af8cfe70$@xmweixun.com>

Hi Richard

I used "stmfadm modify-lu -p wcd=false <LU Name>" to enable the write cache, but as soon as a client reads from or writes to the LU, the LU status (Writeback Cache) changes back to Disabled.
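
To see exactly when the setting flips, it can help to re-check the LU after each step. The following is only a minimal sketch: it assumes the verbose listing keeps the "LU Name" and "Writeback Cache" fields shown in this thread, and the function name lu_wcache_state is made up for illustration.

```shell
# Print "<LU name> <writeback cache state>" for every LU found in
# `stmfadm list-lu -v` style output read from stdin.
lu_wcache_state() {
    awk -F':' '
        /^[ \t]*LU Name/         { gsub(/[ \t]/, "", $2); lu = $2 }
        /^[ \t]*Writeback Cache/ { gsub(/[ \t]/, "", $2); print lu, $2 }
    '
}

# Possible use on the COMSTAR host (not run here):
#   stmfadm modify-lu -p wcd=false 600144F00000000000005548DC360005
#   stmfadm list-lu -v | lu_wcache_state
```

Running the listing once right after modify-lu and again after the first client I/O would show whether the initiator is the one changing the policy.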

 

 

 

Best Regards,
Deng Wei Quan

Mob: +86 13906055059
Mail: dwq at xmweixun.com

From: dwq+auto_=dengweiquan=139.com at xmweixun.com [mailto:dwq+auto_=dengweiquan=139.com at xmweixun.com] On Behalf Of Richard Elling
Sent: 5 May 2015 23:17
To: dwq at xmweixun.com
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] Writeback Cache Auto disabled

 

 

On May 5, 2015, at 12:54 AM, <dwq at xmweixun.com> wrote:

 

Hi All,

         When I present an LU to HP-UX or AIX, the LU's writeback cache is automatically disabled. Why?

 

In SCSI, initiators can change the write cache policy.

 -- richard





 

LU Name: 600144F00000000000005548DC360005
    Operational Status: Online
    Provider Name     : sbd
    Alias             : /dev/zvol/rdsk/wxnas/hpuxtest03
    View Entry Count  : 1
    Data File         : /dev/zvol/rdsk/wxnas/hpuxtest03
    Meta File         : not set
    Size              : 21474836480
    Block Size        : 512
    Management URL    : not set
    Vendor ID         : SUN
    Product ID        : COMSTAR
    Serial Num        : not set
    Write Protect     : Disabled
    Writeback Cache   : Disabled
    Access State      : Active

 

 

Thanks.

 

Version:

SunOS wxos1 5.11 omnios-b281e50 i86pc i386 i86pc

Deng

 

_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss at lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss

 

--

 

Richard.Elling at RichardElling.com
+1-760-896-4422



 


From danmcd at omniti.com  Wed May  6 14:13:39 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 6 May 2015 10:13:39 -0400
Subject: [OmniOS-discuss] New OmniOS bloody update
Message-ID: <B11C92F3-589F-49E6-8F39-283808E729A9@omniti.com>

Based on omnios-build commit 69a5016 and illumos-omnios commit 385735e.

This is another whole-wad update, even package/pkg.  I've fixed the repo to address the corruption a few of you were seeing.  I think this is the result of me trying to avoid a "pkg update pkg" pre-step last time, but forgetting to rebuild the repository index afterwards.  Anyway, you shouldn't see that now.

Since last time:

- You may now build stock illumos-gate on bloody thanks to the inclusion of lint libraries for some userland packages. (Same ones in r151014.)

- curl is now at 7.42.1.

- ISC DHCP is now at 4.3.2

- Upstream illumos-gate now has a few things we've had in OmniOS for a while. These also enable the building of stock illumos-gate on OmniOS.

- A few networking bugfixes are now upstreamed compliments of Joyent.

- A longstanding tar(1) bug where "tar -xzf" can fail has been fixed.

- You can now host an SMB/CIFS server in a non-global zone.  (NOTE:  sharemgr(1M) isn't zone-aware, so you will have to do it the old-fashioned way, see http://www.listbox.com/member/archive/182179/2015/04/sort/time_rev/page/4/entry/7:552/20150428134823:C190ED2C-EDCE-11E4-98D2-8987C5A0D07F/ for details.)

- Some miscellaneous bugfixes.

Happy updating!
Dan


From dain.bentley at gmail.com  Wed May  6 17:12:09 2015
From: dain.bentley at gmail.com (Dain Bentley)
Date: Wed, 6 May 2015 13:12:09 -0400
Subject: [OmniOS-discuss] restarting ssh on omnios doesn't load new
	parameters
Message-ID: <CALthgeddT8zEskGqKL+VYWU6_mjBQCCKZgeO6J4xJjLhjniG9g@mail.gmail.com>

So I enabled root on SSH to do some work and then disabled root login with
PermitRootLogin no in my sshd_config and used the following:
svcadm restart network/ssh:default

Thing is root can still log in...is this a bug?
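
Before calling it a bug, one thing worth ruling out is a second, conflicting PermitRootLogin line in the file. A minimal sketch, assuming the config lives at /etc/ssh/sshd_config (pass another path if not); the function name show_permit_root_login is made up for illustration, and this only inspects the file, it does not prove what the running daemon has loaded.

```shell
# List every active (uncommented) PermitRootLogin line, with line numbers,
# from the sshd_config file given as $1 (default: /etc/ssh/sshd_config).
show_permit_root_login() {
    conf="${1:-/etc/ssh/sshd_config}"
    grep -n -i '^[[:space:]]*PermitRootLogin' "$conf"
}

# Possible follow-up on the host (not run here):
#   show_permit_root_login
#   svcs -p network/ssh:default    # lists the PIDs that belong to the service
```

If more than one line shows up, the file itself is ambiguous; if only "no" shows up, the next thing to check is whether the sshd process listed by svcs is really the one that was restarted.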

From jdg117 at elvis.arl.psu.edu  Wed May  6 17:50:43 2015
From: jdg117 at elvis.arl.psu.edu (John D Groenveld)
Date: Wed, 06 May 2015 13:50:43 -0400
Subject: [OmniOS-discuss] restarting ssh on omnios doesn't load new
	parameters
In-Reply-To: Your message of "Wed, 06 May 2015 13:12:09 EDT."
	<CALthgeddT8zEskGqKL+VYWU6_mjBQCCKZgeO6J4xJjLhjniG9g@mail.gmail.com> 
References: <CALthgeddT8zEskGqKL+VYWU6_mjBQCCKZgeO6J4xJjLhjniG9g@mail.gmail.com>
Message-ID: <201505061750.t46Hoh3m002331@elvis.arl.psu.edu>

In message <CALthgeddT8zEskGqKL+VYWU6_mjBQCCKZgeO6J4xJjLhjniG9g at mail.gmail.com>, Dain Bentley writes:
>So I enabled root on SSH to do some work and then disabled root login with
>PermitRootLogin no in my sshd_config and used the following:
>svcadm restart network/ssh:default
>
>Thing is root can still log in...is this a bug?

Did you SIGHUP the right PID?

John
groenveld at acm.org

From dain.bentley at gmail.com  Wed May  6 20:29:52 2015
From: dain.bentley at gmail.com (Dain Bentley)
Date: Wed, 6 May 2015 16:29:52 -0400
Subject: [OmniOS-discuss] restarting ssh on omnios doesn't load new
	parameters
In-Reply-To: <201505061750.t46Hoh3m002331@elvis.arl.psu.edu>
References: <CALthgeddT8zEskGqKL+VYWU6_mjBQCCKZgeO6J4xJjLhjniG9g@mail.gmail.com>
	<201505061750.t46Hoh3m002331@elvis.arl.psu.edu>
Message-ID: <CALthgeeXqQEw7=vVPYukzMgXfUd2U6XxgB4KohtS_beZusMVvg@mail.gmail.com>

I used svcadm

On Wed, May 6, 2015 at 1:50 PM, John D Groenveld <jdg117 at elvis.arl.psu.edu>
wrote:

> In message <
> CALthgeddT8zEskGqKL+VYWU6_mjBQCCKZgeO6J4xJjLhjniG9g at mail.gmail.com>, Dain
> Bentley writes:
> >So I enabled root on SSH to do some work and then disabled root login with
> >PermitRootLogin no in my sshd_config and used the following:
> >svcadm restart network/ssh:default
> >
> >Thing is root can still log in...is this a bug?
>
> Did you SIGHUP the right PID?
>
> John
> groenveld at acm.org
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>

From jimklimov at cos.ru  Wed May  6 21:35:52 2015
From: jimklimov at cos.ru (Jim Klimov)
Date: Wed, 06 May 2015 23:35:52 +0200
Subject: [OmniOS-discuss] [developer] FLAG DAY - 4719 affects nightly,
	package, and poold
In-Reply-To: <1BAFD21A-EF91-4C6E-8A2A-4D2AB691574E@omniti.com>
References: <1BAFD21A-EF91-4C6E-8A2A-4D2AB691574E@omniti.com>
Message-ID: <99D121CC-C484-4A74-8453-E8B42F5BF57E@cos.ru>

On 5 May 2015 19:32:17 CEST, Dan McDonald <danmcd at omniti.com> wrote:
>Illumos #4719 introduces a flag day for people who build illumos-gate.
>Starting now, you will need a Java Developers Kit (JDK) 7 or later.
>OpenIndiana 151a9 does NOT have this by default.  Builders must either
>set
>JAVA_ROOT to a source of JDK7, or must have /usr/java populated with
>JDK7.
>
>Users still on JDK6 will see build errors in the packaging portions
>like
>such:
>
>==== package build errors (non-DEBUG) ====
>
>dmake: Warning: Command failed for target
>`packages.i386/developer-dtrace.dep'
>dmake: Warning: Command failed for target
>`packages.i386/service-network-dns-mdns.dep'
>dmake: Warning: Target `install' not remade because of errors
>
>.  .  .
>
>
>These are due to javadoc changes between 6 and 7.  The dtrace and mdns
>packages generate javadoc, so their packaging manifests are updated to
>the 7
>versions.
>
>
>ALSO, because poold defines JAVA_ROOT in its binaries, you must set
>JAVA_ROOT
>when building poold to match the runtime java on your ONU or
>otherwise-packaged target.  For example, on my OI 151a9 test builder,
>I
>untarred an openjdk7 in /usr/jdk/instances/openjdk7/ and set
>JAVA_ROOT=/usr/jdk/instances/openjdk7/.
>
>
>IMPORTANT --> If you are an OI 151a9 user, and wish to use poold,
>installing
>openjdk7 in instances is not sufficient.  You will need to set
>/usr/java to
>point to the openjdk7 instance as well.  Illumos bug 5851 tracks this.
>
>
>This change is the last of several steps that will allow other
>platforms
>(like OmniOS, e.g.) to build stock illumos-gate.  I will post a
>separate note
>on building illumos-gate on OmniOS.
>
>
>Thanks to Richard PALO for creating these diffs in the first place.
>
>Thanks!
>Dan
>
>
>
>-------------------------------------------
>illumos-developer
>Archives: https://www.listbox.com/member/archive/182179/=now
>RSS Feed:
>https://www.listbox.com/member/archive/rss/182179/22416750-c03c8c44
>Modify Your Subscription:
>https://www.listbox.com/member/?member_id=22416750&id_secret=22416750-eb7e3ed7
>Powered by Listbox: http://www.listbox.com
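
For anyone scripting the JAVA_ROOT part of the note above, here is a minimal sketch. It assumes a usable JDK can be recognised by an executable bin/javac under the candidate directory; it does not check that the JDK is version 7 or later, and the function name pick_java_root is made up for illustration.

```shell
# Print a JAVA_ROOT assignment if $1 looks like a JDK (has an executable
# bin/javac); otherwise complain on stderr and return non-zero.
pick_java_root() {
    if [ -x "$1/bin/javac" ]; then
        printf 'JAVA_ROOT=%s\n' "$1"
    else
        echo "no javac under $1" >&2
        return 1
    fi
}

# Possible use, with the path from the quoted note (not run here):
#   pick_java_root /usr/jdk/instances/openjdk7
```

The printed line can be pasted into a build environment file; whether the build picks it up from there depends on the local setup.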


Out of curiosity: Did you happen to check if the newer JDK magically solves the problem with Sun DHCP builds not producing functional bits of software for the past year or two?

Jim
--
Typos courtesy of K-9 Mail on my Samsung Android

From doug at will.to  Thu May  7 01:53:48 2015
From: doug at will.to (Doug Hughes)
Date: Wed, 06 May 2015 21:53:48 -0400
Subject: [OmniOS-discuss] strange local repository corruption
Message-ID: <554AC5AC.2080409@will.to>

this is a relatively fresh install and not much going on on the machine, 
and I see somehow that the pkg repo got corrupted relatively recently on r14

t at x4275-3-15-20:/usr/local/orca-r535# pkg refresh

An error was encountered while attempting to read image state information
to perform the requested operation.  Details follow:

Catalog file '/var/pkg/state/installed/catalog.attrs' is invalid.
Use 'pkgrepo rebuild' to create a new package catalog.
root at x4275-3-15-20:/usr/local/orca-r535#
ot at x4275-3-15-20:/usr/local/orca-r535# pkgrepo rebuild
pkgrepo rebuild: A package repository location must be provided using -s.
Try `pkgrepo --help or -?' for more information.
root at x4275-3-15-20:/usr/local/orca-r535# ls -l /var/pkg/state/known/
total 2
-rw-r--r-- 1 root root 0 May  6 01:50 catalog.attrs
-rw-r--r-- 1 root root 0 May  6 01:50 catalog.base.C
-rw-r--r-- 1 root root 0 May  6 01:50 catalog.dependency.C
-rw-r--r-- 1 root root 0 May  6 01:50 catalog.summary.C
root at x4275-3-15-20:/usr/local/orca-r535#

weird huh? any easy way to recover from this?
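
In case anyone wants to check their own boxes for the same symptom, here is a minimal sketch that lists zero-length catalog files. It assumes the image state lives under /var/pkg/state as in the listing above; the function name empty_catalogs is made up for illustration, and it only reports, it repairs nothing.

```shell
# List zero-length catalog.* files under the directory given as $1
# (default: /var/pkg/state).
empty_catalogs() {
    find "${1:-/var/pkg/state}" -type f -name 'catalog.*' -size 0
}

# Possible use on the affected host (not run here):
#   empty_catalogs
```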


From danmcd at omniti.com  Thu May  7 03:00:39 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 6 May 2015 23:00:39 -0400
Subject: [OmniOS-discuss] strange local repository corruption
In-Reply-To: <554AC5AC.2080409@will.to>
References: <554AC5AC.2080409@will.to>
Message-ID: <B6FB38BA-7CE0-4BBB-AC2B-5872F2D7FA01@omniti.com>

pkgrepo rebuild -s <repo-path>

Dan

Sent from my iPhone (typos, autocorrect, and all)

> On May 6, 2015, at 9:53 PM, Doug Hughes <doug at will.to> wrote:
> 
> this is a relatively fresh install and not much going on on the machine, and I see somehow that the pkg repo got corrupted relatively recently on r14
> 
> t at x4275-3-15-20:/usr/local/orca-r535# pkg refresh
> 
> An error was encountered while attempting to read image state information
> to perform the requested operation.  Details follow:
> 
> Catalog file '/var/pkg/state/installed/catalog.attrs' is invalid.
> Use 'pkgrepo rebuild' to create a new package catalog.
> root at x4275-3-15-20:/usr/local/orca-r535#
> ot at x4275-3-15-20:/usr/local/orca-r535# pkgrepo rebuild
> pkgrepo rebuild: A package repository location must be provided using -s.
> Try `pkgrepo --help or -?' for more information.
> root at x4275-3-15-20:/usr/local/orca-r535# ls -l /var/pkg/state/known/
> total 2
> -rw-r--r-- 1 root root 0 May  6 01:50 catalog.attrs
> -rw-r--r-- 1 root root 0 May  6 01:50 catalog.base.C
> -rw-r--r-- 1 root root 0 May  6 01:50 catalog.dependency.C
> -rw-r--r-- 1 root root 0 May  6 01:50 catalog.summary.C
> root at x4275-3-15-20:/usr/local/orca-r535#
> 
> weird huh? any easy way to recover from this?
> 
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

From doug at will.to  Thu May  7 03:08:31 2015
From: doug at will.to (Doug Hughes)
Date: Wed, 06 May 2015 23:08:31 -0400
Subject: [OmniOS-discuss] strange local repository corruption
In-Reply-To: <B6FB38BA-7CE0-4BBB-AC2B-5872F2D7FA01@omniti.com>
References: <554AC5AC.2080409@will.to>
	<B6FB38BA-7CE0-4BBB-AC2B-5872F2D7FA01@omniti.com>
Message-ID: <554AD72F.8090503@will.to>

Didn't work. still 0. I ended up copying the repo from another machine 
built at the same time and then doing a refresh.


On 5/6/2015 11:00 PM, Dan McDonald wrote:
> pkgrepo rebuild -s <repo-path>
>
> Dan
>
> Sent from my iPhone (typos, autocorrect, and all)
>
>> On May 6, 2015, at 9:53 PM, Doug Hughes <doug at will.to> wrote:
>>
>> this is a relatively fresh install and not much going on on the machine, and I see somehow that the pkg repo got corrupted relatively recently on r14
>>
>> t at x4275-3-15-20:/usr/local/orca-r535# pkg refresh
>>
>> An error was encountered while attempting to read image state information
>> to perform the requested operation.  Details follow:
>>
>> Catalog file '/var/pkg/state/installed/catalog.attrs' is invalid.
>> Use 'pkgrepo rebuild' to create a new package catalog.
>> root at x4275-3-15-20:/usr/local/orca-r535#
>> ot at x4275-3-15-20:/usr/local/orca-r535# pkgrepo rebuild
>> pkgrepo rebuild: A package repository location must be provided using -s.
>> Try `pkgrepo --help or -?' for more information.
>> root at x4275-3-15-20:/usr/local/orca-r535# ls -l /var/pkg/state/known/
>> total 2
>> -rw-r--r-- 1 root root 0 May  6 01:50 catalog.attrs
>> -rw-r--r-- 1 root root 0 May  6 01:50 catalog.base.C
>> -rw-r--r-- 1 root root 0 May  6 01:50 catalog.dependency.C
>> -rw-r--r-- 1 root root 0 May  6 01:50 catalog.summary.C
>> root at x4275-3-15-20:/usr/local/orca-r535#
>>
>> weird huh? any easy way to recover from this?
>>
>> _______________________________________________
>> OmniOS-discuss mailing list
>> OmniOS-discuss at lists.omniti.com
>> http://lists.omniti.com/mailman/listinfo/omnios-discuss


From paladinemishakal at gmail.com  Thu May  7 10:36:29 2015
From: paladinemishakal at gmail.com (Lawrence Giam)
Date: Thu, 7 May 2015 18:36:29 +0800
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and
	uname
In-Reply-To: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
References: <CANKMAMYdUGS4WzSbzNDEc3zyy_bfrf_642jCXNo7FjRfqjxZdw@mail.gmail.com>
	<4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
Message-ID: <CAGueQCdxsyu7a44qFWoesK8XCOLQ1AG84vXz66o1JGfjv=Jqpg@mail.gmail.com>

With the current R151014 having an NFS issue, I may decide to stay on R151012
for a while. But people are forgetful, so is there a way to update a file to
reflect the build version?
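
One possible approach, not something anyone in this thread has recommended: append a dated note with a hand-supplied release label next to whatever uname -v reports. The function name record_build and the default file name omnios-build-note are made up for illustration. The label is supplied by hand because uname -v alone is not reliable on the updated r151012 (it reports "illumos-omnios").

```shell
# Append "<UTC date> <label> <uname -v output>" to the file given as $2
# (default: ./omnios-build-note). $1 is a free-form release label.
record_build() {
    label="$1"
    out="${2:-./omnios-build-note}"
    printf '%s %s %s\n' "$(date -u +%Y-%m-%d)" "$label" "$(uname -v)" >> "$out"
}

# Possible use after each update (not run here):
#   record_build r151012 /root/omnios-build-note
```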

On Tue, May 5, 2015 at 1:43 AM, Dan McDonald <danmcd at omniti.com> wrote:

>
> > On May 4, 2015, at 1:10 PM, Davide Poletto <davide.poletto at gmail.com>
> wrote:
> >
> > Just to say I've noticed that uname -v reports "illumos-omnios" on a
> > OmniOS 151012 which was "omnios-10b9c79" after I updated it today
> > (packages released on 17.04.2015 at official repository):
> >
> > OmniOS 5.11     omnios-10b9c79  September 2014
> > root at nas:/root#
> >
> > OmniOS 5.11     illumos-omnios  April 2015
> > root at nas:/root#
> >
> > Is that OK/by Design?
>
> That was my fault during the kernel build.  I had the wrong variable set
> in my .env file.
>
> > Just for reference on OmniOS 151014, after the same big set of updates
> > (released the same day, 17.04.2015), the uname -v changed from
> > "omnios-a708424" (from its ISO install) to "omnios-170cea2".
>
> Yes, I believe only r151012 was affected poorly by this.  Since 012 is in
> its last 6 months of support life, I'm not particularly concerned.
>
> Dan
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>

From danmcd at omniti.com  Thu May  7 14:00:22 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 7 May 2015 10:00:22 -0400
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and
	uname
In-Reply-To: <CAGueQCdxsyu7a44qFWoesK8XCOLQ1AG84vXz66o1JGfjv=Jqpg@mail.gmail.com>
References: <CANKMAMYdUGS4WzSbzNDEc3zyy_bfrf_642jCXNo7FjRfqjxZdw@mail.gmail.com>
	<4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
	<CAGueQCdxsyu7a44qFWoesK8XCOLQ1AG84vXz66o1JGfjv=Jqpg@mail.gmail.com>
Message-ID: <565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com>


> On May 7, 2015, at 6:36 AM, Lawrence Giam <paladinemishakal at gmail.com> wrote:
> 
> With the current R151014 having NFS issue,

What NFS issue?  Are you confusing '012 and '014?  '012 had some lock manager corner-cases, but '014 has fixed that.

Dan


From danmcd at omniti.com  Thu May  7 14:01:13 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 7 May 2015 10:01:13 -0400
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and
	uname
In-Reply-To: <565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com>
References: <CANKMAMYdUGS4WzSbzNDEc3zyy_bfrf_642jCXNo7FjRfqjxZdw@mail.gmail.com>
	<4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
	<CAGueQCdxsyu7a44qFWoesK8XCOLQ1AG84vXz66o1JGfjv=Jqpg@mail.gmail.com>
	<565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com>
Message-ID: <598573DD-D37C-45C2-873B-6252CD729775@omniti.com>


> On May 7, 2015, at 10:00 AM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> 
>> On May 7, 2015, at 6:36 AM, Lawrence Giam <paladinemishakal at gmail.com> wrote:
>> 
>> With the current R151014 having NFS issue,
> 
> What NFS issue?  Are you confusing '012 and '014?  '012 had some lock manager corner-cases, but '014 has fixed that.

And if you're talking about the one Chris S. has reported --> it's also present in 010 and 012, and likely earlier as well.

Dan


From john.barfield at bissinc.com  Thu May  7 19:22:56 2015
From: john.barfield at bissinc.com (John Barfield)
Date: Thu, 7 May 2015 19:22:56 +0000
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and
 uname
In-Reply-To: <598573DD-D37C-45C2-873B-6252CD729775@omniti.com>
References: <CANKMAMYdUGS4WzSbzNDEc3zyy_bfrf_642jCXNo7FjRfqjxZdw@mail.gmail.com>
	<4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
	<CAGueQCdxsyu7a44qFWoesK8XCOLQ1AG84vXz66o1JGfjv=Jqpg@mail.gmail.com>
	<565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com>
	<598573DD-D37C-45C2-873B-6252CD729775@omniti.com>
Message-ID: <05A50C77-7549-4DD0-9886-BF30D5904131@bissinc.com>

Hey Dan has the statd NFS bug been resolved in this release?

https://www.illumos.org/issues/4518


If not, I already built the binary, if it would be helpful to anyone.



John Barfield / Sr Principal Engineer
+1 (214) 425-0783/ john.barfield at bissinc.com
BISS, Inc. Office: +1 (214) 506-8354 

4925 Greenville Ave Suite 900
Dallas, TX 75206
support.bissinc.com <http://htmlsig.com/support.bissinc.com>
This e-mail message may contain confidential or legally privileged 
information and is intended only for the use of the intended recipient(s). 
Any unauthorized disclosure, dissemination, distribution, copying or the 
taking of any action in reliance on the information herein is prohibited. 
E-mails are not secure and cannot be guaranteed to be error free as they 
can be intercepted, amended, or contain viruses. Anyone who communicates 
with us by e-mail is deemed to have accepted these risks. Company Name is 
not responsible for errors or omissions in this message and denies any 
responsibility for any damage arising from the use of e-mail. Any opinion 
and other statement contained in this message and any attachment are 
solely those of the author and do not necessarily represent those of the 
company.







On 5/7/15, 9:01 AM, "Dan McDonald" <danmcd at omniti.com> wrote:

>
>> On May 7, 2015, at 10:00 AM, Dan McDonald <danmcd at omniti.com> wrote:
>> 
>> 
>>> On May 7, 2015, at 6:36 AM, Lawrence Giam <paladinemishakal at gmail.com> 
>>>wrote:
>>> 
>>> With the current R151014 having NFS issue,
>> 
>> What NFS issue?  Are you confusing '012 and '014?  '012 had some lock 
>>manager corner-cases, but '014 has fixed that.
>
>And if you're talking about the one Chris S. has reported --> it's also 
>present in 010 and 012, and likely earlier as well.
>
>Dan
>
>_______________________________________________
>OmniOS-discuss mailing list
>OmniOS-discuss at lists.omniti.com
>http://lists.omniti.com/mailman/listinfo/omnios-discuss

From danmcd at omniti.com  Thu May  7 19:26:04 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 7 May 2015 15:26:04 -0400
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and
	uname
In-Reply-To: <05A50C77-7549-4DD0-9886-BF30D5904131@bissinc.com>
References: <CANKMAMYdUGS4WzSbzNDEc3zyy_bfrf_642jCXNo7FjRfqjxZdw@mail.gmail.com>
	<4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
	<CAGueQCdxsyu7a44qFWoesK8XCOLQ1AG84vXz66o1JGfjv=Jqpg@mail.gmail.com>
	<565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com>
	<598573DD-D37C-45C2-873B-6252CD729775@omniti.com>
	<05A50C77-7549-4DD0-9886-BF30D5904131@bissinc.com>
Message-ID: <91E2CF48-BD99-4D1F-AA17-D21C48BE5F2A@omniti.com>


> On May 7, 2015, at 3:22 PM, John Barfield <john.barfield at bissinc.com> wrote:
> 
> Hey Dan has the statd NFS bug been resolved in this release?
> 
> https://www.illumos.org/issues/4518

4518 is in OmniOS r151014:

https://github.com/omniti-labs/illumos-omnios/commit/98573c1925f3692d1e8ea9eb018cb915fc0becc5

And:

bloody(~/ws/illumos-omnios)[0]% git branch -r --contains 98573c1925f3692d1e8ea9eb018cb915fc0becc5
  origin/HEAD -> origin/master
  origin/master
  origin/r151014
  origin/upstream
bloody(~/ws/illumos-omnios)[0]% 
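
The same check can be scripted. A minimal sketch, assuming the output format of "git branch -r --contains" shown above (one branch per line, the name in the first column); the function name branch_listed is made up for illustration.

```shell
# Read `git branch -r --contains <commit>` output on stdin and succeed
# only if the branch named in $1 is listed.
branch_listed() {
    awk -v want="$1" '{ if ($1 == want) found = 1 } END { exit !found }'
}

# Possible use inside a clone of illumos-omnios (not run here):
#   git branch -r --contains 98573c1925f3692d1e8ea9eb018cb915fc0becc5 \
#       | branch_listed origin/r151014 && echo "fix is in r151014"
```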


Dan



From john.barfield at bissinc.com  Thu May  7 19:26:53 2015
From: john.barfield at bissinc.com (John Barfield)
Date: Thu, 7 May 2015 19:26:53 +0000
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and
 uname
In-Reply-To: <91E2CF48-BD99-4D1F-AA17-D21C48BE5F2A@omniti.com>
References: <CANKMAMYdUGS4WzSbzNDEc3zyy_bfrf_642jCXNo7FjRfqjxZdw@mail.gmail.com>
	<4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
	<CAGueQCdxsyu7a44qFWoesK8XCOLQ1AG84vXz66o1JGfjv=Jqpg@mail.gmail.com>
	<565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com>
	<598573DD-D37C-45C2-873B-6252CD729775@omniti.com>
	<05A50C77-7549-4DD0-9886-BF30D5904131@bissinc.com>
	<91E2CF48-BD99-4D1F-AA17-D21C48BE5F2A@omniti.com>
Message-ID: <99021AD4-5AE3-41D0-BB68-494DD57059FD@bissinc.com>

Okay awesome! I've been holding back on upgrading but now I probably will.


John Barfield / Sr Principal Engineer
+1 (214) 425-0783/ john.barfield at bissinc.com
BISS, Inc. Office: +1 (214) 506-8354 

4925 Greenville Ave Suite 900
Dallas, TX 75206
support.bissinc.com <http://htmlsig.com/support.bissinc.com>






On 5/7/15, 2:26 PM, "Dan McDonald" <danmcd at omniti.com> wrote:

>
>> On May 7, 2015, at 3:22 PM, John Barfield <john.barfield at bissinc.com> 
>>wrote:
>> 
>> Hey Dan has the statd NFS bug been resolved in this release?
>> 
>> https://www.illumos.org/issues/4518
>
>4518 is in OmniOS r151014:
>
>https://github.com/omniti-labs/illumos-omnios/commit/98573c1925f3692d1e8ea9eb018cb915fc0becc5
>
>And:
>
>bloody(~/ws/illumos-omnios)[0]% git branch -r --contains 98573c1925f3692d1e8ea9eb018cb915fc0becc5
>  origin/HEAD -> origin/master
>  origin/master
>  origin/r151014
>  origin/upstream
>bloody(~/ws/illumos-omnios)[0]% 
>
>
>Dan
>
>

From doug at will.to  Thu May  7 20:33:30 2015
From: doug at will.to (Doug Hughes)
Date: Thu, 7 May 2015 16:33:30 -0400
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and
	uname
In-Reply-To: <99021AD4-5AE3-41D0-BB68-494DD57059FD@bissinc.com>
References: <CANKMAMYdUGS4WzSbzNDEc3zyy_bfrf_642jCXNo7FjRfqjxZdw@mail.gmail.com>
	<4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
	<CAGueQCdxsyu7a44qFWoesK8XCOLQ1AG84vXz66o1JGfjv=Jqpg@mail.gmail.com>
	<565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com>
	<598573DD-D37C-45C2-873B-6252CD729775@omniti.com>
	<05A50C77-7549-4DD0-9886-BF30D5904131@bissinc.com>
	<91E2CF48-BD99-4D1F-AA17-D21C48BE5F2A@omniti.com>
	<99021AD4-5AE3-41D0-BB68-494DD57059FD@bissinc.com>
Message-ID: <CAOpmc6zpQT4KpyUm1YHrNOwnLpo8=EtQoBBR7KDpu2196MGHQQ@mail.gmail.com>

I can confirm that no more nlockmgr's for us since going with r14!

On Thu, May 7, 2015 at 3:26 PM, John Barfield <john.barfield at bissinc.com>
wrote:

> Okay awesome! I've been holding back on upgrading but now I probably will.
>
>
> John Barfield / Sr Principal Engineer
> +1 (214) 425-0783/ john.barfield at bissinc.com
> BISS, Inc. Office: +1 (214) 506-8354
>
> 4925 Greenville Ave Suite 900
> Dallas, TX 75206
> support.bissinc.com <http://htmlsig.com/support.bissinc.com>
>
>
>
>
>
>
> On 5/7/15, 2:26 PM, "Dan McDonald" <danmcd at omniti.com> wrote:
>
> >
> >> On May 7, 2015, at 3:22 PM, John Barfield <john.barfield at bissinc.com>
> >>wrote:
> >>
> >> Hey Dan has the statd NFS bug been resolved in this release?
> >>
> >> https://www.illumos.org/issues/4518
> >
> >4518 is in OmniOS r151014:
> >
> >
> >https://github.com/omniti-labs/illumos-omnios/commit/98573c1925f3692d1e8ea9eb018cb915fc0becc5
> >
> >And:
> >
> >bloody(~/ws/illumos-omnios)[0]% git branch -r --contains 98573c1925f3692d1e8ea9eb018cb915fc0becc5
> >  origin/HEAD -> origin/master
> >  origin/master
> >  origin/r151014
> >  origin/upstream
> >bloody(~/ws/illumos-omnios)[0]%
> >
> >
> >Dan
> >
> >
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>

From skiselkov.ml at gmail.com  Fri May  8 16:48:26 2015
From: skiselkov.ml at gmail.com (Saso Kiselkov)
Date: Fri, 08 May 2015 18:48:26 +0200
Subject: [OmniOS-discuss] What repos do people use to build a *AMP server?
Message-ID: <554CE8DA.6010504@gmail.com>

I've decided to try and update my r151006 box to something newer, seeing
as r151014 just came out and it's supposed to be LTS. Trouble is, I'm
trying to build a *AMP box and I can't find any prebuilt packages for it
in any of these repos:
http://omnios.omniti.com/wiki.php/Packaging
What do you guys use for getting pre-built software? Do all people here
just roll their own?

Also, allow me to say, I *hate* consolidations and the way they lock
accessible package versions. Where are the days when OSes used to be
backwards-compatible?

Cheers,
-- 
Saso

From chip at innovates.com  Fri May  8 16:56:30 2015
From: chip at innovates.com (Schweiss, Chip)
Date: Fri, 8 May 2015 11:56:30 -0500
Subject: [OmniOS-discuss] What repos do people use to build a *AMP
	server?
In-Reply-To: <554CE8DA.6010504@gmail.com>
References: <554CE8DA.6010504@gmail.com>
Message-ID: <CALeZrrSA0JhfeeX0wcP-YjrMU4JwJJ6-fuHvOYXJCYg22ZBBVA@mail.gmail.com>

I've done really well with the OpenCSW packages on OmniOS.

-Chip
On May 8, 2015 11:50 AM, "Saso Kiselkov" <skiselkov.ml at gmail.com> wrote:

> I've decided to try and update my r151006 box to something newer, seeing
> as r151014 just came out and it's supposed to be LTS. Trouble is, I'm
> trying to build a *AMP box and I can't find any prebuilt packages for it
> in any of these repos:
> http://omnios.omniti.com/wiki.php/Packaging
> What do you guys use for getting pre-built software? Do all people here
> just roll their own?
>
> Also, allow me to say, I *hate* consolidations and the way they lock
> accessible package versions. Where are the days when OSes used to be
> backwards-compatible?
>
> Cheers,
> --
> Saso
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>

From mir at miras.org  Fri May  8 17:11:02 2015
From: mir at miras.org (Michael Rasmussen)
Date: Fri, 8 May 2015 19:11:02 +0200
Subject: [OmniOS-discuss] What repos do people use to build a *AMP
 server?
In-Reply-To: <CALeZrrSA0JhfeeX0wcP-YjrMU4JwJJ6-fuHvOYXJCYg22ZBBVA@mail.gmail.com>
References: <554CE8DA.6010504@gmail.com>
	<CALeZrrSA0JhfeeX0wcP-YjrMU4JwJJ6-fuHvOYXJCYg22ZBBVA@mail.gmail.com>
Message-ID: <20150508191102.3e6089fc@sleipner.datanom.net>

On Fri, 8 May 2015 11:56:30 -0500
"Schweiss, Chip" <chip at innovates.com> wrote:

> I've done really well with the OpenCSW packages on OmniOS.
> 
There is also a fine repository here:
http://pkg.niksula.hut.fi/en/index.shtml

-- 
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael <at> rasmussen <dot> cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir <at> datanom <dot> net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir <at> miras <dot> org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
/usr/games/fortune -es says:
It's hard to argue that God hated Oklahoma.  If He didn't, why is it so
close to Texas?

From skiselkov.ml at gmail.com  Fri May  8 17:14:01 2015
From: skiselkov.ml at gmail.com (Saso Kiselkov)
Date: Fri, 08 May 2015 19:14:01 +0200
Subject: [OmniOS-discuss] What repos do people use to build a *AMP
	server?
In-Reply-To: <CALeZrrSA0JhfeeX0wcP-YjrMU4JwJJ6-fuHvOYXJCYg22ZBBVA@mail.gmail.com>
References: <554CE8DA.6010504@gmail.com>
	<CALeZrrSA0JhfeeX0wcP-YjrMU4JwJJ6-fuHvOYXJCYg22ZBBVA@mail.gmail.com>
Message-ID: <554CEED9.4020906@gmail.com>

On 5/8/15 6:56 PM, Schweiss, Chip wrote:
> I've done really well with the OpenCSW packages on OmniOS.

Thanks, it seems to be working pretty well. Still, it's lamentable that there
are no IPS mirrors around (although, given how obnoxious IPS can be, I'm not
surprised).

Cheers,
-- 
Saso


From alka at hfg-gmuend.de  Fri May  8 20:27:33 2015
From: alka at hfg-gmuend.de (=?utf-8?Q?G=C3=BCnther_Alka?=)
Date: Fri, 8 May 2015 22:27:33 +0200
Subject: [OmniOS-discuss] What repos do people use to build a *AMP
	server?
In-Reply-To: <554CE8DA.6010504@gmail.com>
References: <554CE8DA.6010504@gmail.com>
Message-ID: <8D2EC53B-E904-402A-9E2B-974382467069@hfg-gmuend.de>

You can use the pkgin repo from SmartOS as it is the most complete source.
It is used by the amp setup script provided as a community add-on for napp-it

(you do not need napp-it to use the script)
http://napp-it.org/extensions/amp_en.html


Gea


> Am 08.05.2015 um 18:48 schrieb Saso Kiselkov <skiselkov.ml at gmail.com>:
> 
> I've decided to try and update my r151006 box to something newer, seeing
> as r151014 just came out and it's supposed to be LTS. Trouble is, I'm
> trying to build a *AMP box and I can't find any prebuilt packages for it
> in any of these repos:
> http://omnios.omniti.com/wiki.php/Packaging
> What do you guys use for getting pre-built software? Do all people here
> just roll their own?
> 
> Also, allow me to say, I *hate* consolidations and the way they lock
> accessible package versions. Where are the days when OSes used to be
> backwards-compatible?
> 
> Cheers,
> -- 
> Saso

From richard.elling at richardelling.com  Fri May  8 20:52:39 2015
From: richard.elling at richardelling.com (Richard Elling)
Date: Fri, 8 May 2015 13:52:39 -0700
Subject: [OmniOS-discuss] Writeback Cache Auto disabled
In-Reply-To: <002801d087d1$3a8454d0$af8cfe70$@xmweixun.com>
References: <001901d08708$bff6b910$3fe42b30$@xmweixun.com>
	<70165454-5855-455D-BE88-8AB444934C45@RichardElling.com>
	<002801d087d1$3a8454d0$af8cfe70$@xmweixun.com>
Message-ID: <DC18AA21-D868-4907-9BAB-5D0F227D28FF@richardelling.com>


> On May 6, 2015, at 12:49 AM, dwq at xmweixun.com wrote:
> 
> Hi Richard
>          I used "stmfadm modify-lu -p wcd=false <LU Name>" to enable the write cache, but when a client reads from or writes to the LU, the LU's Writeback Cache status changes back to Disabled.

This is correct. Initiators can override the target's default.
 -- richard
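
To see whether an initiator has flipped the setting back, the "Writeback Cache" field can be checked per LU. The following is only a minimal sketch, assuming the text being parsed looks like the LU listing quoted further down in this message (the kind of output "stmfadm list-lu -v" prints). The function names and the shortened SAMPLE text are made up for illustration.

```python
def parse_lu_properties(text):
    """Parse an LU listing into {lu_name: {field: value}}.

    Assumes one "Field : value" pair per line, with each LU starting
    at a "LU Name:" line, as in the listing quoted in this thread.
    """
    lus = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "LU Name":
            current = lus.setdefault(value, {})
        elif current is not None:
            current[key] = value
    return lus


def writeback_cache_disabled(lus):
    """Return the sorted names of LUs whose writeback cache is disabled."""
    return sorted(
        name for name, props in lus.items()
        if props.get("Writeback Cache") == "Disabled"
    )


# Shortened copy of the listing from this thread (illustration only).
SAMPLE = """\
LU Name: 600144F00000000000005548DC360005
    Operational Status: Online
    Provider Name     : sbd
    Data File         : /dev/zvol/rdsk/wxnas/hpuxtest03
    Write Protect     : Disabled
    Writeback Cache   : Disabled
    Access State      : Active
"""

if __name__ == "__main__":
    print(writeback_cache_disabled(parse_lu_properties(SAMPLE)))
```

Running the check before and after the client starts I/O would show whether the initiator changed the policy. It does not prevent the change, since the initiator is allowed to make it.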

>  
> Best Regards,
> Deng Wei Quan
> Mob: +86 13906055059
> Mail: dwq at xmweixun.com
>  
> From: dwq+auto_=dengweiquan=139.com at xmweixun.com On Behalf Of Richard Elling
> Sent: May 5, 2015 23:17
> To: dwq at xmweixun.com
> Cc: omnios-discuss at lists.omniti.com
> Subject: Re: [OmniOS-discuss] Writeback Cache Auto disabled
>  
>> On May 5, 2015, at 12:54 AM, <dwq at xmweixun.com> wrote:
>>  
>> Hi All,
>>          When I present an LU to HP-UX or AIX, the LU's writeback cache is automatically disabled. Why?
>  
> In SCSI, initiators can change the write cache policy.
>  -- richard
> 
> 
>>  
>> LU Name: 600144F00000000000005548DC360005
>>     Operational Status: Online
>>     Provider Name     : sbd
>>     Alias             : /dev/zvol/rdsk/wxnas/hpuxtest03
>>     View Entry Count  : 1
>>     Data File         : /dev/zvol/rdsk/wxnas/hpuxtest03
>>     Meta File         : not set
>>     Size              : 21474836480
>>     Block Size        : 512
>>     Management URL    : not set
>>     Vendor ID         : SUN     
>>     Product ID        : COMSTAR         
>>     Serial Num        : not set
>>     Write Protect     : Disabled
>>     Writeback Cache   : Disabled
>>     Access State      : Active
>>  
>>  
>> Thanks.
>>  
>> Version:
>> SunOS wxos1 5.11 omnios-b281e50 i86pc i386 i86pc
>> Deng
>>  
>  
> --
>  
> Richard.Elling at RichardElling.com
> +1-760-896-4422
> 
> 
>  


From richard.elling at richardelling.com  Sat May  9 00:49:52 2015
From: richard.elling at richardelling.com (Richard Elling)
Date: Fri, 8 May 2015 17:49:52 -0700
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and
	then resumes
In-Reply-To: <201505051648.t45GmpA4025308@lists-il.int.omniti.net>
References: <55487539.6030408@zunaj.si>
	<a1e7408b7a614dc4c3e96a85459bad62@miras.org>
	<CABweQmLrOgD_utd_8HV4u5KzTXOH2xTjpG4KhgiNQUt+R2_LvA@mail.gmail.com>
	<201505051648.t45GmpA4025308@lists-il.int.omniti.net>
Message-ID: <40C78E86-F32D-4588-AF98-EB9820019960@richardelling.com>


> On May 5, 2015, at 9:48 AM, Matej Zerovnik <matej at zunaj.si> wrote:
> 
> I will replace the hardware in about 4 months with all SAS drives, but I would love to have a working setup in the meantime as well ;)
> 
> I looked at the SMART stats and there don't seem to be any errors. Also, no hard/soft/transfer errors are reported by any drive. I will take a look at service times tomorrow, and maybe feed the drive stats into Graphite to look at them over a longer period.
> 
> I looked at iostat -x today, and the stats for the pool itself showed 100% busy most of the time, 98-100% wait, 500-1300 transactions in the queue, around 500 active, ... The first line, which is the average since boot, says the average service time is around 1600 ms, which seems like a lot. Can it be due to the really big queue?
> 
> Would it help to create 5 raidz pools of 10 drives each instead of one with 50 drives?

It is a bad idea to build a single raidz set with 50 drives. Very bad. Hence the zpool
man page says, "The recommended number is between 3 and 9 to help increase performance."
But this recommendation applies to reliability, too.
 -- richard
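
The usual alternative to one very wide raidz set is a single pool built from several narrower raidz vdevs. The following is only a rough sketch of how such a "zpool create" command line could be assembled. The pool name "tank" and the disk names are placeholders, and the width of 10 comes from the question in this thread (the man page range quoted above would suggest narrower sets). The command is printed, not executed.

```python
def build_zpool_create(pool, disks, vdev_width=10, level="raidz2"):
    """Build a `zpool create` argument list with several raidz vdevs.

    Each group of `vdev_width` disks becomes its own raidz vdev in
    the same pool, instead of one vdev holding every disk.
    """
    if vdev_width < 3:
        raise ValueError("vdev_width must be at least 3")
    if not disks or len(disks) % vdev_width != 0:
        raise ValueError("disk count must be a non-zero multiple of vdev_width")
    args = ["zpool", "create", pool]
    for start in range(0, len(disks), vdev_width):
        args.append(level)                      # starts a new vdev
        args.extend(disks[start:start + vdev_width])
    return args


if __name__ == "__main__":
    # Placeholder device names; real ones would look like c0t...d0.
    disks = [f"disk{i:02d}" for i in range(50)]
    print(" ".join(build_zpool_create("tank", disks, vdev_width=10)))
```

With 50 placeholder disks and a width of 10, this yields five raidz2 vdevs in one pool. Passing a smaller width that divides the disk count gives more, narrower vdevs.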


From dave-oo at pooserville.com  Sat May  9 17:38:25 2015
From: dave-oo at pooserville.com (Dave Pooser)
Date: Sat, 09 May 2015 12:38:25 -0500
Subject: [OmniOS-discuss] r151012 is coming...
In-Reply-To: <E1B54D4D-59E7-4E32-94DA-22E22DBCA015@omniti.com>
References: <E1B54D4D-59E7-4E32-94DA-22E22DBCA015@omniti.com>
Message-ID: <D173AD9F.2F0D2B%dave-lists@pooserville.com>

On 9/2/14, 1:22 PM, "Dan McDonald" <danmcd at omniti.com> wrote:

>This includes HW goodies like LSI 3008-based 12G SAS (albeit not at
>optimal performance yet)

How sub-optimal is the LSI 3008-based support currently (as in 014)? Are
we talking "faster than 6G SAS but not as fast as it should be" or "same
speed as 6G SAS" or something else? The application would be a storage
server running 24-36 hard drives as multiple RAIDz2 devices, used mostly
for archiving large video files, so ridiculous performance isn't necessary
-- mostly I'm looking at SuperMicro boards that already have the 3008
inside and want to know if I need to consider adding a better-supported
HBA instead. 
-- 
Dave Pooser
Cat-Herder-in-Chief, Pooserville.com



From danmcd at omniti.com  Sat May  9 17:45:40 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Sat, 9 May 2015 13:45:40 -0400
Subject: [OmniOS-discuss] r151012 is coming...
In-Reply-To: <D173AD9F.2F0D2B%dave-lists@pooserville.com>
References: <E1B54D4D-59E7-4E32-94DA-22E22DBCA015@omniti.com>
	<D173AD9F.2F0D2B%dave-lists@pooserville.com>
Message-ID: <AE927FD8-2354-4A82-B594-9B1DBC23F5BB@omniti.com>

I *believe* it's more than 6G, but not quite 12 yet.  I didn't have any 3008 boards in house to see, but the illumos community did.  You may be better off asking an illumos mailing list that question.

I'd go with a board that has the 3008 on it; just make sure it has the IT firmware, and the correct (not latest) version. I think 28 is the known good version.  Storage types here can confirm/deny that data point.

Dan

Sent from my iPhone (typos, autocorrect, and all)

> On May 9, 2015, at 1:38 PM, Dave Pooser <dave-oo at pooserville.com> wrote:
> 
>> On 9/2/14, 1:22 PM, "Dan McDonald" <danmcd at omniti.com> wrote:
>> 
>> This includes HW goodies like LSI 3008-based 12G SAS (albeit not at
>> optimal performance yet)
> 
> How sub-optimal is the LSI 3008-based support currently (as in 014)? Are
> we talking "faster than 6G SAS but not as fast as it should be" or "same
> speed as 6G SAS" or something else? The application would be a storage
> server running 24-36 hard drives as multiple RAIDz2 devices, used mostly
> for archiving large video files, so ridiculous performance isn't necessary
> -- mostly I'm looking at SuperMicro boards that already have the 3008
> inside and want to know if I need to consider adding a better-supported
> HBA instead. 
> -- 
> Dave Pooser
> Cat-Herder-in-Chief, Pooserville.com
> 
> 

From nagele at wildbit.com  Sat May  9 18:06:51 2015
From: nagele at wildbit.com (Chris Nagele)
Date: Sat, 9 May 2015 14:06:51 -0400
Subject: [OmniOS-discuss] High density 2.5" chassis
Message-ID: <CAHfYOdUS0fDkpsTPvUyQ2XikLP32np28i_eTvVExrTry1J8FoQ@mail.gmail.com>

Hi all. Continuing on my all SSD discussion, I am looking for some
recommendations on a new Supermicro
chassis for our file servers. So far I have been looking at this
thing:

http://www.supermicro.com/products/chassis/4U/417/SC417E16-R1400LP.cfm

Does anyone have experience with this? If so, what would you recommend
for a motherboard and HBA to support all of the disks? We've
traditionally used the X9DRD-7LN4F-JBOD or the X9DRi-F with a LSI
9211-8i HBA.

Thanks,
Chris

From chip at innovates.com  Sat May  9 19:28:36 2015
From: chip at innovates.com (Schweiss, Chip)
Date: Sat, 9 May 2015 14:28:36 -0500
Subject: [OmniOS-discuss] High density 2.5" chassis
In-Reply-To: <CAHfYOdUS0fDkpsTPvUyQ2XikLP32np28i_eTvVExrTry1J8FoQ@mail.gmail.com>
References: <CAHfYOdUS0fDkpsTPvUyQ2XikLP32np28i_eTvVExrTry1J8FoQ@mail.gmail.com>
Message-ID: <CALeZrrSqBV2Zi+LCRAZHD+zAjDGEnp_o_O4TQBD3AGAxe6YnCA@mail.gmail.com>

I have an SSD server in one of those chassis.  Here's a write-up about it
on my blog, there are 3 postings about it.

http://www.bigdatajunkie.com/index.php/9-solaris/zfs/10-short-stroking-consumer-ssds

Not necessarily a build for everyone, but it has been absolutely awesome
for our use. After a few bumps at the beginning and giving up on HA on this
server, it has been rock solid.  Many will swear against the interposers,
but combined with Samsung SSDs they have worked very well.

-Chip


On Sat, May 9, 2015 at 1:06 PM, Chris Nagele <nagele at wildbit.com> wrote:

> Hi all. Continuing on my all SSD discussion, I am looking for some
> recommendations on a new Supermicro
> chassis for our file servers. So far I have been looking at this
> thing:
>
> http://www.supermicro.com/products/chassis/4U/417/SC417E16-R1400LP.cfm
>
> Does anyone have experience with this? If so, what would you recommend
> for a motherboard and HBA to support all of the disks? We've
> traditionally used the X9DRD-7LN4F-JBOD or the X9DRi-F with a LSI
> 9211-8i HBA.
>
> Thanks,
> Chris

From danmcd at omniti.com  Mon May 11 15:48:24 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 11 May 2015 11:48:24 -0400
Subject: [OmniOS-discuss] KVM Performance Update
Message-ID: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com>

I first want to apologize for not recognizing the cause of KVM performance problems (which were DROPPED PACKETS) much sooner.  Until recently, our KVM deployments in house have been either on r151006, or nothing else.  I've added an OI KVM box to our r151014 build machine, to make sure I have a platform to attempt replications.

What happened was that upstream illumos KVM (from Joyent) had a platform flag day during r151012's development --> the VND code. Joyent's illumos child has Virtual Networking Devices (VND) that allow KVM instances to not depend on an actual NIC's Promiscuous Mode to receive packets.  They updated their illumos, and subsequently their KVM.  Remember that "KVM" has two parts:  The kernel KVM driver (from Joyent's illumos-kvm repo), and the "KVM-cmd", which is QEMU (from Joyent's illumos-kvm-cmd repo).

Other distros do not have VND currently (the illumos community is attempting to fix this, and Joyent is leading here, modulo their own day jobs).  The compilation of illumos-kvm-cmd's latest revisions (the QEMU bits) fails without having VND around. We reset illumos-kvm-cmd to the pre-VND revision, but did NOT reset illumos-kvm bits to pre-VND.  Since the world compiled and ran in this split state, I moved forward.  The PROBLEM was that the amount of internal buffering for promiscuous devices is low, and while VND fixes the problem by reducing the use of promiscuous mode, non-VND illumos (like OmniOS) still needs to increase limits.  The up-to-date kernel side eliminated the method for increasing these buffering limits, causing MUCH higher packet drop rates.

Quoting Joyent's Robert Mustacchi:

> By default the stream high watermark for the promisc mode is quite low.
> And for some reason, that I don't recall, there was no great way to do
> that ourselves from user land (could be wrong entirely). As a result, if
> you don't set it, we're basically going to start dropping mblk_t's
> queued on the stream.
> 
> Basically without vnd, you need both of those. With vnd, then you can
> get rid of it in both QEMU and KVM.

Tobi Oetiker (who deserves a ton of credit for calling this problem out, AND determining it was packet drops) helped me test two solutions to the problem:

1.) Revert illumos-kvm to the pre-VND level as well.

2.) Keep up to date with illumos-kvm and illumos-kvm-cmd, but explicitly revert the VND changes in BOTH.

I'm strongly leaning toward committing solution #2. Regardless of which, I will be issuing an update for r151014 later this week that will push KVM performance back to its pre-VND-bump levels.

GOING FORWARD, once VND is upstreamed into illumos-gate, I can eliminate the VND backouts (or just catch up the built repos if I use option #1 above).

Thank you all for your patience, and again, sorry for not addressing this sooner.

Dan McDonald -- OmniOS Engineering



From nagele at wildbit.com  Mon May 11 16:24:33 2015
From: nagele at wildbit.com (Chris Nagele)
Date: Mon, 11 May 2015 12:24:33 -0400
Subject: [OmniOS-discuss] High density 2.5" chassis
In-Reply-To: <CALeZrrSqBV2Zi+LCRAZHD+zAjDGEnp_o_O4TQBD3AGAxe6YnCA@mail.gmail.com>
References: <CAHfYOdUS0fDkpsTPvUyQ2XikLP32np28i_eTvVExrTry1J8FoQ@mail.gmail.com>
	<CALeZrrSqBV2Zi+LCRAZHD+zAjDGEnp_o_O4TQBD3AGAxe6YnCA@mail.gmail.com>
Message-ID: <CAHfYOdWmp+MiWjUxncy60oYeCozQK8bjhKDatnO4zngWjpa6jQ@mail.gmail.com>

Thanks Chip. That's a great write-up. I've definitely heard a lot of
negative things about interposers, but we've been using them for
years as well. I'm not saying it is fine, that's just my experience.

If we didn't use interposers how else would it work with that many drives?

Chris

Chris Nagele
Co-founder, Wildbit
Beanstalk, Postmark, dploy.io


On Sat, May 9, 2015 at 3:28 PM, Schweiss, Chip <chip at innovates.com> wrote:
> I have an SSD server in one of those chassis.  Here's a write-up about it on
> my blog, there are 3 postings about it.
>
> http://www.bigdatajunkie.com/index.php/9-solaris/zfs/10-short-stroking-consumer-ssds
>
> Not necessarily a build for everyone, but it has been absolutely awesome for
> our use. After a few bumps at the beginning and giving up on HA on this
> server, it has been rock solid.  Many will swear against the interposers,
> but combined with Samsung SSDs they have worked very well.
>
> -Chip
>
>
> On Sat, May 9, 2015 at 1:06 PM, Chris Nagele <nagele at wildbit.com> wrote:
>>
>> Hi all. Continuing on my all SSD discussion, I am looking for some
>> recommendations on a new Supermicro
>> chassis for our file servers. So far I have been looking at this
>> thing:
>>
>> http://www.supermicro.com/products/chassis/4U/417/SC417E16-R1400LP.cfm
>>
>> Does anyone have experience with this? If so, what would you recommend
>> for a motherboard and HBA to support all of the disks? We've
>> traditionally used the X9DRD-7LN4F-JBOD or the X9DRi-F with a LSI
>> 9211-8i HBA.
>>
>> Thanks,
>> Chris
>
>

From john.barfield at bissinc.com  Mon May 11 17:15:59 2015
From: john.barfield at bissinc.com (John Barfield)
Date: Mon, 11 May 2015 17:15:59 +0000
Subject: [OmniOS-discuss] KVM Performance Update
In-Reply-To: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com>
References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com>
Message-ID: <4379D4FD-F566-4A07-AABA-5A7355635B20@bissinc.com>

This is great news! Thank you. 




John Barfield / Sr Principal Engineer
+1 (214) 425-0783/ john.barfield at bissinc.com
BISS, Inc. Office: +1 (214) 506-8354 

4925 Greenville Ave Suite 900
Dallas, TX 75206
support.bissinc.com <http://htmlsig.com/support.bissinc.com>


From matej at zunaj.si  Tue May 12 05:13:35 2015
From: matej at zunaj.si (Matej Zerovnik)
Date: Tue, 12 May 2015 07:13:35 +0200
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and
 then resumes
In-Reply-To: <40C78E86-F32D-4588-AF98-EB9820019960@richardelling.com>
References: <55487539.6030408@zunaj.si>
	<a1e7408b7a614dc4c3e96a85459bad62@miras.org>
	<CABweQmLrOgD_utd_8HV4u5KzTXOH2xTjpG4KhgiNQUt+R2_LvA@mail.gmail.com>
	<201505051648.t45GmpA4025308@lists-il.int.omniti.net>
	<40C78E86-F32D-4588-AF98-EB9820019960@richardelling.com>
Message-ID: <55518BFF.6080608@zunaj.si>

I know building a single 50-drive RAIDZ2 is a bad idea. As I said, it's 
a legacy setup that I can't easily change. I already have a backup pool with 
7x10-drive RAIDZ2, to which I hope to be able to switch this week. I 
hope to get better results and less crashing...

What is interesting is that when the 'event' happens, the server works 
normally and ZFS is accessible and writable (at least, there are no errors in 
the log files); only iSCSI reports errors and drops the connection. Another 
interesting thing is that after the 'event', all writes stop and only reads 
continue for another 30 minutes. After that, all traffic stops for half an 
hour. Then everything starts coming back up... Weird?!

Matej

On 09. 05. 2015 02:49, Richard Elling wrote:
>
>> On May 5, 2015, at 9:48 AM, Matej Zerovnik <matej at zunaj.si> wrote:
>>
>> I will replace the hardware in about 4 months with all SAS drives, 
>> but I would love to have a working setup in the meantime as well ;)
>>
>> I looked at the SMART stats and there don't seem to be any errors. Also, 
>> no hard/soft/transfer errors are reported by any drive. I will take a look 
>> at service times tomorrow, and maybe feed the drive stats into Graphite 
>> to look at them over a longer period.
>>
>> I looked at iostat -x today, and the stats for the pool itself showed 
>> 100% busy most of the time, 98-100% wait, 500-1300 transactions in the 
>> queue, around 500 active, ... The first line, which is the average since 
>> boot, says the average service time is around 1600 ms, which seems 
>> like a lot. Can it be due to the really big queue?
>>
>> Would it help to create 5 raidz pools of 10 drives each instead of one 
>> with 50 drives?
>
> It is a bad idea to build a single raidz set with 50 drives. Very bad. 
> Hence the zpool
> man page says, "The recommended number is between 3 and 9 to help 
> increase performance."
> But this recommendation applies to reliability, too.
>  -- richard
>


From paladinemishakal at gmail.com  Tue May 12 10:23:34 2015
From: paladinemishakal at gmail.com (Lawrence Giam)
Date: Tue, 12 May 2015 18:23:34 +0800
Subject: [OmniOS-discuss] Debugging crash dump
Message-ID: <CAGueQCdFEVwWC2Wmq=9U=sOvLXUsXdyZaSr5vg85_E_+3iK9zg@mail.gmail.com>

Hi All,

A few times now, the server has panicked and automatically rebooted, leaving a
crash dump. I am looking at this page
http://wiki.illumos.org/display/illumos/How+To+Report+Problems but it seems
the information is out of date.

When I run this:
echo '::panicinfo\n::cpuinfo -v\n::threadlist -v
10\n::msgbuf\n*panic_thread::findstack -v\n::stacks' | mdb 5 > ~/crash.5

root at sgbk02:/var/crash/unknown# echo '::panicinfo\n::cpuinfo
-v\n::threadlist -v 10\n::msgbuf\n*panic_thread::findstack -v\n::stacks' |
mdb 5 > crash.5
mdb: warning: dump is from SunOS 5.11 omnios-8c08411; dcmds and macros may
not match kernel implementation
mdb: failed to read .symtab header for 'unix', id=0: no mapping for address
mdb: failed to read .symtab header for 'genunix', id=1: no mapping for
address
mdb: failed to read modctl at ffffff113ba4cf08: no mapping for address
mdb: invalid command '::panicinfo': unknown dcmd name
mdb: invalid command '::cpuinfo': unknown dcmd name
mdb: invalid command '::threadlist': unknown dcmd name
mdb: invalid command '::msgbuf': unknown dcmd name
mdb: invalid command '::findstack': unknown dcmd name
mdb: invalid command '::stacks': unknown dcmd name

Can someone update the wiki on how to get the kernel messages and stack
information?

Thanks & Regards,
Lawrence.

From paladinemishakal at gmail.com  Tue May 12 10:33:42 2015
From: paladinemishakal at gmail.com (Lawrence Giam)
Date: Tue, 12 May 2015 18:33:42 +0800
Subject: [OmniOS-discuss] Help with debugging crash dump
Message-ID: <CAGueQCeosdA5=gAyoorN+jb-yp9=SmbVQX75z4U8rLdjg5HKow@mail.gmail.com>

Hi All,

I have tried to analyse the crash dump and the following is what I get:

root at sgbk02:/var/crash/unknown# mdb -k unix.3 vmcore.3
mdb: warning: dump is from SunOS 5.11 omnios-8c08411; dcmds and macros may
not match kernel implementation
mdb: failed to read .symtab header for 'unix', id=0: no mapping for address
mdb: failed to read .symtab header for 'genunix', id=1: no mapping for
address
mdb: failed to read modctl at ffffff113ba4cf08: no mapping for address
> ::stack
> ::showrev
Hostname: sgsan3
Release: 5.11
Kernel architecture: i86pc
Application architecture: amd64
Kernel version: SunOS 5.11 i86pc omnios-8c08411
Platform: i86pc
> ::status
debugging crash dump vmcore.3 (64-bit) from sgsan3
operating system: 5.11 omnios-8c08411 (i86pc)
image uuid: 299f9dfc-c835-4319-b0cb-d5c0b0c5841e
panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff007d2cd450
addr=ffffff14a0c700c8
dump content: kernel pages only
> $r
%rax = 0x0000000000000000                 %r9  = 0xffffff14a0c6fea8
%rbx = 0xffffff14a0c6f990                 %r10 = 0x6636314141416574
%rcx = 0xffffff116bc9da00                 %r11 = 0xffffff007d2cd530
%rdx = 0xffffff14a0c6fdb8                 %r12 = 0x0000000000000016
%rsi = 0x0000000000000000                 %r13 = 0xffffff11b84409b8
%rdi = 0xffffff113b62de00                 %r14 = 0x0000000000000002
%r8  = 0xffffff11b84409b8                 %r15 = 0xffffff14a0c6fea8

%rip = 0xfffffffff82a22b8 smb_fsop_lookup+0x118
%rbp = 0xffffff007d2cd6b0
%rsp = 0xffffff007d2cd540
%rflags = 0x00010286
  id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0
  status=<of,df,IF,tf,SF,zf,af,PF,cf>

                        %cs = 0x0030    %ds = 0x004b    %es = 0x004b
%trapno = 0xe           %fs = 0x0000    %gs = 0x01c3
   %err = 0x0
> 0xfffffffff82a22b8::dis
mdb: failed to read instruction at 0xfffffffff82a22b8: no mapping for
address
> 0xfffffffff82a22b8::dump
                    0 1 2 3  4 5 6 7 \/ 9 a b  c d e f  01234567v9abcdef
mdb: failed to read data at 0xfffffffff82a22b8: no mapping for address
> ::quit

Looking at this, it appears to be an SMB issue (%rip points into
smb_fsop_lookup). Can someone show me how to get more info from the crash dump?

Thanks & Regards.

From danmcd at omniti.com  Tue May 12 13:02:28 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 12 May 2015 09:02:28 -0400
Subject: [OmniOS-discuss] Debugging crash dump
In-Reply-To: <CAGueQCdFEVwWC2Wmq=9U=sOvLXUsXdyZaSr5vg85_E_+3iK9zg@mail.gmail.com>
References: <CAGueQCdFEVwWC2Wmq=9U=sOvLXUsXdyZaSr5vg85_E_+3iK9zg@mail.gmail.com>
Message-ID: <BF526A3F-AF08-40CA-8A40-80855AE619E1@omniti.com>


> On May 12, 2015, at 6:23 AM, Lawrence Giam <paladinemishakal at gmail.com> wrote:
> 
> Hi All,
> 
> I have a few time the server panic and auto-rebooted with crash dump. I am looking at this post http://wiki.illumos.org/display/illumos/How+To+Report+Problems but it seem the info is not updated.
> 
> When I run this:
> echo '::panicinfo\n::cpuinfo -v\n::threadlist -v 10\n::msgbuf\n*panic_thread::findstack -v\n::stacks' | mdb 5 > ~/crash.5
> 
> root at sgbk02:/var/crash/unknown# echo '::panicinfo\n::cpuinfo -v\n::threadlist -v 10\n::msgbuf\n*panic_thread::findstack -v\n::stacks' | mdb 5 > crash.5
> mdb: warning: dump is from SunOS 5.11 omnios-8c08411; dcmds and macros may not match kernel implementation
<SNIP!>

8c08411 is OmniOS r151010 -- that's where your dump is from.  Are you running the analysis on a later-release machine?  It looks like that's the case, and there have been mdb changes between 1-2 stable releases that would make reading 010 dumps difficult.

"$c" or "$C" show you the kernel stack, and "::msgbuf" shows you the in-kernel-memory dmesg(1M) output.

Dan
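
For collecting that output into a file, something like the following could work. It is only a sketch: it feeds the dcmds mentioned above to mdb one per line, using the "mdb -k unix.N vmcore.N" invocation seen elsewhere on this list. The helper names, the dump number 3, and the crash directory default are assumptions for illustration, and the analysis should be run on a machine whose release matches the dump.

```python
import os
import shutil
import subprocess


def build_mdb_script(dcmds):
    """Join dcmds into mdb input, one command per line."""
    return "".join(d.strip() + "\n" for d in dcmds if d.strip())


def run_mdb(dump_number, dcmds, crash_dir="/var/crash/unknown"):
    """Run the dcmds against unix.N/vmcore.N and return mdb's stdout.

    Returns None when mdb or the crash directory is not available,
    so the sketch is harmless on machines without a dump.
    """
    mdb = shutil.which("mdb")
    if mdb is None or not os.path.isdir(crash_dir):
        return None
    cmd = [mdb, "-k", f"unix.{dump_number}", f"vmcore.{dump_number}"]
    result = subprocess.run(
        cmd,
        input=build_mdb_script(dcmds),
        capture_output=True,
        text=True,
        cwd=crash_dir,
    )
    return result.stdout


if __name__ == "__main__":
    # ::status, $C and ::msgbuf are the commands discussed in this thread.
    output = run_mdb(3, ["::status", "$C", "::msgbuf"])
    print(output if output is not None else "mdb or crash directory not found")
```

Writing one dcmd per line avoids relying on how a particular shell's echo treats "\n" inside quotes.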


From johan.kragsterman at capvert.se  Tue May 12 17:08:19 2015
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Tue, 12 May 2015 19:08:19 +0200
Subject: [OmniOS-discuss] opendj in a zone
Message-ID: <OF32CC3463.3A395AB4-ONC1257E43.005D0D65-C1257E43.005E2574@inse.com>


Hi!


Right now I'm trying to do some things that are not really within my area of knowledge.

So I need to bother you guys a little bit for advice...

I'm setting up OpenDJ from ForgeRock in a zone (r151014). OpenDJ is a directory server with Sun heritage, and it is Java-based.

I seem to have managed to get the server up and running, but I can't reach the management console; I get some Java errors.

So if any of you have any input on this, please let me know... (I'm a complete noob with Java):





root at z1:/etc/opendj/bin# ./control-panel
Could not launch Control Panel.  Check that you have access to the display.
Check file /var/tmp/opendj-control-panel-1827401860598694601.log for details.
root at z1:/etc/opendj/bin# cat /var/tmp/opendj-control-panel-1827401860598694601.log
May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.util.ControlPanelLog initLogFileHandler
INFO: Application launched May 12, 2015 4:52:28 PM UTC
May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.ControlPanelLauncher$1 run
WARNING: Error setting look and feel: java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit
java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit
        at java.awt.Toolkit$2.run(Toolkit.java:876)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.awt.Toolkit.getDefaultToolkit(Toolkit.java:861)
        at java.awt.Toolkit.getEventQueue(Toolkit.java:1752)
        at java.awt.EventQueue.isDispatchThread(EventQueue.java:1018)
        at javax.swing.SwingUtilities.isEventDispatchThread(SwingUtilities.java:1360)
        at org.opends.quicksetup.ui.UIFactory.initializeLookAndFeel(UIFactory.java:722)
        at org.opends.guitools.controlpanel.ControlPanelLauncher.initLookAndFeel(ControlPanelLauncher.java:240)
        at org.opends.guitools.controlpanel.ControlPanelLauncher.access$000(ControlPanelLauncher.java:61)
        at org.opends.guitools.controlpanel.ControlPanelLauncher$1.run(ControlPanelLauncher.java:178)
        at java.lang.Thread.run(Thread.java:745)

May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.ControlPanelLauncher$1 run
WARNING: Error launching GUI: java.awt.HeadlessException
May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.ControlPanelLauncher$1 run
WARNING: java.awt.GraphicsEnvironment.checkHeadless(GraphicsEnvironment.java:207)
java.awt.Window.<init>(Window.java:535)
java.awt.Frame.<init>(Frame.java:420)
java.awt.Frame.<init>(Frame.java:385)
org.opends.quicksetup.SplashScreen.<init>(SplashScreen.java:104)
org.opends.guitools.controlpanel.ControlPanelSplashScreen.<init>(ControlPanelLauncher.java:300)
org.opends.guitools.controlpanel.ControlPanelSplashScreen.main(ControlPanelLauncher.java:316)
org.opends.guitools.controlpanel.ControlPanelLauncher$1.run(ControlPanelLauncher.java:185)
java.lang.Thread.run(Thread.java:745)

root at z1:/etc/opendj/bin# 



There are definitely no port access problems; the ports in use are 389 and 4444:


root at z1:/etc/opendj/bin# netstat -an

UDP: IPv4
   Local Address        Remote Address      State
-------------------- -------------------- ----------
      *.111                               Idle
      *.*                                 Unbound
      *.37411                             Idle
      *.111                               Idle
      *.*                                 Unbound
      *.50848                             Idle

UDP: IPv6
   Local Address                     Remote Address                   State      If
--------------------------------- --------------------------------- ---------- -----
      *.111                                                         Idle       
      *.*                                                           Unbound    
      *.37411                                                       Idle       

TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q    State
-------------------- -------------------- ----- ------ ----- ------ -----------
      *.111                *.*                0      0 128000      0 LISTEN
      *.*                  *.*                0      0 128000      0 IDLE
      *.111                *.*                0      0 128000      0 LISTEN
      *.*                  *.*                0      0 128000      0 IDLE
      *.22                 *.*                0      0 128000      0 LISTEN
      *.54631              *.*                0      0 128000      0 LISTEN
      *.58404              *.*                0      0 128000      0 LISTEN
      *.4444               *.*                0      0 128000      0 LISTEN
      *.389                *.*                0      0 128000      0 LISTEN

TCP: IPv6
   Local Address                     Remote Address                 Swind Send-Q Rwind Recv-Q   State      If
--------------------------------- --------------------------------- ----- ------ ----- ------ ----------- -----
      *.111                             *.*                             0      0 128000      0 LISTEN      
      *.*                               *.*                             0      0 128000      0 IDLE        
      *.22                              *.*                             0      0 128000      0 LISTEN      
      *.58404                           *.*                             0      0 128000      0 LISTEN      
      *.4444                            *.*                             0      0 128000      0 LISTEN      
      *.389                             *.*                             0      0 128000      0 LISTEN      

Active UNIX domain sockets
Address  Type          Vnode     Conn  Local Addr      Remote Addr
ffffff03f2f99048 stream-ord 0000000 0000000                               
ffffff03f2f99b58 stream-ord 0000000 0000000                               
ffffff03f282fb48 stream-ord ffffff03f27b4580 0000000 /var/run/.inetd.uds                
root at z1:/etc/opendj/bin# 






Best regards from/Med vänliga hälsningar från

Johan Kragsterman

Capvert


From danmcd at omniti.com  Tue May 12 17:38:04 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 12 May 2015 13:38:04 -0400
Subject: [OmniOS-discuss] opendj in a zone
In-Reply-To: <OF32CC3463.3A395AB4-ONC1257E43.005D0D65-C1257E43.005E2574@inse.com>
References: <OF32CC3463.3A395AB4-ONC1257E43.005D0D65-C1257E43.005E2574@inse.com>
Message-ID: <AD6D8C33-750B-4077-8436-428791D58FBB@omniti.com>


> On May 12, 2015, at 1:08 PM, Johan Kragsterman <johan.kragsterman at capvert.se> wrote:
> 
> So if anyone of you got any input on this, pls let me know....(I'm a complete nooob with java...):

I'm no Java wizard, so take this with a grain of salt, but...

> root at z1:/etc/opendj/bin# ./control-panel
> Could not launch Control Panel.  Check that you have access to the display.
> Check file /var/tmp/opendj-control-panel-1827401860598694601.log for details.
> root at z1:/etc/opendj/bin# cat /var/tmp/opendj-control-panel-1827401860598694601.log
> May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.util.ControlPanelLog initLogFileHandler
> INFO: Application launched May 12, 2015 4:52:28 PM UTC
> May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.ControlPanelLauncher$1 run
> WARNING: Error setting look and feel: java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit
> java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit

From that last quoted line, it looks like you'll need X11 libraries, possibly X11 *JAVA* libraries as well.

We don't supply X11 at all in the "omnios" publisher.  I'd suggest using pkgsrc and installing X11 libraries to help you out.

I'm sure there are others on the list with experience installing X11 libraries on OmniOS, possibly even to help out Java apps.

Dan


From johan.kragsterman at capvert.se  Tue May 12 18:55:13 2015
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Tue, 12 May 2015 20:55:13 +0200
Subject: [OmniOS-discuss] Ang: Re:  opendj in a zone
In-Reply-To: <AB40E24C-CBBD-4580-A605-BB593EA4CD41@holyarmy.org>
References: <AB40E24C-CBBD-4580-A605-BB593EA4CD41@holyarmy.org>,
	<OF32CC3463.3A395AB4-ONC1257E43.005D0D65-C1257E43.005E2574@inse.com>
	<AD6D8C33-750B-4077-8436-428791D58FBB@omniti.com>
Message-ID: <OF96A39384.3E0164BF-ONC1257E43.0066D47F-C1257E43.0067EEE9@inse.com>


Hi!


-----Benjamin Sherman <benjamin at holyarmy.org> wrote: -----
To: Dan McDonald <danmcd at omniti.com>
From: Benjamin Sherman <benjamin at holyarmy.org>
Date: 2015-05-12 20:33
Cc: Johan Kragsterman <johan.kragsterman at capvert.se>, omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] opendj in a zone

Johan,

I use OpenDJ and I've run it on both Linux and OmniOS.

The simplest solution is not to run the control panel app on the OmniOS server at all.

1) Download/copy the OpenDJ package, or even a tarball of your installation from the OmniOS machine, to a "desktop" machine: anything where you have Java and X11 (or a Mac or Windows machine).
2) From that machine, run the control-panel.
3) When it starts, you'll need to provide IP, port, and credentials to make it talk to the OpenDJ daemon process on OmniOS machine, but it should work just fine otherwise.

As Dan suggested, you can also install the X11 libs on OmniOS, but you'd still need a local X11 server for the remote control-panel process to use for user interaction.






Ah, I see! Good, thanks!

OK, I'm not trying to run it from the omnios zone, but from a linux LTSP fat client, so I would need to rebuild the chroot with opendj inside. But that's not a problem, I can do that. Or I can put the software on the LTSP server and run X over ssh...

Question is, do I really need the control panel? Can I live without it? I guess I can administrate the server with other tools?

May I ask what your use case is? Do you also use other Forgerock software, like OpenAM and OpenIDM? Do you use any of the REST2LDAP tools?


Regards Johan







-Benjamin

> On May 12, 2015, at 10:38 AM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> 
>> On May 12, 2015, at 1:08 PM, Johan Kragsterman <johan.kragsterman at capvert.se> wrote:
>> 
>> So if anyone of you got any input on this, pls let me know....(I'm a complete nooob with java...):
> 
> I'm no Java wizard, so take this with a grain of salt, but...
> 
>> root at z1:/etc/opendj/bin# ./control-panel
>> Could not launch Control Panel.  Check that you have access to the display.
>> Check file /var/tmp/opendj-control-panel-1827401860598694601.log for details.
>> root at z1:/etc/opendj/bin# cat /var/tmp/opendj-control-panel-1827401860598694601.log
>> May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.util.ControlPanelLog initLogFileHandler
>> INFO: Application launched May 12, 2015 4:52:28 PM UTC
>> May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.ControlPanelLauncher$1 run
>> WARNING: Error setting look and feel: java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit
>> java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit
> 
> From that last quoted line, it looks like you'll need X11 libraries, possibly X11 *JAVA* libraries as well.
> 
> We don't supply X11 at all in the "omnios" publisher.  I'd suggest using pkgsrc and installing X11 libraries to help you out.
> 
> I'm sure there are others on the list with experience installing X11 libraries on OmniOS, possibly even to help out Java apps.
> 
> Dan
> 
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss





From danmcd at omniti.com  Tue May 12 18:59:02 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 12 May 2015 14:59:02 -0400
Subject: [OmniOS-discuss] KVM Performance Update
In-Reply-To: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com>
References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com>
Message-ID: <90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com>


> On May 11, 2015, at 11:48 AM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> 
> 1.) Revert illumos-kvm to the pre-VND level as well.
> 
> 2.) Keep up to date with illumos-kvm and illumos-kvm-cmd, but explicitly revert the VND changes in BOTH.
> 
> I'm strongly leaning toward committing solution #2. Regardless of which, I will be issuing an update for r151014 later this week that will push KVM performance back to its pre-VND-bump levels.

I chose option #2:

	https://github.com/omniti-labs/omnios-build/commit/0268a2ff04b1cbed2324054cb97a0f36c58989b0

There's now an update for r151014 that has the updated system/kvm (qemu/userland) and driver/virtualization/kvm (kernel KVM driver) on the repo server.  A "pkg update" will update your packages AND boot archive without.  I do recommend, however, you power down your KVM instances and "pkill qemu" prior to running the update.
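
A minimal sketch of that sequence, for anyone scripting it. The run and update_kvm helpers are made up for illustration, and the script only prints the commands unless DRY_RUN=0 is set. It assumes the guests have already been shut down cleanly, so pkill only catches leftover qemu processes.

```shell
# Hypothetical wrapper: print commands by default, execute only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

update_kvm() {
    # Stop any remaining qemu processes before the packages are replaced.
    run pkill qemu
    # Pull the updated system/kvm and driver/virtualization/kvm packages.
    run pkg update
}

update_kvm
```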

Along with this update is a small fix to onu(1) for illumos developers who are working with r151014 as their base system for ONU-ing.

Thank you all again for your patience,
Dan


From benjamin at holyarmy.org  Tue May 12 18:33:36 2015
From: benjamin at holyarmy.org (Benjamin Sherman)
Date: Tue, 12 May 2015 11:33:36 -0700
Subject: [OmniOS-discuss] opendj in a zone
In-Reply-To: <AD6D8C33-750B-4077-8436-428791D58FBB@omniti.com>
References: <OF32CC3463.3A395AB4-ONC1257E43.005D0D65-C1257E43.005E2574@inse.com>
	<AD6D8C33-750B-4077-8436-428791D58FBB@omniti.com>
Message-ID: <AB40E24C-CBBD-4580-A605-BB593EA4CD41@holyarmy.org>

Johan,

I use OpenDJ and I've run it on both Linux and OmniOS.

The simplest solution is not to run the control panel app on the OmniOS server at all.

1) Download/copy the OpenDJ package, or even a tarball of your installation from the OmniOS machine, to a "desktop" machine: anything where you have Java and X11 (or a Mac or Windows machine).
2) From that machine, run the control-panel.
3) When it starts, you'll need to provide IP, port, and credentials to make it talk to the OpenDJ daemon process on OmniOS machine, but it should work just fine otherwise.

As Dan suggested, you can also install the X11 libs on OmniOS, but you'd still need a local X11 server for the remote control-panel process to use for user interaction.
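
A small pre-flight check along those lines, as a sketch only: the control panel is a Swing application, so it needs a display to talk to in addition to the X11 toolkit libraries. The helper names and the message text are illustrative and not part of OpenDJ; the only real command is ./control-panel from the OpenDJ bin directory.

```shell
# Hypothetical helper: succeed only when an X display is configured.
have_display() {
    [ -n "${DISPLAY:-}" ]
}

# Hypothetical launcher: start the GUI only when a display is available.
launch_control_panel() {
    if have_display; then
        ./control-panel
    else
        echo "DISPLAY is not set; run control-panel on a desktop machine or over 'ssh -X'" >&2
        return 1
    fi
}
```

Note that a set DISPLAY does not help if the JRE on the server lacks the X11 toolkit, which is what the "Toolkit not found" line in the log points at.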


-Benjamin

> On May 12, 2015, at 10:38 AM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> 
>> On May 12, 2015, at 1:08 PM, Johan Kragsterman <johan.kragsterman at capvert.se> wrote:
>> 
>> So if anyone of you got any input on this, pls let me know....(I'm a complete nooob with java...):
> 
> I'm no Java wizard, so take this with a grain of salt, but...
> 
>> root at z1:/etc/opendj/bin# ./control-panel
>> Could not launch Control Panel.  Check that you have access to the display.
>> Check file /var/tmp/opendj-control-panel-1827401860598694601.log for details.
>> root at z1:/etc/opendj/bin# cat /var/tmp/opendj-control-panel-1827401860598694601.log
>> May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.util.ControlPanelLog initLogFileHandler
>> INFO: Application launched May 12, 2015 4:52:28 PM UTC
>> May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.ControlPanelLauncher$1 run
>> WARNING: Error setting look and feel: java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit
>> java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit
> 
> From that last quoted line, it looks like you'll need X11 libraries, possibly X11 *JAVA* libraries as well.
> 
> We don't supply X11 at all in the "omnios" publisher.  I'd suggest using pkgsrc and installing X11 libraries to help you out.
> 
> I'm sure there are others on the list with experience installing X11 libraries on OmniOS, possibly even to help out Java apps.
> 
> Dan


From omnios at citrus-it.net  Tue May 12 20:32:46 2015
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Tue, 12 May 2015 20:32:46 +0000 (UTC)
Subject: [OmniOS-discuss] KVM Performance Update
In-Reply-To: <90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com>
References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com>
	<90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com>
Message-ID: <alpine.GSO.2.00.1505122032130.15788@erncre.pvgehf-vg.arg>


Thanks Dan, and just before I was about to migrate my kvms across to
r151014 too!

Andy

On Tue, 12 May 2015, Dan McDonald wrote:

;
; > On May 11, 2015, at 11:48 AM, Dan McDonald <danmcd at omniti.com> wrote:
; >
; >
; > 1.) Revert illumos-kvm to the pre-VND level as well.
; >
; > 2.) Keep up to date with illumos-kvm and illumos-kvm-cmd, but explicitly revert the VND changes in BOTH.
; >
; > I'm strongly leaning toward committing solution #2. Regardless of which, I will be issuing an update for r151014 later this week that will push KVM performance back to its pre-VND-bump levels.
;
; I chose option #2:
;
; 	https://github.com/omniti-labs/omnios-build/commit/0268a2ff04b1cbed2324054cb97a0f36c58989b0
;
; There's now an update for r151014 that has the updated system/kvm (qemu/userland) and driver/virtualization/kvm (kernel KVM driver) on the repo server.  A "pkg update" will update your packages AND boot archive without.  I do recommend, however, you power down your KVM instances and "pkill qemu" prior to running the update.
;
; Along with this update is a small fix to onu(1) for illumos developers who are working with r151014 as their base system for ONU-ing.
;
; Thank you all again for your patience,
; Dan
-- 
Citrus IT Limited | +44 (0)870 199 8000 | enquiries at citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123


From hasslerd at gmx.li  Wed May 13 09:02:57 2015
From: hasslerd at gmx.li (Dominik Hassler)
Date: Wed, 13 May 2015 11:02:57 +0200
Subject: [OmniOS-discuss] ping rtt for KVM in zone
Message-ID: <trinity-608871ef-6417-4208-b739-64c2a9081949-1431507776934@3capp-gmx-bs43>

Hi,

I am running my KVMs in individual zones and seeing an increased ping rtt by a factor of approx. 7 compared to ping rtt when running the same KVM inside the GZ (cf. attached smokeping chart).

This affects *only* virtio NICs, not e1000 NICs. For e1000 NICs the ping rtt remains the same, no matter whether the KVM runs in the GZ or an NGZ.

Dan's 'KVM Performance Update' did resolve the throughput issue, but not the strange ping behaviour I am seeing.

Any ideas why it only affects virtio nics and when the KVM is in a zone? Any ideas how to improve it?
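
For anyone who wants to reproduce the comparison, a minimal sketch of how the factor can be computed from two ping summaries. It assumes the summary line contains "min/avg/max" followed by slash-separated values, which is what both illumos "ping -s" and Linux ping print as far as I know. The helper names are hypothetical and the host name in the comments is a placeholder.

```shell
# Hypothetical helper: read ping output on stdin, print the average RTT in ms.
# The summary line looks like "... min/avg/max/stddev = a/b/c/d", so when the
# line is split on "/" the average is the fifth field.
avg_rtt() {
    awk -F/ '/min\/avg\/max/ { print $5 }'
}

# Hypothetical helper: ratio of two averages (e.g. NGZ over GZ), one decimal.
rtt_factor() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Usage on illumos (host name is a placeholder; syntax is ping -s host size count):
#   ngz=$(ping -s kvm-guest 56 10 | avg_rtt)    # with the KVM in a zone
#   gz=$(ping -s kvm-guest 56 10 | avg_rtt)     # with the same KVM in the GZ
#   rtt_factor "$ngz" "$gz"
```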
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kvm_virtio_zone.png
Type: image/png
Size: 34510 bytes
Desc: not available
URL: <https://omniosce.org/ml-archive/attachments/20150513/e0562479/attachment-0001.png>

From matthew.lagoe at subrigo.net  Wed May 13 09:11:02 2015
From: matthew.lagoe at subrigo.net (Matthew Lagoe)
Date: Wed, 13 May 2015 02:11:02 -0700
Subject: [OmniOS-discuss] ping rtt for KVM in zone
In-Reply-To: <trinity-608871ef-6417-4208-b739-64c2a9081949-1431507776934@3capp-gmx-bs43>
References: <trinity-608871ef-6417-4208-b739-64c2a9081949-1431507776934@3capp-gmx-bs43>
Message-ID: <003d01d08d5c$bc814340$3583c9c0$@subrigo.net>

Some NICs don't handle the virtio stuff very well (Myricom, I'm looking at you), so that could be part of the problem.

Intel is typically pretty good about it, however, so the e1000s working doesn't surprise me.

Which NICs specifically show the extra delay?

-----Original Message-----
From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Dominik Hassler
Sent: Wednesday, May 13, 2015 02:03 AM
To: omnios-discuss at lists.omniti.com
Subject: [OmniOS-discuss] ping rtt for KVM in zone

Hi,

I am running my KVMs in individual zones and seeing an increased ping rtt by a factor of approx. 7 compared to ping rtt when running the same KVM inside the GZ (cf. attached smokeping chart).

This does *only* affect virtio nics but not e1000 nics. For e1000 nics the ping rtt remains the same, no matter if the KVM runs in the GZ or a NGZ.

Dan's 'KVM Performance Update' did resolve the throughput issue, but not the strange ping behaviour I am seeing.

Any ideas why it only affects virtio nics and when the KVM is in a zone? Any ideas how to improve it?



From hasslerd at gmx.li  Wed May 13 09:33:03 2015
From: hasslerd at gmx.li (Dominik Hassler)
Date: Wed, 13 May 2015 11:33:03 +0200
Subject: [OmniOS-discuss] ping rtt for KVM in zone
In-Reply-To: <003d01d08d5c$bc814340$3583c9c0$@subrigo.net>
References: <trinity-608871ef-6417-4208-b739-64c2a9081949-1431507776934@3capp-gmx-bs43>,
	<003d01d08d5c$bc814340$3583c9c0$@subrigo.net>
Message-ID: <trinity-303e3e72-09a8-4319-aeb0-2a6b96cf9109-1431509582872@3capp-gmx-bs59>

Matthew,

I have 'Intel I350' NICs. It is not about virtio performance in general, but about the difference between running the *same* KVM in the GZ and in an NGZ.

> Sent: Wednesday, 13 May 2015 at 11:11
> From: "Matthew Lagoe" <matthew.lagoe at subrigo.net>
> To: "'Dominik Hassler'" <hasslerd at gmx.li>, omnios-discuss at lists.omniti.com
> Subject: RE: [OmniOS-discuss] ping rtt for KVM in zone
>
> Some NICs don't handle the virtio stuff very well (Myricom, I'm looking at you), so that could be part of the problem.
> 
> Intel is typically pretty good about it, however, so the e1000s working doesn't surprise me.
> 
> Which NICs specifically show the extra delay?
> 
> -----Original Message-----
> From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Dominik Hassler
> Sent: Wednesday, May 13, 2015 02:03 AM
> To: omnios-discuss at lists.omniti.com
> Subject: [OmniOS-discuss] ping rtt for KVM in zone
> 
> Hi,
> 
> I am running my KVMs in individual zones and seeing an increased ping rtt by a factor of approx. 7 compared to ping rtt when running the same KVM inside the GZ (cf. attached smokeping chart).
> 
> This does *only* affect virtio nics but not e1000 nics. For e1000 nics the ping rtt remains the same, no matter if the KVM runs in the GZ or a NGZ.
> 
> Dan's 'KVM Performance Update' did resolve the throughput issue, but not the strange ping behaviour I am seeing.
> 
> Any ideas why it only affects virtio nics and when the KVM is in a zone? Any ideas how to improve it?
> 
> 
>

From mcgee at sci-world.net  Wed May 13 11:40:05 2015
From: mcgee at sci-world.net (Matthew McGee)
Date: Wed, 13 May 2015 07:40:05 -0400
Subject: [OmniOS-discuss] CIFS Issues
Message-ID: <CAPa50n2+qHKtNy8vt7T-gEJxU56FXwdaXs=1vSQyrvWG3pp3fQ@mail.gmail.com>

I am attempting to migrate my CIFS shares from FreeNAS to OmniOS.
I have attempted a number of different installs and for now I am working in
a VM
for speed of reboots and testing.

I have Windows 2012 AD, and a number of Mac OSX & Windows 7 clients.

Server name = DATA
Domain HOME.example.net

I install the system, configure the IP of 10.0.1.230/8, set and test route,
create a base boot environment
and a CIFS boot environment. Reboot into the CIFS boot environment.

I have attempted going straight to Napp-it and I have tried manual
initialization as follows:

verify /etc/hosts and /etc/nodename entries
Verify AD DNS
verify system is using AD DNS server only
nslookup to verify forward & reverse entries are functional and resolve on
the host
pkg install kerberos-5
# Tried with and without this setting
sharectl set -p ddns_enable=true smb
kclient -T ms_ad
kinit Administrator
klist & verify output
svcadm enable -r smb/server
smbadm join -u Administrator
Successful join
smbadm list shows my domain.
Verified kerberos delegation is allowed on the AD side.
vi /etc/nsswitch.conf and add "ad" to passwd & group lines
Have also tried adding smb line to pam
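
The command part of the steps above, gathered into one dry-run sketch. The run and join_ad helpers are made up for illustration and only print the commands unless DRY_RUN=0 is set. The trailing "smb" protocol argument to sharectl and the domain argument to smbadm join are what I understand the tools expect; the domain and account names are the ones from this post.

```shell
# Hypothetical wrapper: print commands by default, execute only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: $1 = AD domain, $2 = account allowed to join machines.
join_ad() {
    domain="$1"
    admin="$2"
    run sharectl set -p ddns_enable=true smb   # optional: dynamic DNS updates
    run kclient -T ms_ad                       # configure Kerberos against AD
    run kinit "$admin"                         # obtain a ticket
    run klist                                  # verify the ticket
    run svcadm enable -r smb/server            # start the SMB service and its dependencies
    run smbadm join -u "$admin" "$domain"      # join the domain
    run smbadm list                            # confirm the domain shows up
}

join_ad HOME.example.net Administrator
```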


Both of the following produce valid output
touch foo && chown myuser at HOME.example.net foo && ls -l foo
id myuser at HOME # Although this doesn't show all my groups

create a zfs filesystem and corresponding share called documents
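
For reference, a sketch of what that step looks like as a single command. The pool name "tank" is an assumption (the pool is not named here), and the helper only prints the command rather than running it.

```shell
# Hypothetical helper: print the zfs command that creates a dataset and
# shares it over SMB under the given share name. $1 = pool, $2 = share name.
smb_dataset_cmd() {
    printf 'zfs create -o sharesmb=name=%s %s/%s\n' "$2" "$1" "$2"
}

smb_dataset_cmd tank documents
```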

root at data:/root# smbutil view //myuser at DATA
Password:
Share        Type       Comment
-------------------------------
c$           disk       Default Share
documents    disk
IPC$         IPC        Remote IPC
vss$         disk       VSS

4 shares listed from 4 available

When I attempt to access from a Windows 7 host, I see the following:

\\DATA is not accessible. You might not have permission to use this network
resource.
Contact the administrator of this server to find out if you have access
permissions.
The account is not authorized to log in from this station.


\\10.0.1.230 - Works, I can set permissions, read & write files

Neither the NetBIOS name nor the FQDN works, but access by IP does.

Samba on FreeNAS or Fedora works without issues, but I need working FC, and
COMSTAR will do that for me.
I cannot seem to get the CIFS piece working and it is the one thing
preventing me from moving forward.
Any assistance would be appreciated. I hate asking for help but I've been
working on this every night for a month
and I know there must be one little thing I am missing, maybe a GPO?

From hasslerd at gmx.li  Wed May 13 12:10:46 2015
From: hasslerd at gmx.li (Dominik Hassler)
Date: Wed, 13 May 2015 14:10:46 +0200
Subject: [OmniOS-discuss] CIFS Issues
In-Reply-To: <CAPa50n2+qHKtNy8vt7T-gEJxU56FXwdaXs=1vSQyrvWG3pp3fQ@mail.gmail.com>
References: <CAPa50n2+qHKtNy8vt7T-gEJxU56FXwdaXs=1vSQyrvWG3pp3fQ@mail.gmail.com>
Message-ID: <trinity-a9a9f1ac-1941-4fbf-82f1-561c2855b3c0-1431519046174@3capp-gmx-bs51>

Did you try to end your FQDN with a trailing dot?

like: 'DATA.HOME.example.net.' in your example?

Sent: Wednesday, 13 May 2015 at 13:40
From: "Matthew McGee" <mcgee at sci-world.net>
To: omnios-discuss at lists.omniti.com
Subject: [OmniOS-discuss] CIFS Issues

I am attempting to migrate my CIFS shares from FreeNAS to OmniOS.
I have attempted a number of different installs and for now I am working in a VM
for speed of reboots and testing.

I have Windows 2012 AD, and a number of Mac OSX & Windows 7 clients.

Server name = DATA
Domain HOME.example.net

I install the system, configure the IP of 10.0.1.230/8, set and test route, create a base boot environment
and a CIFS boot environment. Reboot into the CIFS boot environment.

I have attempted going straight to Napp-it and I have tried manual initialization as follows:

verify /etc/hosts and /etc/nodename entries
Verify AD DNS
verify system is using AD DNS server only
nslookup to verify forward & reverse entries are functional and resolve on the host
pkg install kerberos-5
# Tried with and without this setting
sharectl set -p ddns_enable=true smb
kclient -T ms_ad
kinit Administrator
klist & verify output
svcadm enable -r smb/server

smbadm join -u Administrator
Successful join
smbadm list shows my domain.
Verified kerberos delegation is allowed on the AD side.
vi /etc/nsswitch.conf and add "ad" to passwd & group lines
Have also tried adding smb line to pam

Both of the following produce valid output
touch foo && chown myuser at HOME.example.net foo && ls -l foo
id myuser at HOME # Although this doesn't show all my groups

create a zfs filesystem and corresponding share called documents

root at data:/root# smbutil view //myuser at DATA
Password:
Share        Type       Comment
-------------------------------
c$           disk       Default Share
documents    disk
IPC$         IPC        Remote IPC
vss$         disk       VSS

4 shares listed from 4 available

When I attempt to access from a Windows 7 host, I see the following:

\\DATA is not accessible. You might not have permission to use this network resource.
Contact the administrator of this server to find out if you have access permissions.
The account is not authorized to log in from this station.

\\10.0.1.230 - Works, I can set permissions, read & write files

Neither the NetBIOS name nor the FQDN works, but access by IP does.

Samba on FreeNAS or Fedora works without issues, but I need working FC, and COMSTAR will do that for me.
I cannot seem to get the CIFS piece working and it is the one thing preventing me from moving forward.
Any assistance would be appreciated. I hate asking for help but I've been working on this every night for a month
and I know there must be one little thing I am missing, maybe a GPO?

From danmcd at omniti.com  Wed May 13 13:34:36 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 13 May 2015 09:34:36 -0400
Subject: [OmniOS-discuss] ping rtt for KVM in zone
In-Reply-To: <trinity-608871ef-6417-4208-b739-64c2a9081949-1431507776934@3capp-gmx-bs43>
References: <trinity-608871ef-6417-4208-b739-64c2a9081949-1431507776934@3capp-gmx-bs43>
Message-ID: <EF4CF6EA-56F2-4D13-AB0E-5E60ACC02655@omniti.com>


> On May 13, 2015, at 5:02 AM, Dominik Hassler <hasslerd at gmx.li> wrote:
> 
> Any ideas why it only affects virtio nics and when the KVM is in a zone? Any ideas how to improve it?

I'm not 100% sure, but I suspect it has to do with the fact that KVM needs to put the vnic/nic into promiscuous mode.  In a zone, this gets harder, because of permissions the process in a zone needs beyond what it would need in the global zone.

I tried to get Joyent's VND upstreamed in time for r151014.  I suspect VND will still hold an improvement on many fronts, including this one.

Dan


From asc1111 at gmail.com  Wed May 13 16:08:46 2015
From: asc1111 at gmail.com (Aaron Curry)
Date: Wed, 13 May 2015 10:08:46 -0600
Subject: [OmniOS-discuss] CIFS Issues
In-Reply-To: <trinity-a9a9f1ac-1941-4fbf-82f1-561c2855b3c0-1431519046174@3capp-gmx-bs51>
References: <CAPa50n2+qHKtNy8vt7T-gEJxU56FXwdaXs=1vSQyrvWG3pp3fQ@mail.gmail.com>
	<trinity-a9a9f1ac-1941-4fbf-82f1-561c2855b3c0-1431519046174@3capp-gmx-bs51>
Message-ID: <CAOqBcP-ygWiydiLX2pPk2J=g+AZ8bNFjtJ4H1yWy1uAHBAS8nw@mail.gmail.com>

I ran into the same issue when setting up my home server. Access to CIFS
works by IP but not name. I ended up setting up a second IP address and
created a DNS entry with a different name for that IP. I have no idea why
it works but it does.

Aaron

On Wed, May 13, 2015 at 6:10 AM, Dominik Hassler <hasslerd at gmx.li> wrote:

> Did you try to end your FQDN with a trailing dot?
>
> like: 'DATA.HOME.example.net.' in your example?
>
>
> Sent: Wednesday, 13 May 2015 at 13:40
> From: "Matthew McGee" <mcgee at sci-world.net>
> To: omnios-discuss at lists.omniti.com
> Subject: [OmniOS-discuss] CIFS Issues
>
> I am attempting to migrate my CIFS shares from FreeNAS to OmniOS.
> I have attempted a number of different installs and for now I am working
> in a VM
> for speed of reboots and testing.
>
> I have Windows 2012 AD, and a number of Mac OSX & Windows 7 clients.
>
> Server name = DATA
> Domain HOME.example.net
>
> I install the system, configure the IP of 10.0.1.230/8, set and test route,
> create a base boot environment and a CIFS boot environment. Reboot into the
> CIFS boot environment.
>
> I have attempted going straight to Napp-it and I have tried manual
> initialization as follows:
>
> verify /etc/hosts and /etc/nodename entries
> Verify AD DNS
> verify system is using AD DNS server only
> nslookup to verify forward & reverse entries are functional and resolve on
> the host
> pkg install kerberos-5
> # Tried with and without this setting
> sharectl set -p ddns_enable=true smb
> kclient -T ms_ad
> kinit Administrator
> klist & verify output
> svcadm enable -r smb/server
>
> smbadm join -u Administrator
> Successful join
> smbadm list shows my domain.
> Verified kerberos delegation is allowed on the AD side.
> vi /etc/nsswitch.conf and add "ad" to passwd & group lines
> Have also tried adding smb line to pam
>
>
> Both of the following produce valid output
> touch foo && chown myuser@HOME.example.net foo && ls -l foo
> id myuser@HOME # Although this doesn't show all my groups
> create a zfs filesystem and corresponding share called documents
>
> root@data:/root# smbutil view //myuser@DATA
> Password:
> Share        Type       Comment
> -------------------------------
> c$           disk       Default Share
> documents    disk
> IPC$         IPC        Remote IPC
> vss$         disk       VSS
>
> 4 shares listed from 4 available
>
> When I attempt to access from a Windows 7 host, I see the following:
>
> \\DATA is not accessible. You might not have permission to use this
> network resource.
> Contact the administrator of this server to find out if you have access
> permissions.
> The account is not authorized to log in from this station.
>
>
> \\10.0.1.230 - Works, I can set permissions, read & write files
>
> Neither the netbios nor FQDN function, but it functions by IP.
>
> Samba on FreeNAS or Fedora works without issues, but I need working FC and
> comstar will do that for me.
> I cannot seem to get the CIFS piece working and it is the one thing
> preventing me from moving forward.
> Any assistance would be appreciated. I hate asking for help but I've been
> working on this every night for a month
> and I know there must be one little thing I am missing, maybe a
> GPO?
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>

From danmcd at omniti.com  Wed May 13 17:55:41 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 13 May 2015 13:55:41 -0400
Subject: [OmniOS-discuss] VENOM (CVE-2015-3456) update
Message-ID: <B019EFEA-A0E3-409B-AD28-ED57C3810E09@omniti.com>

Some of you probably have been tracking VENOM (aka. CVE-2015-3456).

I have patched the qemu that OmniOS's KVM uses with a VENOM fix and pushed updates on to the repo servers.  Source people can consult:

	https://github.com/joyent/illumos-kvm-cmd/commit/407546e5132f54065f3f78ac293ad7a8d16bf57c

for the fix itself.

r151006 --> new system/kvm package, with just VENOM patched.

r151014 --> new system/kvm package, with just VENOM patched.

r151012 --> new system/kvm AND driver/virtualization/kvm. VENOM is patched, and due to 012's closeness to 014, the 014 performance changes came along for the ride.

I'd recommend:

1.) Shut down all KVM instances, and make sure "pgrep qemu" in the global zone shows no processes.  If you still see qemu processes, kill them after ensuring your KVMs are shut down.

2.) pkg update

3.) Restart your KVM instances, all of which will use the new, patched QEMU.
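
The first two steps above can be sketched as a small shell helper. This is only an illustration, not part of the announcement: the function names qemu_running and venom_update are made up for the example, and it assumes you run it as root in the global zone after shutting your guests down by whatever means you normally use.

```shell
#!/bin/sh
# Sketch only: refuse to update while any qemu process is still alive.

# Succeeds (exit 0) if at least one process matching "qemu" is running.
qemu_running() {
    pgrep qemu >/dev/null 2>&1
}

# Step 1 check, then step 2 ("pkg update"). Step 3 (restarting the
# KVM instances) is left to the administrator.
venom_update() {
    if qemu_running; then
        echo "qemu processes still running; shut down your KVMs first" >&2
        return 1
    fi
    pkg update
}
```

Calling venom_update returns 1 without touching any packages if "pgrep qemu" still finds something; otherwise it simply runs "pkg update".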

Thank you folks!
Dan


From mir at miras.org  Wed May 13 18:14:35 2015
From: mir at miras.org (Michael Rasmussen)
Date: Wed, 13 May 2015 20:14:35 +0200
Subject: [OmniOS-discuss] KVM Performance Update
In-Reply-To: <90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com>
References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com>
	<90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com>
Message-ID: <20150513201435.4d3a3d7c@sleipner.datanom.net>

On Tue, 12 May 2015 14:59:02 -0400
Dan McDonald <danmcd at omniti.com> wrote:

> 
> I chose option #2:
> 
> 	https://github.com/omniti-labs/omnios-build/commit/0268a2ff04b1cbed2324054cb97a0f36c58989b0
> 
> There's now an update for r151014 that has the updated system/kvm (qemu/userland) and driver/virtualization/kvm (kernel KVM driver) on the repo server.  A "pkg update" will update your packages AND boot archive without.  I do recommend, however, you power down your KVM instances and "pkill qemu" prior to running the update.
> 
Has anyone run performance tests with the patched kvm package?

-- 
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael <at> rasmussen <dot> cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir <at> datanom <dot> net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir <at> miras <dot> org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
/usr/games/fortune -es says:
Give your very best today.  Heaven knows it's little enough.

From danmcd at omniti.com  Wed May 13 18:28:22 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 13 May 2015 14:28:22 -0400
Subject: [OmniOS-discuss] KVM Performance Update
In-Reply-To: <20150513201435.4d3a3d7c@sleipner.datanom.net>
References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com>
	<90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com>
	<20150513201435.4d3a3d7c@sleipner.datanom.net>
Message-ID: <E8772EA1-F7C0-4B69-B39D-B57ADB378704@omniti.com>


> On May 13, 2015, at 2:14 PM, Michael Rasmussen <mir at miras.org> wrote:
> 
> Has someone made performance test with the patched kvm package?

Tobi's sheet has a preliminary version.  Not sure if he's tested with the one that actually is in the repo servers now.

ALSO, 012 got the perf fix because it was easier to bring that along for the ride instead of addressing VENOM by itself for 012.

Dan


From hasslerd at gmx.li  Wed May 13 18:39:21 2015
From: hasslerd at gmx.li (Dominik Hassler)
Date: Wed, 13 May 2015 20:39:21 +0200
Subject: [OmniOS-discuss] KVM Performance Update
Message-ID: <y6rn0fkct7m8vt9s3i9i1tps.1431542146242@email.android.com>


    
I applied yesterday's KVM performance patch, ran performance tests, and posted the results in Tobi's sheet.



From nsmith at careyweb.com  Wed May 13 18:50:54 2015
From: nsmith at careyweb.com (Nate Smith)
Date: Wed, 13 May 2015 14:50:54 -0400
Subject: [OmniOS-discuss] High density 2.5" chassis
In-Reply-To: <CALeZrrSqBV2Zi+LCRAZHD+zAjDGEnp_o_O4TQBD3AGAxe6YnCA@mail.gmail.com>
References: <CAHfYOdUS0fDkpsTPvUyQ2XikLP32np28i_eTvVExrTry1J8FoQ@mail.gmail.com>
	<CALeZrrSqBV2Zi+LCRAZHD+zAjDGEnp_o_O4TQBD3AGAxe6YnCA@mail.gmail.com>
Message-ID: <40849b67-966a-4f47-97e3-5e3a39124afe@careyweb.com>

I've been running an all-SSD setup on a Dell R720, with dual 9207-8i cards connected to dual 8x2.5 disk backplane. (The 9207-8i is one of the only cards that doesn't interfere with the BIOS, as Dell implemented it for tape drive support.) Boot disks are connected internally to the onboard SATA (I could use USB). I've been using Samsung 843TN drives, which could be purchased fairly cheaply for a while. They are underprovisioned at 480GB and feature a supercap to ensure writes complete in the event of a power loss. Plus they have a long write endurance. It has worked well so far, outside of some queue depth problems with my Fibre Channel. I was originally going to use the R720XD, but I found that its backplane uses expanders instead of going 1:1. I run a 15 disk RAIDZ6 with a hot spare.
 
-Nate
 
 
 
From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Schweiss, Chip
Sent: Saturday, May 09, 2015 3:29 PM
To: Chris Nagele
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] High density 2.5" chassis
 
I have an SSD server in one of those chassis.  Here's a write-up about it on my blog, there are 3 postings about it.

http://www.bigdatajunkie.com/index.php/9-solaris/zfs/10-short-stroking-consumer-ssds
Not necessarily a build for everyone, but it has been absolutely awesome for our use. After a few bumps at the beginning and giving up on HA on this server, it has been rock solid.  Many will swear against the interposers, but combined with Samsung SSDs they have worked very well.
-Chip
 
 
On Sat, May 9, 2015 at 1:06 PM, Chris Nagele <nagele at wildbit.com> wrote:
Hi all. Continuing on my all SSD discussion, I am looking for some
recommendations on a new Supermicro
chassis for our file servers. So far I have been looking at this
thing:

http://www.supermicro.com/products/chassis/4U/417/SC417E16-R1400LP.cfm

Does anyone have experience with this? If so, what would you recommend
for a motherboard and HBA to support all of the disks? We've
traditionally used the X9DRD-7LN4F-JBOD or the X9DRi-F with a LSI
9211-8i HBA.

Thanks,
Chris

From mir at miras.org  Wed May 13 20:13:12 2015
From: mir at miras.org (Michael Rasmussen)
Date: Wed, 13 May 2015 22:13:12 +0200
Subject: [OmniOS-discuss] KVM Performance Update
In-Reply-To: <E8772EA1-F7C0-4B69-B39D-B57ADB378704@omniti.com>
References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com>
	<90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com>
	<20150513201435.4d3a3d7c@sleipner.datanom.net>
	<E8772EA1-F7C0-4B69-B39D-B57ADB378704@omniti.com>
Message-ID: <20150513221312.5e69fb09@sleipner.datanom.net>

On Wed, 13 May 2015 14:28:22 -0400
Dan McDonald <danmcd at omniti.com> wrote:

> 
> Tobi's sheet has a preliminary version.  Not sure if he's tested with the one that actually is in the repo servers now.
> 
> ALSO, 012 got the perf fix because it was easier to bring that along for the ride instead of addressing VENOM by itself for 012.
> 
If I read the numbers correctly, I still find the performance
disappointing with the patch. Doing the same kind of test from a Linux
or FreeBSD host to a Linux or FreeBSD guest gives much higher performance.

-- 
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael <at> rasmussen <dot> cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir <at> datanom <dot> net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir <at> miras <dot> org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
/usr/games/fortune -es says:
That wouldn't be good enough.
		-- Larry Wall in <199710131621.JAA14907 at wall.org>

From hasslerd at gmx.li  Wed May 13 20:26:23 2015
From: hasslerd at gmx.li (Dominik Hassler)
Date: Wed, 13 May 2015 22:26:23 +0200
Subject: [OmniOS-discuss] KVM Performance Update
In-Reply-To: <20150513221312.5e69fb09@sleipner.datanom.net>
References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com>	<90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com>	<20150513201435.4d3a3d7c@sleipner.datanom.net>	<E8772EA1-F7C0-4B69-B39D-B57ADB378704@omniti.com>
	<20150513221312.5e69fb09@sleipner.datanom.net>
Message-ID: <5553B36F.6040709@gmx.li>

Well, don't forget, my latest tests were with KVMs running inside zones.
As Dan pointed out today in another thread, the lack of VND upstream
might have a bigger impact on KVMs running inside zones.

On 05/13/2015 10:13 PM, Michael Rasmussen wrote:
> On Wed, 13 May 2015 14:28:22 -0400
> Dan McDonald <danmcd at omniti.com> wrote:
>
>>
>> Tobi's sheet has a preliminary version.  Not sure if he's tested with the one that actually is in the repo servers now.
>>
>> ALSO, 012 got the perf fix because it was easier to bring that along for the ride instead of addressing VENOM by itself for 012.
>>
> If I read the numbers correct I still find the performance
> disappointing with the patch. Doing the same kind of test using Linux
> or FreeBSD host to Linux or FreeBSD guest gives much higher performance.
>

From mcgee at sci-world.net  Wed May 13 22:45:36 2015
From: mcgee at sci-world.net (Matthew McGee)
Date: Wed, 13 May 2015 18:45:36 -0400
Subject: [OmniOS-discuss] CIFS Issues
In-Reply-To: <trinity-a9a9f1ac-1941-4fbf-82f1-561c2855b3c0-1431519046174@3capp-gmx-bs51>
References: <CAPa50n2+qHKtNy8vt7T-gEJxU56FXwdaXs=1vSQyrvWG3pp3fQ@mail.gmail.com>
	<trinity-a9a9f1ac-1941-4fbf-82f1-561c2855b3c0-1431519046174@3capp-gmx-bs51>
Message-ID: <CAPa50n1wLvFBm6yJbAmxs_q9iNfFQNRPEqtKTVdg5WJJgeHHHg@mail.gmail.com>

Interesting. Using the trailing "." for an absolute FQDN works.
Any hints on how to make it work without the full FQDN?
I assume it's probably a kerberos related issue?
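
For readers unfamiliar with the trailing-dot form: a name ending in "." is treated by resolvers as absolute, so no search domain is appended to it. The helper below is a minimal sketch of that normalisation; absolute_fqdn is a made-up name, and why the absolute form changes the SMB behaviour is not established in this thread.

```shell
# Sketch only: append the trailing dot that marks a DNS name as absolute,
# unless it is already there.
absolute_fqdn() {
    case "$1" in
        *.) echo "$1" ;;     # already absolute, leave unchanged
        *)  echo "$1." ;;    # add the root label
    esac
}
```

For example, nslookup "$(absolute_fqdn DATA.HOME.example.net)" queries the name exactly as written, which makes it easy to compare against a lookup of the bare name.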

On Wed, May 13, 2015 at 8:10 AM, Dominik Hassler <hasslerd at gmx.li> wrote:

> Did you try to end your FQDN with a trailing dot?
>
> like: 'DATA.HOME.example.net.' in your example?
>
>
> Gesendet: Mittwoch, 13. Mai 2015 um 13:40 Uhr
> Von: "Matthew McGee" <mcgee at sci-world.net>
> An: omnios-discuss at lists.omniti.com
> Betreff: [OmniOS-discuss] CIFS Issues
>
> I am attempting to migrate my CIFS shares from FreeNAS to OmniOS.
> I have attempted a number of different installs and for now I am working
> in a VM
> for speed of reboots and testing.
>
> I have Windows 2012 AD, and a number of Mac OSX & Windows 7 clients.
>
> Server name = DATA
> Domain HOME.example.net
> I install the system, configure the IP of
> 10.0.1.230/8, set and test route, create a base boot
> environment and a CIFS boot environment. Reboot into the CIFS boot
> environment.
>  I have attempted going straight to Napp-it and I have tried manual
> initialization as follows:
>  verify /etc/hosts and /etc/nodename entries
> Verify AD DNS
> verify system is using AD DNS server only
> nslookup to verify forward & reverse entries are functional and resolve on
> the host
> pkg install kerberos-5  # Tried with and without this setting
> sharectl set -p ddns_enable=true
> kclient -T ms_ad
> kinit Administrator
> klist & verify output
> svcadm enable -r smb/server
>
> smbadm join -u Administrator
> Successful join
> smbadm list shows my domain.
> Verified kerberos delegation is allowed on the AD side.
> vi /etc/nsswitch.conf and add "ad" to passwd & group lines
> Have also tried adding smb line to pam
>
>
> Both of the following produce valid output
> touch foo && chown myuser@HOME.example.net foo && ls -l foo
> id myuser@HOME # Although this doesn't show all my groups
> create a zfs filesystem and corresponding share called documents
>
> root@data:/root# smbutil view //myuser@DATA
> Password:
> Share        Type       Comment
> -------------------------------
> c$           disk       Default Share
> documents    disk
> IPC$         IPC        Remote IPC
> vss$         disk       VSS
>
> 4 shares listed from 4 available
>
> When I attempt to access from a Windows 7 host, I see the following:
>
> \\DATA is not accessible. You might not have permission to use this
> network resource.
> Contact the administrator of this server to find out if you have access
> permissions.
> The account is not authorized to log in from this station.
>
>
> \\10.0.1.230 - Works, I can set permissions, read & write files
>
> Neither the netbios nor FQDN function, but it functions by IP.
>
> Samba on FreeNAS or Fedora works without issues, but I need working FC and
> comstar will do that for me.
> I cannot seem to get the CIFS piece working and it is the one thing
> preventing me from moving forward.
> Any assistance would be appreciated. I hate asking for help but I've been
> working on this every night for a month
> and I know there must be one little thing I am missing, maybe a
> GPO?
>

From danmcd at omniti.com  Thu May 14 05:15:00 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 14 May 2015 01:15:00 -0400
Subject: [OmniOS-discuss] CIFS Issues
In-Reply-To: <CAPa50n1wLvFBm6yJbAmxs_q9iNfFQNRPEqtKTVdg5WJJgeHHHg@mail.gmail.com>
References: <CAPa50n2+qHKtNy8vt7T-gEJxU56FXwdaXs=1vSQyrvWG3pp3fQ@mail.gmail.com>
	<trinity-a9a9f1ac-1941-4fbf-82f1-561c2855b3c0-1431519046174@3capp-gmx-bs51>
	<CAPa50n1wLvFBm6yJbAmxs_q9iNfFQNRPEqtKTVdg5WJJgeHHHg@mail.gmail.com>
Message-ID: <136B9632-3196-41B7-961E-B9BD113321BC@omniti.com>


> On May 13, 2015, at 6:45 PM, Matthew McGee <mcgee at sci-world.net> wrote:
> 
> Interesting. Using the trailing "." for an absolute FQDN works.
> Any hints on how to make it work without the full FQDN?
> I assume it's probably a kerberos related issue?

I'd suggest asking the illumos mailing list (discussion or developer).  The SMB experts in illumos all work at Nexenta.

Dan


From alka at hfg-gmuend.de  Thu May 14 11:15:56 2015
From: alka at hfg-gmuend.de (=?utf-8?Q?G=C3=BCnther_Alka?=)
Date: Thu, 14 May 2015 13:15:56 +0200
Subject: [OmniOS-discuss] CIFS Issues
In-Reply-To: <136B9632-3196-41B7-961E-B9BD113321BC@omniti.com>
References: <CAPa50n2+qHKtNy8vt7T-gEJxU56FXwdaXs=1vSQyrvWG3pp3fQ@mail.gmail.com>
	<trinity-a9a9f1ac-1941-4fbf-82f1-561c2855b3c0-1431519046174@3capp-gmx-bs51>
	<CAPa50n1wLvFBm6yJbAmxs_q9iNfFQNRPEqtKTVdg5WJJgeHHHg@mail.gmail.com>
	<136B9632-3196-41B7-961E-B9BD113321BC@omniti.com>
Message-ID: <84BD5B5F-1490-40AB-B176-4991062BE510@hfg-gmuend.de>

Matthew

Since you use napp-it, and since I have many OmniOS SMB filers in an AD environment without such problems,
can you compare what happens when you use napp-it to join the domain instead of doing it manually?

(menu Services >> SMB >> Active Directory)

Gea
 

> 
> 
>> On May 13, 2015, at 6:45 PM, Matthew McGee <mcgee at sci-world.net> wrote:
>> 
>> Interesting. Using the trailing "." for an absolute FQDN works.
>> Any hints on how to make it work without the full FQDN?
>> I assume it's probably a kerberos related issue?
> 


From ottmarklaas at countermail.com  Wed May 13 18:21:51 2015
From: ottmarklaas at countermail.com (Ottmar Klaas)
Date: Wed, 13 May 2015 14:21:51 -0400
Subject: [OmniOS-discuss] KVM Performance Update
In-Reply-To: <20150513201435.4d3a3d7c@sleipner.datanom.net>
References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com>
	<90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com>
	<20150513201435.4d3a3d7c@sleipner.datanom.net>
Message-ID: <8DE571E8-C8D1-4406-8D2E-283C255446E7@countermail.com>


On 13 May 2015, at 14:14, Michael Rasmussen wrote:

> On Tue, 12 May 2015 14:59:02 -0400
> Dan McDonald <danmcd at omniti.com> wrote:
>
>>
>> I chose option #2:
>>
>> 	https://github.com/omniti-labs/omnios-build/commit/0268a2ff04b1cbed2324054cb97a0f36c58989b0
>>
>> There's now an update for r151014 that has the updated system/kvm 
>> (qemu/userland) and driver/virtualization/kvm (kernel KVM driver) on 
>> the repo server.  A "pkg update" will update your packages AND boot 
>> archive without.  I do recommend, however, you power down your KVM 
>> instances and "pkill qemu" prior to running the update.
>>
> Has someone made performance test with the patched kvm package?

For me, network performance measured via iperf almost quadrupled, hitting
around 380 Mbit/s, both from the global zone to an Ubuntu guest and from a
separate computer on the network to the Ubuntu guest. My previous results
are listed here (among others):

https://docs.google.com/spreadsheets/d/1uhCR4A9VawJsNG01AuC5CBVTQYlkoY1m9qdSuAwZp-s/edit#gid=0
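
For anyone wanting to repeat this kind of measurement, a minimal sketch follows. It is an assumption about how such a test is typically run, not a record of the exact commands used for the numbers above; iperf_client_cmd is a made-up helper and the address in the usage note is a placeholder.

```shell
# Sketch only: build the iperf (version 2 style) client command line.
# $1 = address of the machine running "iperf -s"
# $2 = test duration in seconds (defaults to 30)
iperf_client_cmd() {
    echo "iperf -c $1 -t ${2:-30} -f m"    # -f m reports in Mbits/sec
}
```

Start "iperf -s" inside the guest, then run the printed command from the global zone or from another machine, for example: $(iperf_client_cmd 192.0.2.10).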


From cks at cs.toronto.edu  Fri May 15 15:51:11 2015
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Fri, 15 May 2015 11:51:11 -0400
Subject: [OmniOS-discuss] Clues for tracking down a drastic ZFS fs space
	difference?
In-Reply-To: cks's message of Wed, 29 Apr 2015 15:21:03 -0400.
	<20150429192103.D6B397A0605@apps0.cs.toronto.edu>
Message-ID: <20150515155111.314317A0614@apps0.cs.toronto.edu>

Several weeks ago I reported:
>  We have a filesystem/dataset with no snapshots, no subordinate
> filesystems, nothing complicated (and no compression), that has a
> drastic difference in space used between what df/zfs list/etc report
> at the ZFS level and what du reports at the filesystem level. [...]

(At the time ZFS reported 70.5 GB used and du reported 17 GB.)

 With the assistance of George Wilson of Delphix, we've now identified
what the cause of this was: nlockmgr was apparently holding references
to now-deleted files in the kernel, preventing them from being reclaimed
by ZFS. Because these references were held in the kernel in some way,
they weren't visible to tools like fuser. Restarting nlockmgr immediately
reclaimed the space and dropped usage to what it should be.

 Delphix made a fix to their version of the nlm code to avoid this but
has not yet pushed it upstream. The summary of the problem (from a
comment in the commit):

	A busy client will prevent the idle timeout from ever being
	reached but may have stale holds associated with it. If these
	stale holds are for vnodes which have been removed they will
	prevent the file system from being able to reclaim the file's
	space.

 George Wilson's initial reply to me on the illumos-zfs mailing list
is:
	http://permalink.gmane.org/gmane.os.illumos.zfs/4836

(and it includes a link to the Delphix commit.)

 Obviously this is only a concern for people doing NFS service on
OmniOS machines, but if this is your environment you may want to watch
for this issue and consider periodic precautionary nlockmgr restarts or
the like until the fix is pushed upstream and is incorporated into an
OmniOS update.
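
A rough way to watch for this is to compare what ZFS accounts for a dataset with what du sees under its mountpoint. The following is a sketch under stated assumptions: the function names are invented for illustration, "tank/fs" and "/tank/fs" in the usage note are placeholders, and a gap can have other causes (snapshots, child datasets) that did not apply in the case described here.

```shell
# Sketch only: how many bytes ZFS counts that du cannot find.
# $1 = "used" in bytes as reported by ZFS, $2 = kilobytes from "du -sk".
space_gap_bytes() {
    echo $(( $1 - $2 * 1024 ))
}

# $1 = dataset name, $2 = its mountpoint.
check_dataset() {
    zfs_used=$(zfs get -Hp -o value used "$1")    # exact byte count
    du_kb=$(du -sk "$2" | awk '{print $1}')       # kilobytes visible to du
    space_gap_bytes "$zfs_used" "$du_kb"
}
```

For example, check_dataset tank/fs /tank/fs prints the gap in bytes. If the gap is large on an NFS server, restarting the lock manager would look like "svcadm restart nfs/nlockmgr" (the service name here is an assumption about a stock illumos system).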

	- cks

From martin at waldenvik.se  Fri May 15 19:56:09 2015
From: martin at waldenvik.se (martin at waldenvik.se)
Date: Fri, 15 May 2015 19:56:09 +0000
Subject: [OmniOS-discuss] nfs client in a zone won't start
Message-ID: <etPan.55565022.6cca75d8.12dc@pentos.kyriou.net>

Hi

I created a zone per the OmniOS wiki for a MySQL server (OmniOS r151014), but I can't seem to start the nfs/client service. It just says offline*. There is no clue in any of the logs. If I run svcadm enable -r nfs/client, it says svcadm: svc:/milestone/network depends on svc:/network/physical, which has multiple instances.

Any help would be appreciated. I want to mount an NFS share for backing up MySQL databases.

Regards
Martin

From danmcd at omniti.com  Fri May 15 20:09:19 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Fri, 15 May 2015 16:09:19 -0400
Subject: [OmniOS-discuss] nfs client in a zone won't start
In-Reply-To: <etPan.55565022.6cca75d8.12dc@pentos.kyriou.net>
References: <etPan.55565022.6cca75d8.12dc@pentos.kyriou.net>
Message-ID: <8D390D89-85FD-429B-92C7-CDCFBBE0E92A@omniti.com>


> On May 15, 2015, at 3:56 PM, martin at waldenvik.se wrote:
> 
> Hi
> 
> I created a zone per omnios wiki for a mysql-server (omnios r151014). But i can't seem to start the nfs/client service. It just says offline*. There are no clue in any of the logs. If i do a svcadm enable -r nfs/client it says svcadm: svc:/milestone/network depends on svc:/network/physical, which has multiple instances.
> 
> Any help would be appreciated. I wish to mount a nfs share for backing up mysql-databases

Please share the output of svcs -xv network/physical

Also, try enabling nfs/client without -r, and use "svcs -xv" to see what all else you need to activate.  *CLIENT* should work in a zone.

Dan


From jimklimov at cos.ru  Fri May 15 21:25:42 2015
From: jimklimov at cos.ru (Jim Klimov)
Date: Fri, 15 May 2015 23:25:42 +0200
Subject: [OmniOS-discuss] nfs client in a zone won't start
In-Reply-To: <etPan.55565022.6cca75d8.12dc@pentos.kyriou.net>
References: <etPan.55565022.6cca75d8.12dc@pentos.kyriou.net>
Message-ID: <F3803A8D-F9FB-4B08-B24E-44548FD309F8@cos.ru>

On 15 May 2015 21:56:09 CEST, "martin at waldenvik.se" <martin at waldenvik.se> wrote:
>Hi
>
>I created a zone per omnios wiki for a mysql-server (omnios r151014).
>But i can't seem to start the nfs/client service. It just says
>offline*.There are no clue in any of the logs. If i do a svcadm enable
>-r nfs/client it says svcadm: svc:/milestone/network depends on
>svc:/network/physical, which has multiple instances.
>
>Any help would be appreciated. I wish to mount a nfs share for backing
>up mysql-databases
>
>Regards
>Martin

The state offline* (with an asterisk) means the service is in transition from offline, i.e. it is in the process of coming online. You might want to look into /var/svc/log/*nfs-client*log for more details, and/or manually rerun (or instrument with 'sh -x' and the like) the scripts and bits of the service to trace the problem.

While the message about network/physical is common and harmless, do verify that you indeed have one of the networking engines enabled (legacy default, or new magicky nwam).

Also, 'svcs -d nfs/client' can show dependencies, and 'svcs -xv' will detail any failures.

There have recently been many online discussions about nlm (the NFS lock manager) and recent or near-future changes applied to it, so see if enabling or kicking it helps you any.

Finally, did you test if the client works from the global zone?
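
The checks above can be bundled into one helper, sketched here. The function names smf_log_glob and diagnose_service are made up for the example; the log location follows the /var/svc/log pattern mentioned above, and the commands must be run inside the zone in question.

```shell
# Sketch only: turn a service name like "nfs/client" into the log glob
# used above, i.e. /var/svc/log/*nfs-client*log.
smf_log_glob() {
    echo "/var/svc/log/*$(echo "$1" | tr '/' '-')*log"
}

# Print the SMF explanation, the dependencies, and the tail of each log.
diagnose_service() {
    svc=${1:-nfs/client}
    svcs -xv "$svc"        # why the service is not online
    svcs -d "$svc"         # services it depends on
    # Left unquoted on purpose so the shell expands the glob.
    for f in $(smf_log_glob "$svc"); do
        tail -n 20 "$f"
    done
}
```

Running diagnose_service with no argument inspects nfs/client; pass another service name to inspect something else.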

Good luck,
Jim Klimov
--
Typos courtesy of K-9 Mail on my Samsung Android

From richard.elling at richardelling.com  Fri May 15 21:44:51 2015
From: richard.elling at richardelling.com (Richard Elling)
Date: Fri, 15 May 2015 14:44:51 -0700
Subject: [OmniOS-discuss] nfs client in a zone won't start
In-Reply-To: <F3803A8D-F9FB-4B08-B24E-44548FD309F8@cos.ru>
References: <etPan.55565022.6cca75d8.12dc@pentos.kyriou.net>
	<F3803A8D-F9FB-4B08-B24E-44548FD309F8@cos.ru>
Message-ID: <DA3A3733-84DA-4BE8-83E4-7BF662FF0FD7@richardelling.com>


> On May 15, 2015, at 2:25 PM, Jim Klimov <jimklimov at cos.ru> wrote:
> 
> On 15 May 2015 21:56:09 CEST, "martin at waldenvik.se" <martin at waldenvik.se> wrote:
>> Hi
>> 
>> I created a zone per omnios wiki for a mysql-server (omnios r151014).
>> But i can't seem to start the nfs/client service. It just says
>> offline*.There are no clue in any of the logs. If i do a svcadm enable
>> -r nfs/client it says svcadm: svc:/milestone/network depends on
>> svc:/network/physical, which has multiple instances.
>> 
>> Any help would be appreciated. I wish to mount a nfs share for backing
>> up mysql-databases
>> 
>> Regards
>> Martin
> 
> The state offline* (with asterisk) means transition from offline (is in process of onlining). You might want to look into /var/svc/log/*nfs-client*log for possible more details, and/or to manually rerun (or instrument with 'sh -x' and the likes) the scripts and bits of the service to trace into the problem.

pro tip:
cat $(svcs -L nfs/client)
 -- richard


From illumos at cucumber.demon.co.uk  Fri May 15 22:18:29 2015
From: illumos at cucumber.demon.co.uk (Andrew Gabriel)
Date: Fri, 15 May 2015 23:18:29 +0100
Subject: [OmniOS-discuss] nfs client in a zone won't start
In-Reply-To: <DA3A3733-84DA-4BE8-83E4-7BF662FF0FD7@richardelling.com>
References: <etPan.55565022.6cca75d8.12dc@pentos.kyriou.net>	<F3803A8D-F9FB-4B08-B24E-44548FD309F8@cos.ru>
	<DA3A3733-84DA-4BE8-83E4-7BF662FF0FD7@richardelling.com>
Message-ID: <555670B5.8000509@cucumber.demon.co.uk>

Richard Elling wrote:
>> On May 15, 2015, at 2:25 PM, Jim Klimov <jimklimov at cos.ru> wrote:
>>
>>     
>> The state offline* (with asterisk) means transition from offline (is in process of onlining). You might want to look into /var/svc/log/*nfs-client*log for possible more details, and/or to manually rerun (or instrument with 'sh -x' and the likes) the scripts and bits of the service to trace into the problem.
>>     
>
> pro tip:
> cat $(svcs -L nfs/client)
>   
or a "tail -f" running whilst you try starting it from another terminal 
window.

svcs -p nfs/client
can also be useful when it's stuck in a startup script, to see what 
processes it currently has running.

-- 
Andrew


From mcgee at sci-world.net  Fri May 15 22:34:29 2015
From: mcgee at sci-world.net (Matthew McGee)
Date: Fri, 15 May 2015 18:34:29 -0400
Subject: [OmniOS-discuss] CIFS Issues
Message-ID: <CAPa50n0YDcAzg-6LFjRDQ8famqPXznijaRnKqmsHg0njYCYSsw@mail.gmail.com>

I didn't see this message until it came through on the digest.
I have a working system now, albeit it's a kludge.

The person who suggested using a DNS alias gets a beer.
I took this idea and did further troubleshooting and found that if the
hostname is in AD, I get the error message. If I remove it from AD
and reboot the client, it works.

There is no discernible difference between using Napp-it and not.
I get the same result either way.

I also find it curious that all my shares are now forcibly in lower case.
My Documents share comes in as documents. No big deal, but strange.

Thank you for the suggestions and I am all ears if you have anything
further.

Message: 4
Date: Thu, 14 May 2015 13:15:56 +0200
From: Günther Alka <alka at hfg-gmuend.de>
To: omnios-discuss <omnios-discuss at lists.omniti.com>
Subject: Re: [OmniOS-discuss] CIFS Issues
Message-ID: <84BD5B5F-1490-40AB-B176-4991062BE510 at hfg-gmuend.de>
Content-Type: text/plain; charset=us-ascii

Matthew

As you use napp-it, and as I have many OmniOS SMB filers in an AD
environment without such problems,
can you compare what happens when you use napp-it to join the domain instead
of doing it manually?

(menu Services >> SMB >> Active Directory)

Gea


>
>
>> On May 13, 2015, at 6:45 PM, Matthew McGee <mcgee at sci-world.net> wrote:
>>
>> Interesting. Using the trailing "." for an absolute FQDN works.
>> Any hints on how to make it work without the full FQDN?
>> I assume it's probably a kerberos related issue?

From martin at waldenvik.se  Fri May 15 22:44:06 2015
From: martin at waldenvik.se (martin at waldenvik.se)
Date: Fri, 15 May 2015 22:44:06 +0000
Subject: [OmniOS-discuss] nfs client in a zone won't start
In-Reply-To: <555670B5.8000509@cucumber.demon.co.uk>
References: <etPan.55565022.6cca75d8.12dc@pentos.kyriou.net>
	<F3803A8D-F9FB-4B08-B24E-44548FD309F8@cos.ru>
	<DA3A3733-84DA-4BE8-83E4-7BF662FF0FD7@richardelling.com>
	<555670B5.8000509@cucumber.demon.co.uk>
Message-ID: <etPan.5556777f.22ff5261.12dc@pentos.kyriou.net>

Hi

Thanks for all your tips regarding the nfs client. I still do not know what caused it. Maybe some network configuration mishap. The nfs/client worked in the GZ without problems.

Wish you all a nice weekend
Martin


On 16 May 2015 at 00:17:19, Andrew Gabriel (illumos at cucumber.demon.co.uk) wrote:

Richard Elling wrote:
>> On May 15, 2015, at 2:25 PM, Jim Klimov <jimklimov at cos.ru> wrote:
>>
>>
>> The state offline* (with asterisk) means transition from offline (is in process of onlining). You might want to look into /var/svc/log/*nfs-client*log for possible more details, and/or to manually rerun (or instrument with 'sh -x' and the likes) the scripts and bits of the service to trace into the problem.
>>
>
> pro tip:
> cat $(svcs -L nfs/client)
>
or a "tail -f" running whilst you try starting it from another terminal
window.

svcs -p nfs/client
can also be useful when it's stuck in a startup script, to see what
processes it currently has running.

--
Andrew


From jstockett at molalla.com  Mon May 18 18:25:34 2015
From: jstockett at molalla.com (Jeff Stockett)
Date: Mon, 18 May 2015 18:25:34 +0000
Subject: [OmniOS-discuss] disk failure causing reboot?
Message-ID: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>

A drive failed in one of our supermicro 5048R-E1CR36L servers running omnios r151012 last night, and somewhat unexpectedly, the whole system seems to have panicked.

May 18 04:43:08 zfs01 scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,2f02 at 1/pci15d9,808 at 0 (mpt_sas0):
May 18 04:43:08 zfs01         Log info 0x31140000 received for target 29 w50000c0f01f1bf06.
May 18 04:43:08 zfs01         scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
May 18 04:44:36 zfs01 genunix: [ID 843051 kern.info] NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
May 18 04:44:36 zfs01 unix: [ID 836849 kern.notice]
May 18 04:44:36 zfs01 ^Mpanic[cpu0]/thread=ffffff00f3ecbc40:
May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' appears to be hung.
May 18 04:44:36 zfs01 unix: [ID 100000 kern.notice]
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecba20 zfs:vdev_deadman+10b ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecba70 zfs:vdev_deadman+4a ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbac0 zfs:vdev_deadman+4a ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbaf0 zfs:spa_deadman+ad ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbb90 genunix:cyclic_softint+fd ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbba0 unix:cbe_low_level+14 ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbbf0 unix:av_dispatch_softvect+78 ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbc20 apix:apix_dispatch_softint+35 ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05990 unix:switch_sp_and_call+13 ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e059e0 apix:apix_do_softint+6c ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05a40 apix:apix_do_interrupt+34a ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05a50 unix:cmnint+ba ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05bc0 unix:acpi_cpu_cstate+11b ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05bf0 unix:cpu_acpi_idle+8d ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05c00 unix:cpu_idle_adaptive+13 ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05c20 unix:idle+a7 ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05c30 unix:thread_start+8 ()
May 18 04:44:36 zfs01 unix: [ID 100000 kern.notice]
May 18 04:44:36 zfs01 genunix: [ID 672855 kern.notice] syncing file systems...
May 18 04:44:38 zfs01 genunix: [ID 904073 kern.notice]  done
May 18 04:44:39 zfs01 genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
May 18 04:44:39 zfs01 ahci: [ID 405573 kern.info] NOTICE: ahci0: ahci_tran_reset_dport port 1 reset port
May 18 05:17:56 zfs01 genunix: [ID 100000 kern.notice]
May 18 05:17:56 zfs01 genunix: [ID 665016 kern.notice] ^M100% done: 8607621 pages dumped,
May 18 05:17:56 zfs01 genunix: [ID 851671 kern.notice] dump succeeded

The disks are all 4TB WD40001FYYG enterprise SAS drives.  Googling seems to indicate it is a known problem with the way the various subsystems sometimes interact. Is there any way to fix/workaround this issue?
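
For anyone comparing against a similar panic, here is a minimal sketch for pulling the hung pool name and the stack frames out of saved syslog text. It assumes the message formats shown in the log above; the helper names `hung_pool` and `panic_frames` are made up for illustration, and /var/adm/messages is only the usual default log location.

```shell
#!/bin/sh
# Sketch only: both helpers read syslog text on stdin and assume the
# line formats shown in the log excerpt above.

# Print the pool name from the deadman "appears to be hung" message.
hung_pool() {
    sed -n "s|.*I/O to pool '\([^']*\)' appears to be hung.*|\1|p"
}

# Print the module:function+offset column of each panic stack frame.
# Stack frame lines are recognised by their trailing "()" field.
panic_frames() {
    awk '$NF == "()" { print $(NF-1) }'
}

# Typical use (hypothetical invocation, adjust the path as needed):
#   hung_pool    < /var/adm/messages
#   panic_frames < /var/adm/messages
```

Seeing zfs:vdev_deadman and zfs:spa_deadman at the top of the frame list is what points at the ZFS deadman timer rather than at the failmode property.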

From danmcd at omniti.com  Mon May 18 18:33:17 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 18 May 2015 14:33:17 -0400
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
Message-ID: <B1B135ED-D5FF-4F78-81D4-2EEB8E5DFD81@omniti.com>


> On May 18, 2015, at 2:25 PM, Jeff Stockett <jstockett at molalla.com> wrote:
> 
> A drive failed in one of our supermicro 5048R-E1CR36L servers running omnios r151012 last night, and somewhat unexpectedly, the whole system seems to have panicked.

The panic was done for protection of your pool:

> May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' appears to be hung.

<SNIP!>

>  
> The disks are all 4TB WD40001FYYG enterprise SAS drives.  Googling seems to indicate it is a known problem with the way the various subsystems sometimes interact. Is there any way to fix/workaround this issue?

Pull the drive.  I'm assuming you have a raidz or mirrored setup where you can do that, right?  Or is it a question of finding *which* drive failed?

Dan



From illumos at cucumber.demon.co.uk  Mon May 18 18:59:16 2015
From: illumos at cucumber.demon.co.uk (Andrew Gabriel)
Date: Mon, 18 May 2015 19:59:16 +0100
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <B1B135ED-D5FF-4F78-81D4-2EEB8E5DFD81@omniti.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
	<B1B135ED-D5FF-4F78-81D4-2EEB8E5DFD81@omniti.com>
Message-ID: <555A3684.2020409@cucumber.demon.co.uk>

Dan McDonald wrote:
>> On May 18, 2015, at 2:25 PM, Jeff Stockett <jstockett at molalla.com> wrote:
>>
>> A drive failed in one of our supermicro 5048R-E1CR36L servers running omnios r151012 last night, and somewhat unexpectedly, the whole system seems to have panicked.
>>     
>
> The panic was done for protection of your pool:
>
>   
>> May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' appears to be hung.
>>     
>
> <SNIP!>
>
>   
>>  
>> The disks are all 4TB WD40001FYYG enterprise SAS drives.  Googling seems to indicate it is a known problem with the way the various subsystems sometimes interact. Is there any way to fix/workaround this issue?
>>     
>
> Pull the drive.  I'm assuming you have a raidz or mirrored setup where you can do that, right?  Or is it a question of finding *which* drive failed?
>   

Must admit I haven't played with this since the protection against no TX 
commits completing for a while went in, but I would have expected FMA 
would have faulted out the disk to prevent hanging the pool, unless 
there was no redundancy for the top level vdev it's in?

Would be interesting to know what the pool layout and state was.

-- 
Andrew

From jstockett at molalla.com  Mon May 18 19:01:46 2015
From: jstockett at molalla.com (Jeff Stockett)
Date: Mon, 18 May 2015 19:01:46 +0000
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <B1B135ED-D5FF-4F78-81D4-2EEB8E5DFD81@omniti.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
	<B1B135ED-D5FF-4F78-81D4-2EEB8E5DFD81@omniti.com>
Message-ID: <136C13E89D22BB468B2A7025993639732F527113@EXMCCMB.molalla.com>

Hi Dan,

The pool is made up of 36 disks - 6 x 6 raidz2 vdevs with some SSDs for l2arc and slog.  I already replaced the drive and the rebuild is nearly done, but I was mostly curious why a disk failure would cause a reboot.  I get that it was apparently hanging the pool up, and that according to some posts I read the developers seem to think it is better to panic/dump/reboot than leave it hung until someone notices, but wouldn't it really be better just to drop the failed drive out of the array? Is it because the system in question is using a SAS expander, or is this only expected behavior sometimes, depending on how the drive fails?  I guess I might expect this with consumer grade SATA drives, but wasn't expecting it with $$$ enterprise SAS drives.

Thanks,  Jeff

-----Original Message-----
From: Dan McDonald [mailto:danmcd at omniti.com] 
Sent: Monday, May 18, 2015 11:33 AM
To: Jeff Stockett
Cc: omnios-discuss
Subject: Re: [OmniOS-discuss] disk failure causing reboot?


> On May 18, 2015, at 2:25 PM, Jeff Stockett <jstockett at molalla.com> wrote:
> 
> A drive failed in one of our supermicro 5048R-E1CR36L servers running omnios r151012 last night, and somewhat unexpectedly, the whole system seems to have panicked.

The panic was done for protection of your pool:

> May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' appears to be hung.

<SNIP!>

>  
> The disks are all 4TB WD40001FYYG enterprise SAS drives.  Googling seems to indicate it is a known problem with the way the various subsystems sometimes interact. Is there any way to fix/workaround this issue?

Pull the drive.  I'm assuming you have a raidz or mirrored setup where you can do that, right?  Or is it a question of finding *which* drive failed?

Dan



From danmcd at omniti.com  Mon May 18 19:09:17 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 18 May 2015 15:09:17 -0400
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <136C13E89D22BB468B2A7025993639732F527113@EXMCCMB.molalla.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
	<B1B135ED-D5FF-4F78-81D4-2EEB8E5DFD81@omniti.com>
	<136C13E89D22BB468B2A7025993639732F527113@EXMCCMB.molalla.com>
Message-ID: <9964F883-77F7-4159-B704-5DB7CC57A1E6@omniti.com>


> On May 18, 2015, at 3:01 PM, Jeff Stockett <jstockett at molalla.com> wrote:
> 
> Hi Dan,
> 
> The pool is made up of 36 disks - 6 x 6 raidz2 vdevs with some SSDs for l2arc and slog.  I already replaced the drive and the rebuild is nearly done, but I was mostly curious why a disk failure would cause a reboot?  I get that it was apparently hanging the pool up, and that according to some posts I read the developers seem to think it is better the panic/dump/reboot than leave it hung until someone notices, but wouldn't it really be better just to drop the failed drive out of the array? Is it because the system in question is using a SAS expander or is this only expected behavior sometimes depending on how the drive fails?  I guess I might expect this with consumer grade SATA drives, but wasn't expecting it with $$$ enterprise SAS drives.

$$$ SAS drives *should* tickle FMA as Andrew G. was saying.  I've heard expanders can complicate things, but I'm not enough of a storage guru to address that directly (I will say that SATA drives + expanders == disaster but you know that already).

There are more storage-informed people on this list, and they may have more insight than I.

Thanks,
Dan


From illumos at cucumber.demon.co.uk  Mon May 18 19:37:22 2015
From: illumos at cucumber.demon.co.uk (Andrew Gabriel)
Date: Mon, 18 May 2015 20:37:22 +0100
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <9964F883-77F7-4159-B704-5DB7CC57A1E6@omniti.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>	<B1B135ED-D5FF-4F78-81D4-2EEB8E5DFD81@omniti.com>	<136C13E89D22BB468B2A7025993639732F527113@EXMCCMB.molalla.com>
	<9964F883-77F7-4159-B704-5DB7CC57A1E6@omniti.com>
Message-ID: <555A3F72.2080302@cucumber.demon.co.uk>

Dan McDonald wrote:
>> On May 18, 2015, at 3:01 PM, Jeff Stockett <jstockett at molalla.com> wrote:
>>
>> Hi Dan,
>>
>> The pool is made up of 36 disks - 6 x 6 raidz2 vdevs with some SSDs for l2arc and slog.  I already replaced the drive and the rebuild is nearly done, but I was mostly curious why a disk failure would cause a reboot?  I get that it was apparently hanging the pool up, and that according to some posts I read the developers seem to think it is better the panic/dump/reboot than leave it hung until someone notices, but wouldn't it really be better just to drop the failed drive out of the array? Is it because the system in question is using a SAS expander or is this only expected behavior sometimes depending on how the drive fails?  I guess I might expect this with consumer grade SATA drives, but wasn't expecting it with $$$ enterprise SAS drives.
>>     
>
> $$$ SAS drives *should* tickle FMA as Andrew G. was saying.  I've heard expanders can complicate things, but I'm not enough of a storage guru to address that directly (I will say that SATA drives + expanders == disaster but you know that already).
>
> There are more storage-informed people on this list, and they may have more insight than I.
>   

Might be worth looking at fmdump output, to see what FMA made of the 
disk error at 04:43:08.

-- 
Andrew

From henson at acm.org  Mon May 18 20:08:48 2015
From: henson at acm.org (Paul B. Henson)
Date: Mon, 18 May 2015 13:08:48 -0700
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
Message-ID: <20150518200848.GH3720@bender.unx.cpp.edu>

On Mon, May 18, 2015 at 06:25:34PM +0000, Jeff Stockett wrote:
> A drive failed in one of our supermicro 5048R-E1CR36L servers running
> omnios r151012 last night, and somewhat unexpectedly, the whole system
> seems to have panicked.

You don't happen to have failmode set to panic on the pool?

From the zpool manpage:

       failmode=wait | continue | panic
           Controls the system behavior in the event of catastrophic pool
           failure. This condition is typically a result of a loss of
           connectivity to the underlying storage device(s) or a failure of
           all devices within the pool. The behavior of such an event is
           determined as follows:

           wait
                       Blocks all I/O access until the device connectivity is
                       recovered and the errors are cleared. This is the
                       default behavior.

           continue
                       Returns EIO to any new write I/O requests but allows
                       reads to any of the remaining healthy devices. Any
                       write requests that have yet to be committed to disk
                       would be blocked.

           panic
                       Prints out a message to the console and generates a
                       system crash dump.
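
To check what a pool is actually set to, `zpool get failmode <pool>` prints a NAME/PROPERTY/VALUE/SOURCE table. A minimal sketch that pulls just the value out of that table follows; the helper name `failmode_value` is hypothetical, and the pool name in the usage comment is simply the one from this thread.

```shell
#!/bin/sh
# Sketch only: reads `zpool get failmode <pool>` output on stdin and
# prints the VALUE column (wait, continue or panic).
failmode_value() {
    awk '$2 == "failmode" { print $3 }'
}

# Typical use on the affected host:
#   zpool get failmode dpool | failmode_value
```

Matching on the PROPERTY column rather than on a line number skips the header row without assuming how many lines the command prints.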


From chip at innovates.com  Mon May 18 20:30:34 2015
From: chip at innovates.com (Schweiss, Chip)
Date: Mon, 18 May 2015 15:30:34 -0500
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <20150518200848.GH3720@bender.unx.cpp.edu>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
	<20150518200848.GH3720@bender.unx.cpp.edu>
Message-ID: <CALeZrrQSa5Lx6Ld8ORT9OS2_=JEZ=FJSA4dgHfX_OkAyGogV8g@mail.gmail.com>

I had the exact same failure mode last week.  With over 1000 spindles I see
this about once a month.

I can publish my dump also if anyone actually wants to try to fix this
problem, but I think there are several of the same thing already linked to
tickets in Illumos-gate.

Pools for the most part should be set to failmode=panic or wait, but a
failed disk should not cause a panic.   On the system where this happened
to me, failmode was set to wait.  It is also on r151012, waiting on a window
to upgrade to r151014.  My pool is raidz3, so no reason not to kick a bad disk.

All my disks are SAS in DataON JBODs, dual connected across two LSI
HBAs.    BTW, pull a SAS cable and you get a panic too, not degraded
multipath.    Illumos seems to panic on just about any SAS event these days
regardless of redundancy.

-Chip

On Mon, May 18, 2015 at 3:08 PM, Paul B. Henson <henson at acm.org> wrote:

> On Mon, May 18, 2015 at 06:25:34PM +0000, Jeff Stockett wrote:
> > A drive failed in one of our supermicro 5048R-E1CR36L servers running
> > omnios r151012 last night, and somewhat unexpectedly, the whole system
> > seems to have panicked.
>
> You don't happen to have failmode set to panic on the pool?
>
> From the zpool manpage:
>
>        failmode=wait | continue | panic
>            Controls the system behavior in the event of catastrophic pool
>            failure. This condition is typically a result of a loss of
>            connectivity to the underlying storage device(s) or a failure of
>            all devices within the pool. The behavior of such an event is
>            determined as follows:
>
>            wait
>                        Blocks all I/O access until the device connectivity is
>                        recovered and the errors are cleared. This is the
>                        default behavior.
>
>            continue
>                        Returns EIO to any new write I/O requests but allows
>                        reads to any of the remaining healthy devices. Any
>                        write requests that have yet to be committed to disk
>                        would be blocked.
>
>            panic
>                        Prints out a message to the console and generates a
>                        system crash dump.
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>

From jstockett at molalla.com  Mon May 18 20:33:33 2015
From: jstockett at molalla.com (Jeff Stockett)
Date: Mon, 18 May 2015 20:33:33 +0000
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <20150518200848.GH3720@bender.unx.cpp.edu>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
	<20150518200848.GH3720@bender.unx.cpp.edu>
Message-ID: <136C13E89D22BB468B2A7025993639732F52738E@EXMCCMB.molalla.com>

The pool is set to fail mode wait.  

In looking at the fmdump -e and fmdump -eV output, it looks like the drive started having media/disk/transport errors around 3:40am, which eventually culminated in the reboot around 6:18am.  The funny thing is that driver-assessment = fatal was returned 42 times on the same device in that period, so I'm not quite sure why it didn't just drop the drive, because the documentation says:

Note: An ereport with the value driver-assessment = fatal results in the fault being propagated.

It appears it didn't drop the drive until after it rebooted.  I can upload the crash dump and/or fmdump output if anyone is interested.
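
The count of fatal assessments mentioned above can be reproduced from the error log with something like the following. This is a rough sketch that assumes the `driver-assessment = fatal` spelling quoted above appears once per ereport in `fmdump -eV` output; `count_fatal` is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: counts lines whose driver-assessment is "fatal".
# Reads `fmdump -eV` text on stdin and prints the number of matches.
count_fatal() {
    grep -c 'driver-assessment = fatal'
}

# Typical use:
#   fmdump -eV | count_fatal
```

This counts across all devices in the log; narrowing it to one disk would need an extra filter on whatever device identifier the ereports carry.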

Thanks,  Jeff

-----Original Message-----
From: Paul Henson [mailto:paul.b.henson at gmail.com] On Behalf Of Paul B. Henson
Sent: Monday, May 18, 2015 1:09 PM
To: Jeff Stockett
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] disk failure causing reboot?

On Mon, May 18, 2015 at 06:25:34PM +0000, Jeff Stockett wrote:
> A drive failed in one of our supermicro 5048R-E1CR36L servers running 
> omnios r151012 last night, and somewhat unexpectedly, the whole system 
> seems to have panicked.

You don't happen to have failmode set to panic on the pool?

From the zpool manpage:

       failmode=wait | continue | panic
           Controls the system behavior in the event of catastrophic pool
           failure. This condition is typically a result of a loss of
           connectivity to the underlying storage device(s) or a failure of
           all devices within the pool. The behavior of such an event is
           determined as follows:

           wait
                       Blocks all I/O access until the device connectivity is
                       recovered and the errors are cleared. This is the
                       default behavior.

           continue
                       Returns EIO to any new write I/O requests but allows
                       reads to any of the remaining healthy devices. Any
                       write requests that have yet to be committed to disk
                       would be blocked.

           panic
                       Prints out a message to the console and generates a
                       system crash dump.


From danmcd at omniti.com  Mon May 18 20:38:56 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 18 May 2015 16:38:56 -0400
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
Message-ID: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>

Now this isn't a gcc update for illumos/illumos-omnios... that way is full of pain, and I'll wait for now.

OTOH, we've transitioned gcc before going into r151008 with 4.8.1.

My question to you all is this:  To which gcc version should we jump?  I see two viable candidates:

	- gcc 4.9.2 (last updated October 2014)

or

	- gcc 5.1 (last updated April 2015)

The current gcc "development" is happening on 6.0, and we're not ready for that.

I appreciate feedback.  I'll be making a decision soon, as I hope to land a compiler upgrade as the major push for this bloody cycle and r151016.

Thanks,
Dan


From vab at bb-c.de  Mon May 18 20:54:43 2015
From: vab at bb-c.de (Volker A. Brandt)
Date: Mon, 18 May 2015 22:54:43 +0200
Subject: [OmniOS-discuss] Can't update bloody
In-Reply-To: <5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com>
References: <DUB405-EAS281DE3E6934796A0231BEC1E2D10@phx.gbl>
	<21832.58196.941714.304987@glaurung.bb-c.de>
	<5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com>
Message-ID: <21850.20883.162713.422803@glaurung.bb-c.de>

Hi Dan!


> I'll be updating the whole wad of bloody later this week.  Can y'all
> wait a couple of days?  I want to include some illumos updates that
> I'm about to push this afternoon.

So after that push, everything was fine with the bloody repo.
However, it seems that the original problem Chavdar and myself
were seeing has reappeared today:

  # /usr/bin/pkgrecv -s http://pkg.omniti.com/omnios/bloody/ -d /pkg/omnios-151015 '*'
  Processing packages for publisher omnios ...
  Retrieving and evaluating 3057 package(s)...
  Download Manifests (1634/3057) |pkgrecv: http protocol error: code: 404 reason: Not Found
  URL: 'http://pkg.omniti.com/omnios/bloody/omnios/manifest/0/package%2Fpkg at 0.5.11%2C5.11-0.151015%3A20150422T144502Z' (happened 4 times)

Is it just me?


Regards -- Volker
-- 
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From vab at bb-c.de  Mon May 18 20:56:44 2015
From: vab at bb-c.de (Volker A. Brandt)
Date: Mon, 18 May 2015 22:56:44 +0200
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
In-Reply-To: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
Message-ID: <21850.21004.799624.152739@glaurung.bb-c.de>

> My question to you all is this: To which gcc version should we jump?
> I see two viable candidates:
> 
> 	- gcc 4.9.2 (last updated October 2014)
> 
> or
> 
> 	- gcc 5.1 (last updated April 2015)

Naïvely, shouldn't the newer be better?  Less work during the next
version jump...


Regards -- Volker
-- 
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From eric.sproul at circonus.com  Mon May 18 21:00:05 2015
From: eric.sproul at circonus.com (Eric Sproul)
Date: Mon, 18 May 2015 17:00:05 -0400
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
In-Reply-To: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
Message-ID: <CAO8hXRBQXvuafstcEu6Wpqu9sk1kukTY7yt9DsYwC+9RwoK8hw@mail.gmail.com>

On Mon, May 18, 2015 at 4:38 PM, Dan McDonald <danmcd at omniti.com> wrote:
> My question to you all is this:  To which gcc version should we jump?  I see two viable candidates:
>
>         - gcc 4.9.2 (last updated October 2014)
>
> or
>
>         - gcc 5.1 (last updated April 2015)
>
> The current gcc "development" is happening on 6.0, and we're not ready for that.

It should be noted that due to a version scheme change
(https://gcc.gnu.org/develop.html#timeline) 5.1 is what would have
been 4.10.  It's the first stable release in a new major series (5),
with development moving to 6, as Dan pointed out.

Eric

From danmcd at omniti.com  Mon May 18 21:03:55 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 18 May 2015 17:03:55 -0400
Subject: [OmniOS-discuss] Can't update bloody
In-Reply-To: <21850.20883.162713.422803@glaurung.bb-c.de>
References: <DUB405-EAS281DE3E6934796A0231BEC1E2D10@phx.gbl>
	<21832.58196.941714.304987@glaurung.bb-c.de>
	<5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com>
	<21850.20883.162713.422803@glaurung.bb-c.de>
Message-ID: <389521B5-A1BD-4FFB-A457-CD31F421625E@omniti.com>

I am indeed seeing this problem.

I'm not sure how to fix it (or how it got that way in the first place).

The only other thing I can recommend is the "-m latest" option to pkgrecv, new in '014 and later, which only gets you the LATEST version(s) of the packages:

nowhere(~/junk)[0]% /usr/bin/pkgrecv -s http://pkg.omniti.com/omnios/bloody/ -d 015.repo '*'
Processing packages for publisher omnios ...
Retrieving and evaluating 3057 package(s)...
Download Manifests (1634/3057) /pkgrecv: http protocol error: code: 404 reason: Not Found
URL: 'http://pkg.omniti.com/omnios/bloody/omnios/manifest/0/package%2Fpkg at 0.5.11%2C5.11-0.151015%3A20150422T144502Z' (happened 4 times)

nowhere(~/junk)[1]% ls
015.repo/
nowhere(~/junk)[0]% /bin/rm -rf 015.repo/
nowhere(~/junk)[0]% pkgrepo create 015.repo
nowhere(~/junk)[0]% /usr/bin/pkgrecv -m latest -s http://pkg.omniti.com/omnios/bloody/ -d 015.repo '*'
Processing packages for publisher omnios ...
Retrieving and evaluating 1018 package(s)...
PROCESS                                         ITEMS    GET (MB)   SEND (MB)
SUNWcs                                        66/1018     10/1010     15/2945.....(IN PROGRESS)


Hope this helps,
Dan


From eric.sproul at circonus.com  Mon May 18 21:04:22 2015
From: eric.sproul at circonus.com (Eric Sproul)
Date: Mon, 18 May 2015 17:04:22 -0400
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
In-Reply-To: <21850.21004.799624.152739@glaurung.bb-c.de>
References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
	<21850.21004.799624.152739@glaurung.bb-c.de>
Message-ID: <CAO8hXRBrtQfNJx1NVohxv-ELmL=uHb=3gowmZcqXbSt4k5kx3g@mail.gmail.com>

On Mon, May 18, 2015 at 4:56 PM, Volker A. Brandt <vab at bb-c.de> wrote:
> Naïvely, shouldn't the newer be better?  Less work during the next
> version jump...

The very first thing on the 5-series changes list
(https://gcc.gnu.org/gcc-5/changes.html):

 * The default mode for C is now -std=gnu11 instead of -std=gnu89.

I have no problem with that, but it *may* cause us some heartburn.
Generally speaking though, I would vote for 5.1, mainly because of
support for new/upcoming CPU instructions and optimization
improvements.
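
One way to see which C dialect a given gcc defaults to is to dump its predefined macros and look at `__STDC_VERSION__`, which is left undefined in gnu89 mode. A small sketch follows; `default_c_std` is a hypothetical helper, and builds that depend on the old behaviour could pass `-std=gnu89` explicitly.

```shell
#!/bin/sh
# Sketch only: reads `gcc -dM -E` output on stdin and prints the value
# of __STDC_VERSION__, or "undefined" when the macro is absent
# (as is the case in gnu89 mode).
default_c_std() {
    awk '$2 == "__STDC_VERSION__" { v = $3 }
         END { print (v == "" ? "undefined" : v) }'
}

# Typical use:
#   gcc -dM -E -x c /dev/null | default_c_std
#   gcc -std=gnu89 -dM -E -x c /dev/null | default_c_std
```

Running both forms against the old and new compilers shows quickly whether a build is relying on the compiler default rather than on an explicit -std flag.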

Eric

From hasslerd at gmx.li  Mon May 18 21:09:01 2015
From: hasslerd at gmx.li (Dominik Hassler)
Date: Mon, 18 May 2015 23:09:01 +0200
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <136C13E89D22BB468B2A7025993639732F527113@EXMCCMB.molalla.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>	<B1B135ED-D5FF-4F78-81D4-2EEB8E5DFD81@omniti.com>
	<136C13E89D22BB468B2A7025993639732F527113@EXMCCMB.molalla.com>
Message-ID: <555A54ED.9080609@gmx.li>

Jeff,

I have those WD40001FYYG drives in my home server, but just as a simple 
mirror. AFAIK those drives are equivalent to the SATA WD Re 4TB drives, 
just w/ a SAS controller instead of a SATA controller on top, and just a 
little more expensive than their SATA equivalents...

I have no real facts, but I assume that these SAS drives (they call them 
"nearline SAS") are not 100% like "real" SAS drives... E.g. they don't 
run automated background scans; that's what I observed. To what extent 
they differ from "real" SAS drives, I don't know.

On 05/18/2015 09:01 PM, Jeff Stockett wrote:
> Hi Dan,
>
> The pool is made up of 36 disks - 6 x 6 raidz2 vdevs with some SSDs for l2arc and slog.  I already replaced the drive and the rebuild is nearly done, but I was mostly curious why a disk failure would cause a reboot?  I get that it was apparently hanging the pool up, and that according to some posts I read the developers seem to think it is better the panic/dump/reboot than leave it hung until someone notices, but wouldn't it really be better just to drop the failed drive out of the array? Is it because the system in question is using a SAS expander or is this only expected behavior sometimes depending on how the drive fails?  I guess I might expect this with consumer grade SATA drives, but wasn't expecting it with $$$ enterprise SAS drives.
>
> Thanks,  Jeff
>
> -----Original Message-----
> From: Dan McDonald [mailto:danmcd at omniti.com]
> Sent: Monday, May 18, 2015 11:33 AM
> To: Jeff Stockett
> Cc: omnios-discuss
> Subject: Re: [OmniOS-discuss] disk failure causing reboot?
>
>
>> On May 18, 2015, at 2:25 PM, Jeff Stockett <jstockett at molalla.com> wrote:
>>
>> A drive failed in one of our supermicro 5048R-E1CR36L servers running omnios r151012 last night, and somewhat unexpectedly, the whole system seems to have panicked.
>
> The panic was done for protection of your pool:
>
>> May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' appears to be hung.
>
> <SNIP!>
>
>>
>> The disks are all 4TB WD40001FYYG enterprise SAS drives.  Googling seems to indicate it is a known problem with the way the various subsystems sometimes interact. Is there any way to fix/workaround this issue?
>
> Pull the drive.  I'm assuming you have a raidz or mirrored setup where you can do that, right?  Or is it a question of finding *which* drive failed?
>
> Dan

From dain.bentley at gmail.com  Mon May 18 21:18:15 2015
From: dain.bentley at gmail.com (Dain Bentley)
Date: Mon, 18 May 2015 17:18:15 -0400
Subject: [OmniOS-discuss] Poor performance on writes Zraid
Message-ID: <CALthgees6vF-sa1TSQVVK1i0uV9DmpcwraZpmu71oBfryCofrg@mail.gmail.com>

Hello all,  I have a RaidZ setup with 5 disks and read performance is good.
I have no ZIL device and 8 GB of ECC RAM.  Writes are like 2 MB a second with
a 1Gb network.  I'm pulling faster writes on a similar drive in a windows
VM over CIFS on VMware.  My OmniOS box is bare metal.  Any tips on speeding
this up?

From tim at multitalents.net  Mon May 18 22:31:53 2015
From: tim at multitalents.net (Tim Rice)
Date: Mon, 18 May 2015 15:31:53 -0700 (PDT)
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
In-Reply-To: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
Message-ID: <alpine.UW2.2.11.1505181528490.29002@server01.int.multitalents.net>

On Mon, 18 May 2015, Dan McDonald wrote:

> Now this isn't a gcc update for illumos/illumos-omnios... that way is full of pain, and I'll wait for now.
> 
> OTOH, we've transitioned gcc before going into r151008 with 4.8.1.
> 
> My question to you all is this:  To which gcc version should we jump?  I see two viable candidates:

Any reason not to consider CLANG instead of GCC?

> 
> 	- gcc 4.9.2 (last updated October 2014)
> or
> 	- gcc 5.1 (last updated April 2015)
> 
> The current gcc "development" is happening on 6.0, and we're not ready for that.
> 
> I appreciate feedback.  I'll be making a decision soon, as I hope to land a compiler upgrade as the major push for this bloody cycle and r151016.
> 
> Thanks,
> Dan
> 

-- 
Tim Rice				Multitalents
tim at multitalents.net



From danmcd at omniti.com  Mon May 18 22:46:24 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 18 May 2015 18:46:24 -0400
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
In-Reply-To: <alpine.UW2.2.11.1505181528490.29002@server01.int.multitalents.net>
References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
	<alpine.UW2.2.11.1505181528490.29002@server01.int.multitalents.net>
Message-ID: <902CDD21-2BED-4ADA-AFF6-84660DCF942D@omniti.com>


> On May 18, 2015, at 6:31 PM, Tim Rice <tim at multitalents.net> wrote:
>> 
>> My question to you all is this:  To which gcc version should we jump?  I see two viable candidates:
> 
> Any reason not to consider CLANG instead of GCC?

It's a completely new beast, with potential to violate least-surprise.  I can imagine CLANG/LLVM showing up *alongside* gcc in some future release, but not as an outright replacement.  Not yet.

And remember -- these are just for building non-illumos stuff.  illumos is still being built with the specially-modified gcc4.4.4.  (Though I wouldn't mind if someone spent time bringing up LLVM/CLANG to build illumos... it would just be really REALLY hard.)

Dan


From jdg117 at elvis.arl.psu.edu  Tue May 19 01:23:04 2015
From: jdg117 at elvis.arl.psu.edu (John D Groenveld)
Date: Mon, 18 May 2015 21:23:04 -0400
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
In-Reply-To: Your message of "Mon, 18 May 2015 15:31:53 PDT."
	<alpine.UW2.2.11.1505181528490.29002@server01.int.multitalents.net> 
References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
	<alpine.UW2.2.11.1505181528490.29002@server01.int.multitalents.net>
Message-ID: <201505190123.t4J1N4qc029517@elvis.arl.psu.edu>

In message <alpine.UW2.2.11.1505181528490.29002 at server01.int.multitalents.net>,
 Tim Rice writes:
>Any reason not to consider CLANG instead of GCC?

Does anyone have a build recipe for LLVM/clang on OmniOS?
I'm about to try that path to build Google's V8.

John
groenveld at acm.org

From danmcd at omniti.com  Tue May 19 01:37:51 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 18 May 2015 21:37:51 -0400
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
In-Reply-To: <201505190123.t4J1N4qc029517@elvis.arl.psu.edu>
References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
	<alpine.UW2.2.11.1505181528490.29002@server01.int.multitalents.net>
	<201505190123.t4J1N4qc029517@elvis.arl.psu.edu>
Message-ID: <614F6199-6E65-492B-B5E7-3CDBB6208512@omniti.com>

ISTR there being an old pull request in omnios-build for it.  I can't just take something like that in, but it may serve your needs.

Dan

Sent from my iPhone (typos, autocorrect, and all)

> On May 18, 2015, at 9:23 PM, John D Groenveld <jdg117 at elvis.arl.psu.edu> wrote:
> 
> In message <alpine.UW2.2.11.1505181528490.29002 at server01.int.multitalents.net>,
> Tim Rice writes:
>> Any reason not to consider CLANG instead of GCC?
> 
> Does anyone have a build recipe for LLVM/clang on OmniOS?
> I'm about to try that path to build Google's V8.
> 
> John
> groenveld at acm.org

From jimklimov at cos.ru  Tue May 19 04:45:43 2015
From: jimklimov at cos.ru (Jim Klimov)
Date: Tue, 19 May 2015 06:45:43 +0200
Subject: [OmniOS-discuss] Poor performance on writes Zraid
In-Reply-To: <CALthgees6vF-sa1TSQVVK1i0uV9DmpcwraZpmu71oBfryCofrg@mail.gmail.com>
References: <CALthgees6vF-sa1TSQVVK1i0uV9DmpcwraZpmu71oBfryCofrg@mail.gmail.com>
Message-ID: <6A0DDFC7-B659-4A5F-B464-607AC6104006@cos.ru>

On 18 May 2015 23:18:15 CEST, Dain Bentley <dain.bentley at gmail.com> wrote:
>Hello all,  I have a RaidZ setup with 5 disks and read performance is
>good.
>I have no ZIL device and 8 GB of ECC RAM.  Writes are like 2 MB a second
>with
>a 1GB network.  I'm pulling faster writes on a similar drive in a
>windows
>VM over CIFS on VMware.  My OmniOS box is bare metal.  Any tips on
>speeding
>this up?

Do you have dedup enabled? (Dedup is pretty slow: each write needs lots of metadata reads, so little RAM and no L2ARC is very bad with it.)

Also, very full pools (vague definition based on history of the writes - generally 80% as a rule of thumb, though pathologies can be after 50% for some and 95%+ for others) - these can have very fragmented and small 'holes' in free space, which impacts write speeds (more random, and it takes more time to find the available location for a block).

You can also look at 'iostat -Xnz 1' output to see the i/o values per active device. You are interested in reads/sec+writes/sec (HDDs can serve about 200 ops/sec total, unless they happen to be small requests to sequentially placed sector numbers - in theory you might be lucky to see even 20000 IOPS in such a favorable case; in practice about 500 is not uncommon, since related block locations in zfs are often coalesced). In iostat you'd also watch %b (busy) and %w (wait, i.e. requests queued) to see if some disks perform very differently from others (e.g. one has internal problems and sector relocations to spare areas, or flaky cabling and many protocol re-requests involved in successful ops). svc_t (service times) and queue lengths can also be useful.
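
As a rough illustration of the per-device comparison described above, here is a minimal Python sketch that reads iostat-style text and flags any disk whose active service time is far above the median of its peers. The sample rows, the device names and the 3x threshold are all made up for the example; the column layout is assumed to match what 'iostat -xn' prints, so check it against your own output first.

```python
# Sketch: flag disks whose service time is far above the others in
# `iostat -xn` style output. SAMPLE is invented data, not from a real host,
# and the 3x-median threshold is an arbitrary assumption.
from statistics import median

SAMPLE = """\
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   12.0  180.0   96.0 1440.0  0.0  1.2    0.0    6.3   0  55 c1t0d0
   11.0  176.0   88.0 1408.0  0.0  1.1    0.0    6.1   0  53 c1t1d0
   10.0   60.0   80.0  480.0  0.0  4.8    0.0   68.5   0  99 c1t2d0
   12.0  182.0   96.0 1456.0  0.0  1.2    0.0    6.4   0  56 c1t3d0
"""

def parse_iostat(text):
    """Return a list of dicts, one per device row, keyed by column name."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    # The header is the first line whose last token is "device".
    header = next(ln for ln in lines if ln[-1] == "device")
    devices = []
    for row in lines:
        if len(row) != len(header) or row == header:
            continue  # skip banners and repeated headers
        rec = {name: float(val) for name, val in zip(header[:-1], row[:-1])}
        rec["device"] = row[-1]
        rec["iops"] = rec["r/s"] + rec["w/s"]  # reads/sec + writes/sec
        devices.append(rec)
    return devices

def slow_disks(devices, factor=3.0):
    """Names of devices whose asvc_t exceeds factor x the median asvc_t."""
    mid = median(d["asvc_t"] for d in devices)
    return [d["device"] for d in devices if d["asvc_t"] > factor * mid]

if __name__ == "__main__":
    devs = parse_iostat(SAMPLE)
    for d in devs:
        print(f'{d["device"]:8s} iops={d["iops"]:6.1f} '
              f'asvc_t={d["asvc_t"]:5.1f} %b={d["%b"]:3.0f}')
    print("suspect:", slow_disks(devs))
```

In the invented sample, c1t2d0 does fewer operations per second than its peers yet is almost always busy with a much longer service time, which is the kind of pattern that would point at one bad disk or cable rather than at the pool as a whole.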

You can get similar info with 'zpool iostat -v 1' as well, though interactions between pool io's and component vdev io's may be tricky to compare between raidz and mirror for example. You might be more interested in averaged differences (maybe across larger time ranges) between these two iostats - e.g. if you have some other io's than those from the pool (say, a raw swap partition).

Finally, consider dtrace-toolkit's and Richard Elling's scripts to sniff what logical (file/vdev) operations you have - and see how these numbers compare to those in pool i/o's at least on the order of magnitude. The difference can be metadata ops, or something else.

Hope this helps get you started,
Jim Klimov
--
Typos courtesy of K-9 Mail on my Samsung Android

From vab at bb-c.de  Tue May 19 06:16:39 2015
From: vab at bb-c.de (Volker A. Brandt)
Date: Tue, 19 May 2015 08:16:39 +0200
Subject: [OmniOS-discuss] Can't update bloody
In-Reply-To: <389521B5-A1BD-4FFB-A457-CD31F421625E@omniti.com>
References: <DUB405-EAS281DE3E6934796A0231BEC1E2D10@phx.gbl>
	<21832.58196.941714.304987@glaurung.bb-c.de>
	<5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com>
	<21850.20883.162713.422803@glaurung.bb-c.de>
	<389521B5-A1BD-4FFB-A457-CD31F421625E@omniti.com>
Message-ID: <21850.54599.267449.146761@glaurung.bb-c.de>

> The only other thing I can recommend is the new for '014 and later
> "-m latest" option to pkgrecv, which only gets you the LATEST
> version(s) of the packages:

Good point.  Works.  Thanks!


Regards -- Volker
-- 
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From rt at steait.net  Tue May 19 09:19:17 2015
From: rt at steait.net (Rune Tipsmark)
Date: Tue, 19 May 2015 09:19:17 +0000
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <CALeZrrQSa5Lx6Ld8ORT9OS2_=JEZ=FJSA4dgHfX_OkAyGogV8g@mail.gmail.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
	<20150518200848.GH3720@bender.unx.cpp.edu>
	<CALeZrrQSa5Lx6Ld8ORT9OS2_=JEZ=FJSA4dgHfX_OkAyGogV8g@mail.gmail.com>
Message-ID: <eef1137717304a1b86f5de854280b11a@EX1301.steait.net>

Same issue here around two months ago when an L2ARC device failed. failmode was default, and the device was actually an mSATA SSD mounted in a PCI-E mSATA card:

http://www.addonics.com/products/ad4mspx2.php  and the disk was one of four of these http://www.samsung.com/us/computer/memory-storage/MZ-MTE1T0BW

Can these reboots be avoided in any way?

Br,
Rune


From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Schweiss, Chip
Sent: Monday, May 18, 2015 10:31 PM
To: Paul B. Henson
Cc: omnios-discuss
Subject: Re: [OmniOS-discuss] disk failure causing reboot?

I had the exact same failure mode last week.  With over 1000 spindles I see this about once a month.

I can publish my dump also if anyone actually wants to try to fix this problem, but I think there are several of the same thing already linked to tickets in illumos-gate.

Pools for the most part should be set to failmode=panic or wait, but a failed disk should not cause a panic.   On the system where this happened to me, failmode was set to wait.  It is also on r151012, waiting on a window to upgrade to r151014.  My pool is raidz3, so there is no reason not to kick a bad disk.

All my disks are SAS in DataON JBODs, dual connected across two LSI HBAs.    BTW, pull a SAS cable and you get a panic too, not degraded multipath.    Illumos seems to panic on just about any SAS event these days regardless of redundancy.

-Chip

On Mon, May 18, 2015 at 3:08 PM, Paul B. Henson <henson at acm.org> wrote:
On Mon, May 18, 2015 at 06:25:34PM +0000, Jeff Stockett wrote:
> A drive failed in one of our supermicro 5048R-E1CR36L servers running
> omnios r151012 last night, and somewhat unexpectedly, the whole system
> seems to have panicked.

You don't happen to have failmode set to panic on the pool?

From the zpool manpage:

       failmode=wait | continue | panic
           Controls the system behavior in the event of catastrophic pool
           failure. This condition is typically a result of a loss of
           connectivity to the underlying storage device(s) or a failure of
           all devices within the pool. The behavior of such an event is
           determined as follows:

           wait
                       Blocks all I/O access until the device connectivity is
                       recovered and the errors are cleared. This is the
                       default behavior.

           continue
                       Returns EIO to any new write I/O requests but allows
                       reads to any of the remaining healthy devices. Any
                       write requests that have yet to be committed to disk
                       would be blocked.

           panic
                       Prints out a message to the console and generates a
                       system crash dump.


From dain.bentley at gmail.com  Tue May 19 11:09:50 2015
From: dain.bentley at gmail.com (Dain Bentley)
Date: Tue, 19 May 2015 07:09:50 -0400
Subject: [OmniOS-discuss] Poor performance on writes Zraid
In-Reply-To: <6A0DDFC7-B659-4A5F-B464-607AC6104006@cos.ru>
References: <CALthgees6vF-sa1TSQVVK1i0uV9DmpcwraZpmu71oBfryCofrg@mail.gmail.com>
	<6A0DDFC7-B659-4A5F-B464-607AC6104006@cos.ru>
Message-ID: <CALthgec5nDcg7X0dzXp29bgYVZ17drfr5ApRDNJ3k0LQod_yZw@mail.gmail.com>

Thanks for the help guys.  Integrated CIFS.  Reads are fast.  The pool is
about 60% full only.

Thanks for the tips!  I'll try iostat to sniff this out


From dain.bentley at gmail.com  Tue May 19 11:10:27 2015
From: dain.bentley at gmail.com (Dain Bentley)
Date: Tue, 19 May 2015 07:10:27 -0400
Subject: [OmniOS-discuss] Poor performance on writes Zraid
In-Reply-To: <6A0DDFC7-B659-4A5F-B464-607AC6104006@cos.ru>
References: <CALthgees6vF-sa1TSQVVK1i0uV9DmpcwraZpmu71oBfryCofrg@mail.gmail.com>
	<6A0DDFC7-B659-4A5F-B464-607AC6104006@cos.ru>
Message-ID: <CALthgeckct1tsJDb0oKtvS-K794hp2vMA5xeqU-BaaJxhwozUA@mail.gmail.com>

And no dedup


From jdg117 at elvis.arl.psu.edu  Tue May 19 16:28:18 2015
From: jdg117 at elvis.arl.psu.edu (John D Groenveld)
Date: Tue, 19 May 2015 12:28:18 -0400
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
In-Reply-To: Your message of "Mon, 18 May 2015 21:23:04 EDT."
	<201505190123.t4J1N4qc029517@elvis.arl.psu.edu> 
References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
	<alpine.UW2.2.11.1505181528490.29002@server01.int.multitalents.net>
	<201505190123.t4J1N4qc029517@elvis.arl.psu.edu> 
Message-ID: <201505191628.t4JGSISi015918@elvis.arl.psu.edu>

In message <201505190123.t4J1N4qc029517 at elvis.arl.psu.edu>, John D Groenveld writes:
>Does anyone have a build recipe for LLVM/clang on OmniOS?

LLVM depends on CMake and Python-2.7.9.
Both build easily with stock gcc-4.8.1.

John
groenveld at acm.org

From mtalbott at lji.org  Tue May 19 17:36:19 2015
From: mtalbott at lji.org (Michael Talbott)
Date: Tue, 19 May 2015 10:36:19 -0700
Subject: [OmniOS-discuss] Samba Performance
Message-ID: <46496B91-013E-4940-BECB-B167D979509E@lji.org>

Hi all. I've been transitioning a file server to OmniOS for many reasons (abandoning zfsonlinux). But I seem to have one last issue I'd like to resolve. It seems to be running into a performance issue with samba. I'm not using the built in zfs smb sharing because I need more flexibility in our environment than it offers such as sharing subfolders, shadow copies, etc.

I'm running the latest LTS version of Omni on a well equipped server with 2x Xeon E5-2630 v2 @ 2.60GHz, 128G ECC ram with dual intel 10g NICs, and two dual LSI 6G SAS cards. The zpool has 8 x raidz2s that have 10 4TB drives in each totaling 290TB. Disk performance is not an issue.

I've compiled samba 4.2.1 and netatalk from source, got winbind working nicely with our AD environment. Samba is even happily kerberized. Everything authenticates and functions correctly. But, while netatalk gives me line speed performance (120-150MB/s on a gigabit workstation), samba won't budge above 40-60MB/s (same speeds using MacOS and Windows 7,8,2012 clients).

Using the same hardware on CentOS 7 with zfsonlinux, samba gives me just about the same throughput as netatalk. In Linux, I could tune it with socket options giving it a bigger buffer and it made a big difference. But using the same options on Omni doesn't seem to have any significant effect (actually seems to slow it down a bit).

In smb.conf on both OSs:
	socket options = TCP_NODELAY IPTOS_LOWDELAY SO_SNDBUF=2097152 SO_RCVBUF=2097152

In CentOS7 in sysctl.conf:
net.ipv4.tcp_rmem = 10000000 10000000 10000000
net.ipv4.tcp_wmem = 10000000 10000000 10000000
net.ipv4.tcp_mem = 10000000 10000000 10000000
net.ipv4.tcp_sack = 0
net.core.rmem_max = 524287
net.core.wmem_max = 524287
net.core.rmem_default = 524287
net.core.wmem_default = 524287
net.core.optmem_max = 524287
net.core.netdev_max_backlog = 300000

And then in Omni, I've set these ip properties
root at store3:# ipadm show-prop
PROTO PROPERTY              PERM CURRENT      PERSISTENT   DEFAULT      POSSIBLE
tcp   max_buf               rw   16777216     16777216     1048576      8192-1073741824
tcp   recv_buf              rw   16777216     16777216     128000       2048-16777216
tcp   send_buf              rw   16777216     16777216     49152        4096-16777216

I just can't get samba on Omni to go any faster than 60MB/s. I've tried adjusting those buffers, removing the socket options in smb.conf altogether, but, to no avail.

Anyone else out there running samba on Omni and getting faster throughput? Anyone have any ideas of how I could get more throughput with samba?

Thanks,

Michael

From danmcd at omniti.com  Tue May 19 17:49:58 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 19 May 2015 13:49:58 -0400
Subject: [OmniOS-discuss] Samba Performance
In-Reply-To: <46496B91-013E-4940-BECB-B167D979509E@lji.org>
References: <46496B91-013E-4940-BECB-B167D979509E@lji.org>
Message-ID: <D899DB49-1706-4A9A-9271-9CEE54FDD98B@omniti.com>


> On May 19, 2015, at 1:36 PM, Michael Talbott <mtalbott at lji.org> wrote:
> 
> And then in Omni, I've set these ip properties
> root at store3:# ipadm show-prop
> PROTO PROPERTY              PERM CURRENT      PERSISTENT   DEFAULT      POSSIBLE
> tcp   max_buf               rw   16777216     16777216     1048576      8192-1073741824
> tcp   recv_buf              rw   16777216     16777216     128000       2048-16777216
> tcp   send_buf              rw   16777216     16777216     49152        4096-16777216
> 
> I just can't get samba on Omni to go any faster than 60MB/s. I've tried adjusting those buffers, removing the socket options in smb.conf altogether, but, to no avail.

I'd lower recv/send from 16MB down to 1MB unless you have a VERY HIGH DELAY network.  You just aren't buying much beyond 1MB.
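
The reasoning behind the 1MB suggestion is the bandwidth-delay product (BDP): a TCP connection only needs about bandwidth x round-trip time of data in flight to keep the link full. Here is a small Python sketch of that arithmetic. The link speeds and the 500 microsecond LAN round-trip time are illustrative assumptions, not measurements from Michael's network.

```python
# Sketch: bandwidth-delay product (BDP), i.e. how many bytes must be in
# flight to keep a link busy. The speeds and RTT below are assumed values
# for a typical LAN, not numbers taken from this thread.

ONE_MB = 1048576  # the 1MB buffer size suggested above

def bdp_bytes(link_bits_per_sec, rtt_microseconds):
    """Bytes in flight needed to fill the pipe: bandwidth * round-trip time."""
    return link_bits_per_sec * rtt_microseconds // 8_000_000

if __name__ == "__main__":
    cases = [
        ("1 Gb/s LAN, 500 us RTT", 1_000_000_000, 500),
        ("10 Gb/s LAN, 500 us RTT", 10_000_000_000, 500),
    ]
    for label, bps, rtt_us in cases:
        need = bdp_bytes(bps, rtt_us)
        verdict = "fits in 1MB" if need <= ONE_MB else "needs more than 1MB"
        print(f"{label:24s} BDP = {need:8d} bytes ({verdict})")
```

Under these assumptions even a 10 Gb/s LAN needs well under 1MB in flight, so 16MB buffers would mostly add memory use per connection. A long-haul link with tens of milliseconds of delay is a different calculation.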

I've heard Samba itself is the source of most of these problems.

As for the built-in smb sharing... there are improvements already starting to be upstreamed in illumos-gate (and are in the OmniOS bloody release), but it may not solve all of your problems that Samba will solve.  I'd suggest asking the illumos list your SMB questions as well -- maybe one of the Nexentians will be able to point the way toward what's coming.

Dan


From doug at will.to  Tue May 19 18:41:23 2015
From: doug at will.to (Doug Hughes)
Date: Tue, 19 May 2015 14:41:23 -0400
Subject: [OmniOS-discuss] Samba Performance
In-Reply-To: <D899DB49-1706-4A9A-9271-9CEE54FDD98B@omniti.com>
References: <46496B91-013E-4940-BECB-B167D979509E@lji.org>
	<D899DB49-1706-4A9A-9271-9CEE54FDD98B@omniti.com>
Message-ID: <CAOpmc6wGSb5iV1sY2D36VZRpvPh=1-ceB5SEGugCDSdFL0AQPA@mail.gmail.com>

The equivalent TCP raw tunings for Solaris based OS's

ndd -set /dev/tcp tcp_xmit_hiwat 1048576
ndd -set /dev/tcp tcp_recv_hiwat 1048576
ndd -set /dev/tcp tcp_max_buf 4194304

Those are the raw tunables and if you run get on those you'll see that they
are different than what's in ipadm. One is just buffers, but these are the
sliding window parameters.

How big is your latency? Agreed that 1MB seems like plenty even for
east/west coast high-bandwidth WAN. You certainly wouldn't need to go above
4MB for that.



From danmcd at omniti.com  Tue May 19 18:58:15 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 19 May 2015 14:58:15 -0400
Subject: [OmniOS-discuss] Samba Performance
In-Reply-To: <CAOpmc6wGSb5iV1sY2D36VZRpvPh=1-ceB5SEGugCDSdFL0AQPA@mail.gmail.com>
References: <46496B91-013E-4940-BECB-B167D979509E@lji.org>
	<D899DB49-1706-4A9A-9271-9CEE54FDD98B@omniti.com>
	<CAOpmc6wGSb5iV1sY2D36VZRpvPh=1-ceB5SEGugCDSdFL0AQPA@mail.gmail.com>
Message-ID: <863E2F65-B57C-46B2-8DC1-FFCBA9F2A723@omniti.com>


> On May 19, 2015, at 2:41 PM, Doug Hughes <doug at will.to> wrote:
> 
> 
> ndd -set /dev/tcp tcp_xmit_hiwat 1048576
> ndd -set /dev/tcp tcp_recv_hiwat 1048576
> ndd -set /dev/tcp tcp_max_buf 4194304

Umm... not as much now. The ipadm(1M) Michael showed is the moral equivalent, and better supported.

Dan


From doug at will.to  Tue May 19 19:11:09 2015
From: doug at will.to (Doug Hughes)
Date: Tue, 19 May 2015 15:11:09 -0400
Subject: [OmniOS-discuss] Samba Performance
In-Reply-To: <863E2F65-B57C-46B2-8DC1-FFCBA9F2A723@omniti.com>
References: <46496B91-013E-4940-BECB-B167D979509E@lji.org>
	<D899DB49-1706-4A9A-9271-9CEE54FDD98B@omniti.com>
	<CAOpmc6wGSb5iV1sY2D36VZRpvPh=1-ceB5SEGugCDSdFL0AQPA@mail.gmail.com>
	<863E2F65-B57C-46B2-8DC1-FFCBA9F2A723@omniti.com>
Message-ID: <CAOpmc6z24mE3_vgwsVcnLXR+8Ed1CiBkXgF6HQJKy+n_-PAy1A@mail.gmail.com>

Oops, when I was comparing numbers I didn't take the left column into
account. I didn't realize it was enumerated by protocol, and being in
'screen', I didn't see it correctly in scrollback.

mea culpa.


On Tue, May 19, 2015 at 2:58 PM, Dan McDonald <danmcd at omniti.com> wrote:

>
> > On May 19, 2015, at 2:41 PM, Doug Hughes <doug at will.to> wrote:
> >
> >
> > ndd -set /dev/tcp tcp_xmit_hiwat 1048576
> > ndd -set /dev/tcp tcp_recv_hiwat 1048576
> > ndd -set /dev/tcp tcp_max_buf 4194304
>
> Umm... not as much now. The ipadm(1M) Michael showed is the moral
> equivalent, and better supported.
>
> Dan
>
>

From danmcd at omniti.com  Tue May 19 21:08:50 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 19 May 2015 17:08:50 -0400
Subject: [OmniOS-discuss] New OmniOS bloody update
Message-ID: <9CA9EC60-D441-47F3-80C6-D24DDD09A563@omniti.com>

Based on omnios-build commit 155193f and illumos-omnios commit c4ba593.

This is a partial update, but includes the entirety of illumos-omnios, so expect a reboot.  Remember, if you're doing full-repo transfers, use the new "-m latest" argument in pkgrecv to prevent pulling old packages over.

Since last time:

- KVM has been updated per the r151014 update --> it's now up to date modulo the removal of VND stuff.  Once VND upstreams, bloody will be the first to see a KVM fully synced with upstream.

- The rest of the changes are in illumos, pulled down from upstream.

- Various bugfixes upstreamed from Delphix in mdb & zfs, and Joyent in other areas.

- Flow control is now in the NFS server, which prevents starvation when the network outperforms the disks.  (This will be backported to r151014.)

- zpool import speedup.  This should improve boot times on systems with many ZFS filesystems.

- More SMB bugfixes upstreamed from Nexenta.

Happy updating!
Dan


From danmcd at omniti.com  Tue May 19 21:10:14 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 19 May 2015 17:10:14 -0400
Subject: [OmniOS-discuss] New OmniOS bloody update
In-Reply-To: <B11C92F3-589F-49E6-8F39-283808E729A9@omniti.com>
References: <B11C92F3-589F-49E6-8F39-283808E729A9@omniti.com>
Message-ID: <4FC32FBB-A59C-4A25-B5BC-F8B6E74AECC4@omniti.com>


> On May 6, 2015, at 10:13 AM, Dan McDonald <danmcd at omniti.com> wrote:
> 
> Based on omnios-build commit 69a5016 and illumos-omnios commit 385735e.

Shoot.  The packages aren't out yet and I hit Send early.

Please wait about 30-60 minutes before upgrading.  Otherwise you'll only see the small changes outside illumos-omnios.

Sorry,
Dan


From tim at multitalents.net  Tue May 19 23:02:00 2015
From: tim at multitalents.net (Tim Rice)
Date: Tue, 19 May 2015 16:02:00 -0700 (PDT)
Subject: [OmniOS-discuss] Updating r151006 to r151014
Message-ID: <alpine.UW2.2.11.1505191538530.1822@server01.int.multitalents.net>


Last weekend I updated my r151006 VMs to r151014. One with a zone.

The notes at http://omnios.omniti.com/wiki.php/Upgrade_to_r151014
were quite good. One piece not mentioned (although obvious when
you think about it) was that for those of us that froze at r151006,
it is necessary to unfreeze to upgrade.

In zone and global zone,
# pkg unfreeze entire@11-0.151006 \
  consolidation/osnet/osnet-incorporation@0.5.11-0.151006 \
  incorporation/jeos/illumos-gate@11-0.151006 \
  incorporation/jeos/omnios-userland@11-0.151006

Since one of the VMs is the storage server on my all-in-one box it
had smartmontools loaded. I had to remove smartmontools for the
update to work and there is no smartmontools for r151014. :-(

I hope these notes save someone some time.

-- 
Tim Rice				Multitalents
tim at multitalents.net



From danmcd at omniti.com  Wed May 20 02:40:20 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 19 May 2015 22:40:20 -0400
Subject: [OmniOS-discuss] Updating r151006 to r151014
In-Reply-To: <alpine.UW2.2.11.1505191538530.1822@server01.int.multitalents.net>
References: <alpine.UW2.2.11.1505191538530.1822@server01.int.multitalents.net>
Message-ID: <38596CDA-5C4C-435D-8AC6-B70883B06F76@omniti.com>


> On May 19, 2015, at 7:02 PM, Tim Rice <tim at multitalents.net> wrote:
> 
> 
> Since one of the VMs is the storage server on my all-in-one box it
> had smartmontools loaded. I had to remove smartmontools for the
> update to work and there is no smartmontools for r151014. :-(

Which publisher provides smartmontools?  If it's ms.omniti.com, it'll be up to our internal staff to update that package.  Remember that ms.omniti.com is not supported; it is a convenience offering of the tools our ops people use internally.

Dan


From tim at multitalents.net  Wed May 20 04:54:02 2015
From: tim at multitalents.net (Tim Rice)
Date: Tue, 19 May 2015 21:54:02 -0700 (PDT)
Subject: [OmniOS-discuss] Updating r151006 to r151014
In-Reply-To: <38596CDA-5C4C-435D-8AC6-B70883B06F76@omniti.com>
References: <alpine.UW2.2.11.1505191538530.1822@server01.int.multitalents.net>
	<38596CDA-5C4C-435D-8AC6-B70883B06F76@omniti.com>
Message-ID: <alpine.UW2.2.11.1505192144330.4829@server01.int.multitalents.net>

On Tue, 19 May 2015, Dan McDonald wrote:

| 
| > On May 19, 2015, at 7:02 PM, Tim Rice <tim at multitalents.net> wrote:
| > 
| > 
| > Since one of the VMs is the storage server on my all-in-one box it
| > had smartmontools loaded. I had to remove smartmontools for the
| > update to work and there is no smartmontools for r151014. :-(
| 
| Which publisher provides smartmontools?  If its ms.omniti.com, it'll be 
| up to our internal staff to update that package.  Remember that 
| ms.omniti.com is not supported, it is a convenience offering of the 
| tools our ops people use internally.

Yes it was ms.omniti.com. I fully understand that they are not supported.
Thanks for providing it for 006. While I would have liked to have had
it available for 014 and save me some time, I can roll my own.

Thank You to all the people at omniti for all that they do provide.

| 
| Dan
| 

-- 
Tim Rice				Multitalents
tim at multitalents.net



From jimklimov at cos.ru  Wed May 20 05:32:12 2015
From: jimklimov at cos.ru (Jim Klimov)
Date: Wed, 20 May 2015 07:32:12 +0200
Subject: [OmniOS-discuss] Updating r151006 to r151014
In-Reply-To: <alpine.UW2.2.11.1505192144330.4829@server01.int.multitalents.net>
References: <alpine.UW2.2.11.1505191538530.1822@server01.int.multitalents.net>
	<38596CDA-5C4C-435D-8AC6-B70883B06F76@omniti.com>
	<alpine.UW2.2.11.1505192144330.4829@server01.int.multitalents.net>
Message-ID: <A46BBECF-1CE1-4DD1-9102-8BD0559143B1@cos.ru>

On 20 May 2015 at 6:54:02 CEST, Tim Rice <tim at multitalents.net> wrote:
>On Tue, 19 May 2015, Dan McDonald wrote:
>
>| 
>| > On May 19, 2015, at 7:02 PM, Tim Rice <tim at multitalents.net> wrote:
>| > 
>| > 
>| > Since one of the VMs is the storage server on my all-in-one box it
>| > had smartmontools loaded. I had to remove smartmontools for the
>| > update to work and there is no smartmontools for r151014. :-(
>| 
>| Which publisher provides smartmontools?  If its ms.omniti.com, it'll be
>| up to our internal staff to update that package.  Remember that 
>| ms.omniti.com is not supported, it is a convenience offering of the 
>| tools our ops people use internally.
>
>Yes it was ms.omniti.com. I fully understand that they are not supported.
>Thanks for providing it for 006. While I would have liked to have had
>it available for 014 and save me some time, I can roll my own.
>
>Thank You to all the people at omniti for all that they do provide.
>
>| 
>| Dan
>| 

You can also give pkgsrc a shot. It is fairly easy to bootstrap and install, though packages are usually updated only quarterly, and an upgrade seems to require changing the repo used by your system, which is tricky to script for hands-off management. For the past two upgrades (disruptive ones, with new packaging features) I was better off removing it all, bootstrapping again, and reinstalling what I remembered needing (e.g. top, or vnc+twm to occasionally head my VirtualBoxes), which got me out of the incompatible package version conflicts I ran into otherwise.

HTH,
Jim
--
Typos courtesy of K-9 Mail on my Samsung Android

From vab at bb-c.de  Wed May 20 07:53:32 2015
From: vab at bb-c.de (Volker A. Brandt)
Date: Wed, 20 May 2015 09:53:32 +0200
Subject: [OmniOS-discuss] Updating r151006 to r151014
In-Reply-To: <alpine.UW2.2.11.1505192144330.4829@server01.int.multitalents.net>
References: <alpine.UW2.2.11.1505191538530.1822@server01.int.multitalents.net>
	<38596CDA-5C4C-435D-8AC6-B70883B06F76@omniti.com>
	<alpine.UW2.2.11.1505192144330.4829@server01.int.multitalents.net>
Message-ID: <21852.15740.375385.805897@glaurung.bb-c.de>

> | Which publisher provides smartmontools?  If its ms.omniti.com,
> | it'll be up to our internal staff to update that package.
> | Remember that ms.omniti.com is not supported, it is a convenience
> | offering of the tools our ops people use internally.
> 
> Yes it was ms.omniti.com. I fully understand that they are not
> supported.  Thanks for providing it for 006. While I would have
> liked to have had it available for 014 and save me some time, I can
> roll my own.

I am using smartmontools from ms.o.c under 151014 with no problems.
The package version is:

pkg://ms.omniti.com/omniti/system/storage/smartmontools@6.0-0.151004:20130113T222019Z

However, I installed it when the box was at 151010.  I since upgraded
to 151012 and then 151014.  Maybe you can reinstall smartmontools using
some IPS trickery?  Have you tried?

Having said that, I do know that some of the pkgs on ms.o.c have 
dependency problems.  Like you said I am thankful that there is a lot
of usable stuff there so I am not complaining. :-)


Regards -- Volker
-- 
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From danmcd at omniti.com  Wed May 20 12:44:57 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 20 May 2015 08:44:57 -0400
Subject: [OmniOS-discuss] Logjam & IKE
Message-ID: <F1D0B61C-D24E-4393-B8D0-B9CBEE689675@omniti.com>

Security researchers published this recently:

	https://weakdh.org/

This note (which should be forwarded to other illumos interest lists) briefly discusses how logjam affects the closed-source in.iked.


IKE can use one of many Diffie-Hellman groups both for establishing IKE's own security, and ALSO optionally for generating IPsec keying material.  The former is specified by the "oakley_group", and the latter by the "p2_pfs" keyword.  Now the ike.config(4) man page was recently updated to reflect the full range of available choices.  I did discover (and sorry Eric for not catching this in code review) that p2_pfs accepts the same choices as the now-updated oakley_group parameter does.  They follow, with markings around which ones I'd deprecate, and which ones I have naive questions about, were in.iked & libike.so open-source:

           oakley_group number
               The Oakley Diffie-Hellman group used for IKE SA key derivation.
               The group numbers are defined in RFC 2409, Appendix A, RFC
               3526, and RFC 5114, section 3.2. Acceptable values are
               currently:
                 1 (MODP 768-bit)      ****** DO NOT USE ******
                 2 (MODP 1024-bit)    ****** DO NOT USE ******
                 3 (EC2N 155-bit)      ****** NOT SURE ******
                 4 (EC2N 185-bit)      ****** NOT SURE ******
                 5 (MODP 1536-bit)
                 14 (MODP 2048-bit)
                 15 (MODP 3072-bit)
                 16 (MODP 4096-bit)
                 17 (MODP 6144-bit)
                 18 (MODP 8192-bit)
                 19 (ECP 256-bit)
                 20 (ECP 384-bit)
                 21 (ECP 521-bit)
                 22 (MODP 1024-bit, with 160-bit Prime Order Subgroup)  ***** NOT SURE, but more sure than 1-4 *****
                 23 (MODP 2048-bit, with 224-bit Prime Order Subgroup)
                 24 (MODP 2048-bit, with 256-bit Prime Order Subgroup)
                 25 (ECP 192-bit)
                 26 (ECP 224-bit)

I don't think anyone in the audience who uses IPsec & IKE uses groups 1-4 anymore anyway (people who remember punchin from Sun should know I never/rarely accepted anything less than group 5).

IF you happen to be using Oakley groups 1-4, STOP.  Had I access to the source, I'd compile these right out and set a flag day.
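
As a rough aid for checking an existing configuration, the sketch below scans ike.config-style text for oakley_group or p2_pfs values in the 1-4 range flagged above. The function name find_weak_dh_groups and the sample rules are made up for illustration, and the parsing assumes each keyword and its number are separate whitespace-delimited words on the same line, which may not hold for every config.

```shell
# Hypothetical helper: read ike.config-style text on stdin and report every
# oakley_group / p2_pfs setting that selects groups 1-4.
# Exit status is 1 when at least one weak group was found, 0 otherwise.
find_weak_dh_groups() {
    awk '
        /^[ \t]*#/ { next }    # skip comment lines
        {
            for (i = 1; i < NF; i++) {
                if (($i == "oakley_group" || $i == "p2_pfs") && $(i + 1) ~ /^[1-4]$/) {
                    printf "line %d: %s %s\n", NR, $i, $(i + 1)
                    weak = 1
                }
            }
        }
        END { exit (weak ? 1 : 0) }
    '
}

# Sample input (invented for this example, not taken from a real system)
find_weak_dh_groups <<'EOF' || echo "weak Diffie-Hellman groups found"
p1_xform { auth_method preshared oakley_group 2 auth_alg sha encr_alg 3des }
p1_xform { auth_method preshared oakley_group 14 auth_alg sha256 encr_alg aes }
p2_pfs 5
EOF
```

On a real system the input would be the in.iked configuration file, e.g. "find_weak_dh_groups < /etc/inet/ike/config".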

BTW, if you are using or providing SSL services, I'd highly recommend configuring them to avoid the weak DH groups mentioned in the URL above as well.

Thanks,
Dan McDonald -- OmniOS Engineering

p.s. I'm travelling today, so I won't be replying to mail until tonight at the earliest.


From danmcd at omniti.com  Wed May 20 17:03:45 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 20 May 2015 13:03:45 -0400
Subject: [OmniOS-discuss] [discuss] Logjam & IKE
In-Reply-To: <CAKr4wiSQ20a436VJLu8gyP+kpY6w9-HXZmCWCEcVKt0vSEv58A@mail.gmail.com>
References: <F1D0B61C-D24E-4393-B8D0-B9CBEE689675@omniti.com>
	<CAKr4wiSQ20a436VJLu8gyP+kpY6w9-HXZmCWCEcVKt0vSEv58A@mail.gmail.com>
Message-ID: <A6B6CE13-4ECD-431F-B842-BFB8D1EB9291@omniti.com>


> On May 20, 2015, at 10:06 AM, Jonathan Adams <t12nslookup at gmail.com> wrote:
> 
> Thanks for the heads up ... we have quite a few IKE/ipsec connections, although static ip addresses are used.  They've been in use since forever ...
> 
> fortunately we use 5 for all the connections.

You should really move up to 2048-bit MODP or use one of the 256-or-higher ECC groups.  Do you have legacy reasons not to?

Dan


From mtalbott at lji.org  Wed May 20 23:51:16 2015
From: mtalbott at lji.org (Michael Talbott)
Date: Wed, 20 May 2015 16:51:16 -0700
Subject: [OmniOS-discuss] Backing up HUGE zfs volumes
Message-ID: <C90FD9AB-D5A9-48C9-8A12-0F09F3D2DCC0@lji.org>

I'm trying to find ways of efficiently archiving some huge (120TB and growing) zfs volumes with millions, maybe billions, of files of all sizes. I use zfs send/recv for replication to another box for tier 1/2 recovery. But I'm trying to find a good open source solution that runs on Omni for archival purposes that doesn't have to crawl the filesystem or rely on any proprietary formats.

I was thinking I could use zfs diff to get a list of changed data, parse that into a usable format, create a tar and par of the data, and an accompanying plain text index file. From there, upload that set of data to a cloud provider. While I could probably script it all out myself to accomplish this, I'm hoping someone knows of an existing solution that can produce somewhat similar results.
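
The "parse that into a usable format" step could look something like the sketch below. It assumes the tab-separated output of "zfs diff -H" (change type, path, and a third column with the new path for renames); the function name changed_paths and the sample paths are invented for illustration.

```shell
# Hypothetical helper: read "zfs diff -H"-style lines on stdin and print the
# paths that would need to go into the next archive set.
changed_paths() {
    awk -F '\t' '
        $1 == "M" || $1 == "+" { print $2 }   # modified or newly created
        $1 == "R"              { print $3 }   # renamed: archive under the new name
        # "-" (removed) entries have nothing left to archive
    '
}

# Sample input standing in for real "zfs diff -H snapA snapB" output
printf 'M\t/tank/data/a.txt\n+\t/tank/data/new.bin\n-\t/tank/data/old.log\nR\t/tank/data/x\t/tank/data/y\n' |
    changed_paths
```

The resulting list can double as the plain-text index and as the include list for tar. Directories show up in the list as well, so a real script would need to decide whether to archive them recursively or skip them.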

Ideas anyone?

Thanks,

Michael

From chip at innovates.com  Thu May 21 12:24:56 2015
From: chip at innovates.com (Schweiss, Chip)
Date: Thu, 21 May 2015 07:24:56 -0500
Subject: [OmniOS-discuss] Backing up HUGE zfs volumes
In-Reply-To: <C90FD9AB-D5A9-48C9-8A12-0F09F3D2DCC0@lji.org>
References: <C90FD9AB-D5A9-48C9-8A12-0F09F3D2DCC0@lji.org>
Message-ID: <CALeZrrT-WKZe_GfnV976hD+spubY8bKt3okRiRYUWYrKfYKF3w@mail.gmail.com>

I would caution against anything using 'zfs diff'.  It has been perpetually
broken, either not working at all, or returning incomplete information.

Avoiding crawling the directory is pretty much impossible unless you use
'zfs send'.   However, as long as there is enough cache on the system,
directory crawls can be very efficient.    I have daily rsync jobs that
crawl over 200 million files.   The impact of the crawl is not noticeable
to other users.

I have also used ZFS send to AWS Glacier.   This worked well until the data
change rate got high enough that I needed to start over too often to keep the
storage size reasonable on Glacier.

I also use CrashPlan on my home OmniOS server to back up about 5TB.  It
works very nicely.

-Chip

On Wed, May 20, 2015 at 6:51 PM, Michael Talbott <mtalbott at lji.org> wrote:

> I'm trying to find ways of efficiently archiving up some huge (120TB and
> growing) zfs volumes with millions maybe billions of files of all sizes. I
> use zfs send/recv for replication to another box for tier 1/2 recovery.
> But, I'm trying to find a good open source solution that runs on Omni for
> archival purposes that doesn't have to crawl the filesystem or rely on any
> proprietary formats.
>
> I was thinking I could use zfs diff to get a list of changed data, parse
> that into a usable format, create a tar and par of the data, and an
> accompanying plain text index file. From there, upload that set of data to
> a cloud provider. While I could probably script it all out myself to
> accomplish this, I'm hoping someone knows of an existing solution that can
> produce somewhat similar results.
>
> Ideas anyone?
>
> Thanks,
>
> Michael
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>

From richard.elling at richardelling.com  Thu May 21 23:58:28 2015
From: richard.elling at richardelling.com (Richard Elling)
Date: Thu, 21 May 2015 16:58:28 -0700
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
Message-ID: <FF3E1107-756E-4A0A-A0D0-92C5F63CE81F@richardelling.com>


> On May 18, 2015, at 11:25 AM, Jeff Stockett <jstockett at molalla.com> wrote:
> 
> A drive failed in one of our supermicro 5048R-E1CR36L servers running omnios r151012 last night, and somewhat unexpectedly, the whole system seems to have panicked.
>  
> May 18 04:43:08 zfs01 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,2f02@1/pci15d9,808@0 (mpt_sas0):
> May 18 04:43:08 zfs01         Log info 0x31140000 received for target 29 w50000c0f01f1bf06.
> May 18 04:43:08 zfs01         scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc

[forward reference]

> May 18 04:44:36 zfs01 genunix: [ID 843051 kern.info] NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
> May 18 04:44:36 zfs01 unix: [ID 836849 kern.notice]
> May 18 04:44:36 zfs01 ^Mpanic[cpu0]/thread=ffffff00f3ecbc40:
> May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' appears to be hung.
> May 18 04:44:36 zfs01 unix: [ID 100000 kern.notice]
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecba20 zfs:vdev_deadman+10b ()

Bugs notwithstanding, the ZFS deadman timer fires when a ZFS I/O does not
complete in 10,000 seconds (by default). The problem likely lies below ZFS. For this
reason, the deadman timer was invented -- don't blame ZFS for a problem below ZFS.

> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecba70 zfs:vdev_deadman+4a ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbac0 zfs:vdev_deadman+4a ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbaf0 zfs:spa_deadman+ad ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbb90 genunix:cyclic_softint+fd ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbba0 unix:cbe_low_level+14 ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbbf0 unix:av_dispatch_softvect+78 ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbc20 apix:apix_dispatch_softint+35 ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05990 unix:switch_sp_and_call+13 ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e059e0 apix:apix_do_softint+6c ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05a40 apix:apix_do_interrupt+34a ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05a50 unix:cmnint+ba ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05bc0 unix:acpi_cpu_cstate+11b ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05bf0 unix:cpu_acpi_idle+8d ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05c00 unix:cpu_idle_adaptive+13 ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05c20 unix:idle+a7 ()
> May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05c30 unix:thread_start+8 ()
> May 18 04:44:36 zfs01 unix: [ID 100000 kern.notice]
> May 18 04:44:36 zfs01 genunix: [ID 672855 kern.notice] syncing file systems...
> May 18 04:44:38 zfs01 genunix: [ID 904073 kern.notice]  done
> May 18 04:44:39 zfs01 genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
> May 18 04:44:39 zfs01 ahci: [ID 405573 kern.info] NOTICE: ahci0: ahci_tran_reset_dport port 1 reset port
> May 18 05:17:56 zfs01 genunix: [ID 100000 kern.notice]
> May 18 05:17:56 zfs01 genunix: [ID 665016 kern.notice] ^M100% done: 8607621 pages dumped,
> May 18 05:17:56 zfs01 genunix: [ID 851671 kern.notice] dump succeeded
>  
> The disks are all 4TB WD40001FYYG enterprise SAS drives.
> 

I've had such bad luck with that model, IMNSHO I recommend replacing with anything else :-(

That said, I don't think it is a root cause for this panic. To get the trail of tears, you'll need to
look at the FMA ereports for the 10,000 seconds prior to the panic. fmdump has a -t option you'll
find useful. The [forward reference] is the result of a SCSI reset of the target, LUN, or HBA.
These occur when the sd driver has not had a reply and issues one of those types of resets *or*
the device or something in the data path resets.
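
To make the fmdump suggestion concrete, here is a small sketch that works out the start of the window from the panic time in the log above and prints the matching command. The helper name window_start is made up; the "-e" (error log) and "-t mm/dd/yy hh:mm:ss" usage follows fmdump(1M) as I read it, and the helper assumes the window does not cross midnight.

```shell
# Hypothetical helper: subtract a number of seconds from an HH:MM:SS time on
# the same day and print the result as HH:MM:SS.
window_start() {
    echo "$1" | awk -F: -v back="$2" '{
        s = $1 * 3600 + $2 * 60 + $3 - back
        if (s < 0) s = 0    # clamp at midnight; adjust the date by hand if needed
        printf "%02d:%02d:%02d\n", int(s / 3600), int((s % 3600) / 60), s % 60
    }'
}

# Panic was logged at 04:44:36 on May 18; look back 10,000 seconds.
start=$(window_start 04:44:36 10000)
echo "fmdump -e -t \"05/18/15 $start\""
```

Adding -V to the printed command gives the full ereport payload, which is usually what identifies the device or path that stopped responding.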

HTH,
 -- richard

>   Googling seems to indicate it is a known problem with the way the various subsystems sometimes interact. Is there any way to fix/workaround this issue?

From guorong.koh at gmail.com  Fri May 22 01:12:03 2015
From: guorong.koh at gmail.com (Guo-Rong Koh)
Date: Fri, 22 May 2015 10:42:03 +0930
Subject: [OmniOS-discuss] Crashplan alternatives?
Message-ID: <1432257123.13727.16.camel@gmail.com>

Hello everyone,

I know others here are running Crashplan for Solaris on OmniOS.
However, given the (not so recent) retirement announcement:
https://helpdesk.code42.com/entries/53070937-Solaris-Platform-Retirement-Announcement
I'm seeking some discussion and advice on possibilities.

My original strategy for supporting Linux and Windows clients in a home
server environment is starting to disintegrate.
Due to this issue:
http://support.code42.com/CrashPlan/Latest/Troubleshooting/Computer-To-Computer_Backups_Between_CrashPlan_App_4.2.0_And_Earlier_Versions_Continuously_Synchronize
my Linux client is now no longer backing up the way it used to (the OmniOS
server is on Crashplan 3.7; the Linux client automatically upgraded to 4.2.0).
Thankfully, the Windows clients seem to be OK for now.

Eventually, however, I expect the whole solution to fail when Code42 EOLs all Solaris support.

My current options are:
1. Migrate Crashplan to a Linux KVM instance
 - this seems like the least effort for now
2. Find an alternate, multiplatform solution
 - thus far I have found nothing suitable

Do others here have a migration plan?

regards,
Guo-Rong

From matej at zunaj.si  Fri May 22 09:50:58 2015
From: matej at zunaj.si (Matej Zerovnik)
Date: Fri, 22 May 2015 11:50:58 +0200
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and
 then resumes
In-Reply-To: <55518BFF.6080608@zunaj.si>
References: <55487539.6030408@zunaj.si>	<a1e7408b7a614dc4c3e96a85459bad62@miras.org>	<CABweQmLrOgD_utd_8HV4u5KzTXOH2xTjpG4KhgiNQUt+R2_LvA@mail.gmail.com>	<201505051648.t45GmpA4025308@lists-il.int.omniti.net>	<40C78E86-F32D-4588-AF98-EB9820019960@richardelling.com>
	<55518BFF.6080608@zunaj.si>
Message-ID: <555EFC02.5070802@zunaj.si>

After having troubles almost every week and missing the time frame to 
catch the bastard, today I finally had the opportunity to catch it in 
action:)

As it turns out, it looks like a ZFS (not likely) or HW (probably) 
problem. While in the "hangup" state, iSCSI and the network worked flawlessly and 
I was able to connect to iSCSI (but mounting the FS and issuing 
commands (show lvm volume, ...) was really slow). I was also able to 
work on the server, so it wasn't locked up.

Then I decided to check the ZFS FS. I tried to create a file in the ZFS 
mount directory by issuing 'touch test-file' and the command froze. I tried 
to kill it with CTRL+C, without success. I tried to kill the process with 
kill -9, but that did not help either. Looking at the iostat output, there 
was some reading happening, but absolutely no writes (0, nada).

I used 'lsiutils' to connect to my LSI HBA and issued a port reset, 
followed by a hard SAS link reset, in the hope it would come back, but it was 
still frozen. I also checked 'phy counters' in lsiutils, and there were 
some devices with errors, but that could be due to the port / link reset.

Long story short, after 30min everything returned to normal, without any 
error message in the logs or anywhere else. The bad thing is, the iSCSI target 
froze a few minutes later and the only way to resolve the trouble was to 
restart the server :(

Matej

On 12. 05. 2015 07:13, Matej Zerovnik wrote:
> I know building a single 50 drives RaidZ2 is a bad idea. As I said, 
> it's a legacy that I can't easily change. I already have a backup pool 
> with 7x10 drives RaidZ2 to which I hope I will be able to switch this 
> week. I hope to get some better results and less crashing...
>
> What is interesting is that when the 'event' happens, the server works 
> normally, ZFS is accessible and writable (at least, there are no errors 
> in log files), only iscsi reports errors and drops the connection. 
> Another interesting thing is that after the 'event', all writes stop, 
> only reads continue for another 30min. After 30min all traffic stops 
> for half an hour. After that, everything starts coming back up... 
> Weird?!
>
> Matej
>
> On 09. 05. 2015 02:49, Richard Elling wrote:
>>
>>> On May 5, 2015, at 9:48 AM, Matej Zerovnik <matej at zunaj.si> wrote:
>>>
>>> I will replace the hardware in about 4 months with all SAS drives, 
>>> but I would love to have a working setup for the time being as well;)
>>>
>>> I looked at smart stats and there don't seem to be any errors. 
>>> Also, no hard/soft/transfer errors reported by any drive. Will take a 
>>> look at service time tomorrow, maybe put the drives to graphite and 
>>> look at them over a longer period.
>>>
>>> I looked at iostat -x status today and stats for the pool itself 
>>> reported 100% busy most of the time, 98-100% wait, 500-1300 
>>> transactions in queue, around 500 active,... The first line, which is the 
>>> average from boot, says avg service time is around 
>>> 1600ms, which seems like a lot. Can it be due to a really big queue?
>>>
>>> Would it help to create 5 10-drive raidz pools instead of one with 
>>> 50 drives?
>>
>> It is a bad idea to build a single raidz set with 50 drives. Very 
>> bad. Hence the zpool
>> man page says, "The recommended number is between 3 and 9 to help 
>> increase performance."
>> But this recommendation applies to reliability, too.
>>  -- richard
>>
>
>
>


From richard at netbsd.org  Fri May 22 15:47:45 2015
From: richard at netbsd.org (Richard PALO)
Date: Fri, 22 May 2015 17:47:45 +0200
Subject: [OmniOS-discuss] [developer] FLAG DAY - 4719 affects nightly,
	package, and poold
In-Reply-To: <1BAFD21A-EF91-4C6E-8A2A-4D2AB691574E@omniti.com>
References: <1BAFD21A-EF91-4C6E-8A2A-4D2AB691574E@omniti.com>
Message-ID: <555F4FA1.1000205@netbsd.org>

On 05/05/15 19:32, Dan McDonald wrote:
> Illumos #4719 introduces a flag day for people who build illumos-gate.
> Starting now, you will need a Java Developers Kit (JDK) 7 or later.
> OpenIndiana 151a9 does NOT have this by default.  Builders must either set
> JAVA_ROOT to a source of JDK7, or must have /usr/java populated with JDK7.
> 
> Users still on JDK6 will see build errors in the packaging portions like
> such:

Kind reminder about the build-time/run-time issue for poold (https://www.illumos.org/issues/5851)
with its latest incantation: https://www.illumos.org/rb/r/34/ 

-- 
Richard PALO


From Josh.Barton at usurf.usu.edu  Fri May 22 16:03:40 2015
From: Josh.Barton at usurf.usu.edu (Josh Barton)
Date: Fri, 22 May 2015 16:03:40 +0000
Subject: [OmniOS-discuss] HP Proliant Gen9
Message-ID: <9595d4e2ca4b4569ac0d51ffc4c061b9@Perses.usurf.usu.edu>

I have been trying to install OmniOS on a HP Proliant Gen9 server (r151014) but it will only boot in Legacy boot mode. R151012 will boot but no disks are found when I try to install. Has anyone experienced these issues? R151012 worked with our Proliant Gen8, is this a driver issue or something else?

Thanks!

Josh

From johan.kragsterman at capvert.se  Fri May 22 16:48:08 2015
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Fri, 22 May 2015 18:48:08 +0200
Subject: [OmniOS-discuss] Ang:  HP Proliant Gen9
In-Reply-To: <9595d4e2ca4b4569ac0d51ffc4c061b9@Perses.usurf.usu.edu>
References: <9595d4e2ca4b4569ac0d51ffc4c061b9@Perses.usurf.usu.edu>
Message-ID: <OF555AAE19.3B3A9FC5-ONC1257E4D.005C4C65-C1257E4D.005C4C68@inse.com>


Hi!


-----"OmniOS-discuss" <omnios-discuss-bounces at lists.omniti.com> wrote: -----
To: "omnios-discuss at lists.omniti.com" <omnios-discuss at lists.omniti.com>
From: Josh Barton 
Sent by: "OmniOS-discuss" 
Date: 2015-05-22 18:04
Subject: [OmniOS-discuss] HP Proliant Gen9

I have been trying to install OmniOS on a HP Proliant Gen9 server (r151014) but it will only boot in Legacy boot mode. R151012 will boot but no disks are found when I try to install. Has anyone experienced these issues? R151012 worked with our Proliant Gen8, is this a driver issue or something else?

What type of controller do you use? Do you use the HP-provided RAID (what is it, P430i...?), or do you use something else?

Rgrds Johan




Thanks!

Josh




From danmcd at omniti.com  Fri May 22 17:39:34 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Fri, 22 May 2015 13:39:34 -0400
Subject: [OmniOS-discuss] HP Proliant Gen9
In-Reply-To: <9595d4e2ca4b4569ac0d51ffc4c061b9@Perses.usurf.usu.edu>
References: <9595d4e2ca4b4569ac0d51ffc4c061b9@Perses.usurf.usu.edu>
Message-ID: <3C4DCF0D-769D-498F-9FC9-55D0D187A844@omniti.com>


> On May 22, 2015, at 12:03 PM, Josh Barton <Josh.Barton at usurf.usu.edu> wrote:
> 
> I have been trying to install OmniOS on a HP Proliant Gen9 server (r151014) but it will only boot in Legacy boot mode. R151012 will boot but no disks are found when I try to install. Has anyone experienced these issues? R151012 worked with our Proliant Gen8, is this a driver issue or something else?

014 should also work on your Gen8.

I don't know enough about the HW characteristics of Gen9 to tell you what exactly is wrong.  Is "Legacy boot mode" using BIOS as opposed to EFI?  OmniOS doesn't support EFI boot, just BIOS.

Dan


From danmcd at omniti.com  Fri May 22 17:57:14 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Fri, 22 May 2015 13:57:14 -0400
Subject: [OmniOS-discuss] HP Proliant Gen9
In-Reply-To: <e166b575139f4599b5ca255b48985a66@Perses.usurf.usu.edu>
References: <9595d4e2ca4b4569ac0d51ffc4c061b9@Perses.usurf.usu.edu>
	<3C4DCF0D-769D-498F-9FC9-55D0D187A844@omniti.com>
	<e166b575139f4599b5ca255b48985a66@Perses.usurf.usu.edu>
Message-ID: <4F470FC2-2D9C-4C64-AF3A-8027C4113E7F@omniti.com>


Keeping this on the list so people know.

> On May 22, 2015, at 1:52 PM, Josh Barton <Josh.Barton at usurf.usu.edu> wrote:
> 
> Legacy Boot is just BIOS. 

So make sure you use that.

> I am using: HP Smart Array P440ar Controller

	vendor: 103c ("Hewlett-Packard Company"), device: 3239 ("Smart Array Gen9 Controllers"), subvendor: 103c, subdevice: 21c0 ("P440ar")

That entry isn't in /etc/driver_aliases for OmniOS.  It is *possible* that the cpqary3 driver will work on this, but we would need to test it.

Can you get to the shell from the r151014 install media?  I can't remember if "lspci" is there, but if it isn't, "prtconf -d" output might be useful to share.
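
The /etc/driver_aliases lookup described above can be sketched as follows. This is only a minimal illustration, assuming each line of the file has the form `driver "alias"` and that PCI aliases are written as `pciVVVV,DDDD` or `pciexVVVV,DDDD`. The function name and the sample entries are made up for the example and are not taken from a real system.

```python
import re

# Assumed line format: <driver-name> "<alias>"
ALIAS_LINE = re.compile(r'^(\S+)\s+"([^"]+)"')


def drivers_for(aliases_text, vendor, device):
    """Return the driver names whose alias matches the PCI vendor/device pair."""
    vendor, device = vendor.lower(), device.lower()
    # Assumption: both the "pci" and "pciex" prefixes may be used for a device.
    wanted = {f"pci{vendor},{device}", f"pciex{vendor},{device}"}
    found = []
    for line in aliases_text.splitlines():
        match = ALIAS_LINE.match(line.strip())
        if match and match.group(2).lower() in wanted:
            found.append(match.group(1))
    return found


# Hypothetical file content, for illustration only.
SAMPLE = '''
exampledrv "pci103c,0001"
otherdrv "pciex8086,0002"
'''

# An empty list means no driver claims the device, which is the situation
# described for vendor 103c, device 3239.
print(drivers_for(SAMPLE, "103c", "3239"))
```

On a live system the text would come from reading /etc/driver_aliases. Subsystem-specific aliases are not handled by this sketch.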

Dan


From danmcd at omniti.com  Fri May 22 21:05:07 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Fri, 22 May 2015 17:05:07 -0400
Subject: [OmniOS-discuss] Kayak post illumos 5896-5897
Message-ID: <FAF5026A-B401-4429-A7A4-BC04A71E0302@omniti.com>

A recent illumos bugfix broke the Kayak build (only on bloody for now).  Kayak assumes that svccfg in an alternate root only requires a background svc.configd.  It ALSO requires a background svc.startd.

I discussed this with the author of 5896-7, and he recommended that Kayak use the "-native" versions of svccfg in illumos-{gate,omnios}, because that's how the ON build populates SMF repositories as well.  The following webrev:

	http://kebe.com/~danmcd/webrevs/kayak-svccfg/

illustrates the changes.  I don't know how many people use kayak to BUILD images, but if you do, you should please take a look at this before the next stable release comes out.

Thanks,
Dan


From Josh.Barton at usurf.usu.edu  Fri May 22 23:09:30 2015
From: Josh.Barton at usurf.usu.edu (Josh Barton)
Date: Fri, 22 May 2015 23:09:30 +0000
Subject: [OmniOS-discuss] Proliant gen9
Message-ID: <ae377e07626648a592341c93f555c45b@Perses.usurf.usu.edu>

An update to my previous message:
I am now using only Legacy BIOS boot mode and skipping UEFI entirely. Using the changes to the GRUB menu found in the link below, I was able to get to the install screen using the r151014 image; however, I still get a "no disk found" error. The controller is an HP Smart Array P440ar Controller.

I have been trying to install OmniOS on a HP Proliant Gen9 server (r151014) but it will only boot in Legacy boot mode. R151012 will boot but no disks are found when I try to install. Has anyone experienced these issues? R151012 worked with our Proliant Gen8, is this a driver issue or something else?



See:
http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c04633840

Thanks,

Josh Barton
Utah State University Research Foundation

From nsmith at careyweb.com  Fri May 22 23:37:57 2015
From: nsmith at careyweb.com (Nate Smith)
Date: Fri, 22 May 2015 19:37:57 -0400
Subject: [OmniOS-discuss] Proliant gen9
Message-ID: <1104717796-2240@mail.careyweb.com>

What is the onboard storage controller? That's probably the unsupported hardware.

-Nate



From mmabis at vmware.com  Sat May 23 08:08:31 2015
From: mmabis at vmware.com (Matthew Mabis)
Date: Sat, 23 May 2015 08:08:31 +0000
Subject: [OmniOS-discuss] SMB version with NTLM authentication version.
In-Reply-To: <1BA5BE7A-5BEF-4882-8A85-0EAC713E2C80@omniti.com>
References: <57413703-4516@mail.careyweb.com>,
	<1BA5BE7A-5BEF-4882-8A85-0EAC713E2C80@omniti.com>
Message-ID: <8d6f305890664c13bcf43f30fb752067@EX13-MBX-017.vmware.com>

Hey all,

Could you verify the following about the SMB protocol in OmniOS? Is it currently using SMBv1 as standard? I saw a blog post that discussed this, and from further research it looks like SMBv2 and SMBv3 are not built into OmniOS at this time (tell me if I am wrong).

I had an issue today with my OS X 10.10.3 build: I had been connecting to my OmniOS NAS with no issues until today, when it complained that I was using NTLMv1, and I had to use a workaround plist to get it working.

Has this been seen before, or is it a known issue? It seems to have popped up within the last month or so. Any other workarounds would be greatly appreciated. I am also wondering whether the SMB version has something to do with it.

Thanks for your time

Matt Mabis

From danmcd at omniti.com  Tue May 26 15:49:30 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 26 May 2015 11:49:30 -0400
Subject: [OmniOS-discuss] FLAG DAY for people who build Kayak images
Message-ID: <F7E821FB-4066-4146-A43C-ABE88FD4A74A@omniti.com>

You may ignore this note if you do not use Kayak to create your own custom images.  If you merely download the images from us, you're in good shape.

This upstream illumos change:

commit 2ba6d2b94a398caab9e751c277f0acbd1cc22c77
Author: Robert Mustacchi <rm at joyent.com>
Date:   Thu Apr 30 15:25:12 2015 -0700

    5896 svccfg import returns before service can be used by svcadm
    5897 improve comments for svc.startd
    Reviewed by: Jerry Jelinek <jerry.jelinek at joyent.com>
    Approved by: Dan McDonald <danmcd at omniti.com>

introduces a flag day for users of OmniOS bloody who wish to build custom images using Kayak.

The kayak build system uses svccfg(1M) and an alternate root to pre-populate disk images prior to their snapshotting and compression.  Using the build machine's svccfg(1M) like this will cause a freeze in kayak.

If you use native Kayak make, you merely need this update in your kayak repo:

commit d8b5bbd76a85b6d54a471fc3021df27dd7b2e51e
Author: Dan McDonald <danmcd at omniti.com>
Date:   Fri May 22 14:24:21 2015 -0400

    Use PREBUILT_ILLUMOS's svccfg-native to stop lockups post-5896/5897

and a pre-built illumos-omnios whose path needs to be in the environment, or as an argument to gmake:

	gmake PREBUILT_ILLUMOS=<path>

I have a fix in omnios-build to export PREBUILT_ILLUMOS.  If that does not work, then I will further alter omnios-build's kayak/build.sh to push PREBUILT_ILLUMOS right into gmake.

Thanks,
Dan


From nsmith at careyweb.com  Tue May 26 19:05:09 2015
From: nsmith at careyweb.com (Nate Smith)
Date: Tue, 26 May 2015 15:05:09 -0400
Subject: [OmniOS-discuss] Proliant gen9
In-Reply-To: <3415d08010da42d2872b035f1c298dcf@Perses.usurf.usu.edu>
References: <1104717796-2240@mail.careyweb.com>
	<3415d08010da42d2872b035f1c298dcf@Perses.usurf.usu.edu>
Message-ID: <f5dd0bf7-3c11-4dea-aa5b-352121a0c900@careyweb.com>

Sorry. I missed the P440ar the first time I read through the thread.

https://www.illumos.org/issues/5390
 
That issue details the problem. It doesn't look like the patch to support this controller has been tested or upstreamed (someone else will have to answer that).

Have you tried switching to HBA mode as detailed below?

http://h20564.www2.hp.com/hpsc/doc/public/display?docId=c03909334
 
I know that HP lists it as supporting Solaris 11, but that doesn't mean it's OmniOS compatible.
 
http://h20564.www2.hp.com/hpsc/swd/public/readIndex?sp4ts.oid=7274897&swLangOid=8&swEnvOid=4167
 
This thread may be of some help. It looks like, at least for now, you have some bleeding-edge hardware with unknown OmniOS support.
 
https://forums.freenas.org/index.php?threads/hp-gen9-server-w-p840-hba-mode-no-drives-visible.28620/
 
You might have to get another compatible storage controller, or pay some support bucks to get it tested/integrated. 
 
The supported storage controllers for the HP Gen 9s are here.
 
http://www.hp.com/hpinfo/newsroom/press_kits/2014/ComputeEra/HP_SmartStorage_ProLiantGen9_DataSheet.pdf
 
The P441 and P840 might have better luck, but I'm not sure. Ideally, you can run with a storage controller listed in the HCL.
 
http://illumos.org/hcl
 
Hope this helps.
 
-Nate
 
 
 
 
From: Josh Barton [mailto:Josh.Barton at usurf.usu.edu] 
Sent: Tuesday, May 26, 2015 1:11 PM
To: Nate Smith
Subject: RE: [OmniOS-discuss] Proliant gen9
 
All I can find for the Storage controller is: HP Smart Array P440ar Controller
 
Thanks for taking the time to look at this
 
Josh
 

From anon at omniti.com  Tue May 26 20:18:50 2015
From: anon at omniti.com (Anon)
Date: Tue, 26 May 2015 16:18:50 -0400
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then
	resumes
Message-ID: <CADa4Us9AZdmbAnwqE_4jaNO4DZHuGxYXOhWcSga8op+y+0LmJw@mail.gmail.com>

Hi Matej,

Do you have sar running on your system? I'd recommend running it at a
short interval so that you can get historical disk statistics. You can use
this info to rule the disks in or out. You can also use iotop -P to
get a real-time view of %IO to see if it's the disks, or
zpool iostat -v 1.

Also, do you have a baseline benchmark of performance and know if you're
meeting/exceeding it? The baseline should be for random and sequential IO;
you can use bonnie++ to get this information.

Are you able to share your ZFS configuration and iSCSI configuration?

For iSCSI, can you take a look at this:
http://docs.oracle.com/cd/E23824_01/html/821-1459/fpjwy.html#fsume

Do you have detailed logs for the clients experiencing the issues? If not
are you able to enable verbose logging (such as debug level logs)?

Regards,
Anon

From matej at zunaj.si  Wed May 27 06:58:16 2015
From: matej at zunaj.si (Matej Zerovnik)
Date: Wed, 27 May 2015 08:58:16 +0200
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and
	then resumes
In-Reply-To: <CADa4Us9AZdmbAnwqE_4jaNO4DZHuGxYXOhWcSga8op+y+0LmJw@mail.gmail.com>
References: <CADa4Us9AZdmbAnwqE_4jaNO4DZHuGxYXOhWcSga8op+y+0LmJw@mail.gmail.com>
Message-ID: <D230D9E3-602D-4C36-A64B-2B3D7EE5281C@zunaj.si>

Hello Josten,


> On 26 May 2015, at 22:18, Anon <anon at omniti.com> wrote:
> 
> Hi Matej,
> 
> Do you have sar running on your system? I'd recommend maybe running it at a short interval so that you can get historical disk statistics. You can use this info to rule out if its the disks or not. You can also use iotop -P to get a real time view of %IO to see if it's the disks. You can also use zpool iostat -v 1.

I didn't have sar or iotop running, but I had 'iostat -xn' and 'zpool iostat -v 1' running when things stopped working, and there is nothing unusual in there. Write ops suddenly fall to 0 and that's it. Reads are still happening, and according to network traffic there is outgoing traffic while I'm unable to write to the ZFS FS (even locally on the server). I created a simple text file, so the next time the system hangs I will be able to check whether it is readable (currently I only have iSCSI volumes, so I'm unable to check that locally on the server).
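
A local check like the one described above could also time small writes, so that a stall is recorded with a timestamp. The following is a rough sketch under assumptions: the function names, the probe path in the usage note, and the thresholds are hypothetical. If the pool stops accepting writes, the probe itself blocks and reports the stall only once the write returns.

```python
import os
import time


def timed_write(path, payload=b"probe\n"):
    """Append a small payload, force it to stable storage, return seconds taken."""
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # wait until the data has actually been committed
    finally:
        os.close(fd)
    return time.monotonic() - start


def probe(path, count=5, interval=1.0, threshold=2.0):
    """Run `count` timed writes and return (timestamp, seconds) for slow ones."""
    stalls = []
    for _ in range(count):
        elapsed = timed_write(path)
        if elapsed > threshold:
            stalls.append((time.strftime("%Y-%m-%d %H:%M:%S"), elapsed))
        time.sleep(interval)
    return stalls
```

A call such as `probe("/volumes/data/probe.txt", count=3600)` (path assumed) would cover roughly an hour. Comparing the recorded timestamps with the iostat output would show whether local writes stall at the same moment the iSCSI clients drop.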

> 
> Also, do you have baseline benchmark of performance and know if you're meeting/exceeding it? The baseline should be for random and sequential IO; you can use bonnie++ to get this information.

I can say with 99.99% certainty that I'm exceeding the performance of the pool itself. It's a single raidz2 vdev with 50 hard drives and 70 connected clients. Some are idling, but 10-20 clients are pushing data to the server. I know the zpool configuration is very bad, but that's a legacy I can't change easily. I'm already syncing data to another server with 7 vdevs, but since this server is so busy, transfers are VERY SLOW (read: zfs sync doing 10MB/s).

> 
> Are you able to share your ZFS configuration and iSCSI configuration?

Sure! Here are zfs settings:

zfs get all data:
NAME  PROPERTY              VALUE                  SOURCE
data  type                  filesystem             -
data  creation              Fri Oct 25 20:26 2013  -
data  used                  104T                   -
data  available             61.6T                  -
data  referenced            1.09M                  -
data  compressratio         1.08x                  -
data  mounted               yes                    -
data  quota                 none                   default
data  reservation           none                   default
data  recordsize            128K                   default
data  mountpoint            /volumes/data          received
data  sharenfs              off                    default
data  checksum              on                     default
data  compression           off                    received
data  atime                 off                    local
data  devices               on                     default
data  exec                  on                     default
data  setuid                on                     default
data  readonly              off                    local
data  zoned                 off                    default
data  snapdir               hidden                 default
data  aclmode               discard                default
data  aclinherit            restricted             default
data  canmount              on                     default
data  xattr                 on                     default
data  copies                1                      default
data  version               5                      -
data  utf8only              off                    -
data  normalization         none                   -
data  casesensitivity       sensitive              -
data  vscan                 off                    default
data  nbmand                off                    default
data  sharesmb              off                    default
data  refquota              none                   default
data  refreservation        none                   default
data  primarycache          all                    default
data  secondarycache        all                    default
data  usedbysnapshots       0                      -
data  usedbydataset         1.09M                  -
data  usedbychildren        104T                   -
data  usedbyrefreservation  0                      -
data  logbias               latency                default
data  dedup                 off                    local
data  mlslabel              none                   default
data  sync                  standard               default
data  refcompressratio      1.00x                  -
data  written               1.09M                  -
data  logicalused           98.1T                  -
data  logicalreferenced     398K                   -
data  filesystem_limit      none                   default
data  snapshot_limit        none                   default
data  filesystem_count      none                   default
data  snapshot_count        none                   default
data  redundant_metadata    all                    default
data  nms:dedup-dirty       on                     received
data  nms:description       datauporabnikov        received
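
To make a listing like the one above easier to scan, the properties that were set explicitly can be filtered out of the `zfs get all` text. This is a small sketch with a hypothetical function name. It assumes the four-column layout shown above and only recognises the single-word sources `local` and `received`, so multi-word sources such as "inherited from ..." are not handled.

```python
def non_default_properties(zfs_get_output):
    """Return (property, value, source) for locally set or received properties."""
    rows = []
    for line in zfs_get_output.splitlines():
        parts = line.split()
        # Skip blank or short lines and the NAME/PROPERTY/VALUE/SOURCE header.
        if len(parts) < 4 or parts[:2] == ["NAME", "PROPERTY"]:
            continue
        prop, source = parts[1], parts[-1]
        # The value may contain spaces (for example the creation date).
        value = " ".join(parts[2:-1])
        if source in ("local", "received"):
            rows.append((prop, value, source))
    return rows


SAMPLE = """\
NAME  PROPERTY              VALUE                  SOURCE
data  creation              Fri Oct 25 20:26 2013  -
data  recordsize            128K                   default
data  mountpoint            /volumes/data          received
data  atime                 off                    local
"""

for row in non_default_properties(SAMPLE):
    print(row)
```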

I'm not sure which iSCSI configuration you want or need. But as far as I could tell during the last 'freeze', iSCSI is not the problem, since I'm unable to write to the ZFS volume even when I'm local on the server itself.

> 
> For iSCSI, can you take a look at this: http://docs.oracle.com/cd/E23824_01/html/821-1459/fpjwy.html#fsume

Interesting. I tried running 'iscsiadm list target' but it doesn't return anything. There is also nothing in /var/adm/messages, as usual :) But the target service is online (according to svcs), and clients are connected and passing traffic.

> 
> Do you have detailed logs for the clients experiencing the issues? If not are you able to enable verbose logging (such as debug level logs)?

I have client logs, but they mostly just report losing connections and reconnecting:

Example 1:
Apr 29 10:33:53 eee kernel: connection1:0: detected conn error (1021)
Apr 29 10:33:54 eee iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
Apr 29 10:33:56 eee iscsid: connection1:0 is operational after recovery (1 attempts)
Apr 29 10:36:37 eee kernel: connection1:0: detected conn error (1021)
Apr 29 10:36:37 eee iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
Apr 29 10:36:40 eee iscsid: connection1:0 is operational after recovery (1 attempts)
Apr 29 10:36:50 eee kernel: sd 3:0:0:0: Device offlined - not ready after error recovery
Apr 29 10:36:51 eee kernel: sd 3:0:0:0: Device offlined - not ready after error recovery
Apr 29 10:36:51 eee kernel: sd 3:0:0:0: Device offlined - not ready after error recovery

Example 2:
Apr 16 08:41:40 vf kernel: connection1:0: pdu (op 0x5e itt 0x1) rejected. Reason code 0x7
Apr 16 08:43:11 vf kernel: connection1:0: pdu (op 0x5e itt 0x1) rejected. Reason code 0x7
Apr 16 08:44:13 vf kernel: connection1:0: pdu (op 0x5e itt 0x1) rejected. Reason code 0x7
Apr 16 08:45:51 vf kernel: connection1:0: detected conn error (1021)
Apr 16 08:45:51 317 iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
Apr 16 08:45:53 vf iscsid: connection1:0 is operational after recovery (1 attempts)
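
Client logs in this form can be reduced to a list of outage durations by pairing each "detected conn error" line with the next "operational after recovery" line. The sketch below is illustrative only: the function name is hypothetical, and the year is an assumed parameter because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Syslog-style line: "<Mon> <day> <HH:MM:SS> <host> <process>: <message>"
LINE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) (\S+): (.*)$")


def outages(lines, year=2015):
    """Return (start_time, seconds) for each error-to-recovery interval."""
    down_since = None
    result = []
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue
        stamp, _host, _proc, msg = match.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if "detected conn error" in msg and down_since is None:
            down_since = when
        elif "is operational after recovery" in msg and down_since is not None:
            result.append((down_since, (when - down_since).total_seconds()))
            down_since = None
    return result


SAMPLE = [
    "Apr 29 10:33:53 eee kernel: connection1:0: detected conn error (1021)",
    "Apr 29 10:33:56 eee iscsid: connection1:0 is operational after recovery (1 attempts)",
    "Apr 29 10:36:37 eee kernel: connection1:0: detected conn error (1021)",
    "Apr 29 10:36:40 eee iscsid: connection1:0 is operational after recovery (1 attempts)",
]

for start, seconds in outages(SAMPLE):
    print(start, seconds)
```

For the four sample lines, both intervals come out at 3 seconds. That covers the reconnect itself, not the time the device stayed offline afterwards.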


I'm already in contact with OmniTI regarding our new build, but in the meantime I would love for our clients to be able to use the storage, so I'm trying to resolve the current issue somehow.

Matej



From matej at zunaj.si  Fri May 29 11:09:56 2015
From: matej at zunaj.si (Matej Zerovnik)
Date: Fri, 29 May 2015 13:09:56 +0200
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and
	then resumes
In-Reply-To: <CADa4Us9KH7BTtrkNy_uAXUEFUU97-6BT2U8kvjnS-=BNry7EhA@mail.gmail.com>
References: <CADa4Us9AZdmbAnwqE_4jaNO4DZHuGxYXOhWcSga8op+y+0LmJw@mail.gmail.com>
	<D230D9E3-602D-4C36-A64B-2B3D7EE5281C@zunaj.si>
	<CADa4Us9KH7BTtrkNy_uAXUEFUU97-6BT2U8kvjnS-=BNry7EhA@mail.gmail.com>
Message-ID: <6C56A6D6-A5BB-46DA-A10B-510F11FEF7BE@zunaj.si>

Today the server crashed again. I'm not sure if it's because I was running SMART short self-tests or not, but it looks like it started around that time.

I'm still running SMART tests, but it looks like there are no errors on the drives, although some tests take up to 30 minutes to finish. iostat -E also reports no errors.

When it froze, I started iostat and tried to write a file to the ZFS pool. As usual, it froze, but I left iostat running, hoping it would give me some information. After 30 minutes or so, the system became responsive again, and this is what my iostat output looks like:
http://pastebin.com/W4EWgnzq

The system became responsive at 'Fri May 29 11:38:45 CEST 2015'.

It's weird, to say the least. It looks like there is something in the write buffer that hogs ZFS for quite some time and gets released or times out after a certain time. But I'm not sure what it is or what has such a long timeout. It looks like the freeze lasted for 15 minutes.

Matej

> On 28 May 2015, at 18:30, Anon <anon at omniti.com> wrote:
> 
> Have you verified that your disks are not having any issues with smartctl and iostat -E ?
> 
> I'd suggest running a short test on the disks: smartctl -d sat,12 -t short /path/to/disk (note: you may need to append s2 to the physical disk name).
> 
> I built a test target and iSCSI initiator and wrote 1G from /dev/zero and ended up crashing the session; are your sessions under load?
> 

From heinz at licenser.net  Sat May 30 15:58:57 2015
From: heinz at licenser.net (Heinz Nikolaus Gies)
Date: Sat, 30 May 2015 17:58:57 +0200
Subject: [OmniOS-discuss] zpool list -p
Message-ID: <1E73E845-FD72-496D-9E7C-650580B7E305@licenser.net>

I was looking at the output of zpool list today, comparing it with what I'd get on SmartOS, and noticed that when using the -p flag for parsable output the deduplication factor is still presented as a string (or floatish type) instead of an integer value. That seems a bit wrong for parsable output.

If there is a reason behind that decision, it's fine and I'll gladly work around it, but it feels like an oversight.


Cheers,
Heinz

Here a quick glance:

OmniOS:
/usr/sbin/zpool list -pH -oname,size,alloc,free,dedup,health
data    7971459301376   6405101887488   1566357413888   1.00x   ONLINE
rpool   249108103168    121560741376    127547361792    1.00x   ONLINE

SmartOS:
zpool list -pH -oname,size,alloc,free,dedup,health
zones	319975063552	51935040512	268040023040	100	ONLINE
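
Until the two outputs agree, a consumer of this output can accept both forms of the dedup column. The following is a minimal sketch with hypothetical function names. It assumes that a bare integer such as `100` is the ratio multiplied by 100, which is what the SmartOS line above suggests but is not confirmed here.

```python
from decimal import Decimal


def parse_dedup(field):
    """Return the dedup ratio as a Decimal for either '1.00x' or '100'."""
    field = field.strip()
    if field.endswith("x"):
        return Decimal(field[:-1])
    # Assumption: the integer form is the ratio scaled by 100.
    return Decimal(field) / 100


def parse_zpool_line(line):
    """Parse one line of 'zpool list -pH -oname,size,alloc,free,dedup,health'."""
    name, size, alloc, free, dedup, health = line.split()
    return {
        "name": name,
        "size": int(size),
        "alloc": int(alloc),
        "free": int(free),
        "dedup": parse_dedup(dedup),
        "health": health,
    }


omnios = parse_zpool_line(
    "rpool   249108103168    121560741376    127547361792    1.00x   ONLINE"
)
smartos = parse_zpool_line(
    "zones\t319975063552\t51935040512\t268040023040\t100\tONLINE"
)
print(omnios["dedup"] == smartos["dedup"])  # both ratios compare equal to 1
```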
---
Cheers,
Heinz Nikolaus Gies
heinz at licenser.net




From heinz at licenser.net  Sat May 30 17:42:55 2015
From: heinz at licenser.net (Heinz Nikolaus Gies)
Date: Sat, 30 May 2015 19:42:55 +0200
Subject: [OmniOS-discuss] zpool list -p
In-Reply-To: <CAED-Y_DQx0QE0VRqjOe6Cr4Xr2AffXPu63iCHv2_kDVif-mpYA@mail.gmail.com>
References: <1E73E845-FD72-496D-9E7C-650580B7E305@licenser.net>
	<CAED-Y_DQx0QE0VRqjOe6Cr4Xr2AffXPu63iCHv2_kDVif-mpYA@mail.gmail.com>
Message-ID: <22AE6795-CD87-4FAA-B9BB-3C173012C7B7@licenser.net>

zpool upgrade -v shows the same version on both systems.

I would suspect that Joyent has modified the zpool utility, but it seems like a sensible change.

---
Cheers,
Heinz Nikolaus Gies
heinz at licenser.net



> On May 30, 2015, at 19:20, Krzysztof Grzempa <grzempek at gmail.com> wrote:
> 
> Did you compare the ZFS versions on both OSes? This might have changed in some newer version.
> 


From heinz at licenser.net  Sat May 30 18:55:44 2015
From: heinz at licenser.net (Heinz Nikolaus Gies)
Date: Sat, 30 May 2015 20:55:44 +0200
Subject: [OmniOS-discuss] I think I broke part of the network stack
Message-ID: <098DC8FA-29CD-4DA0-A87F-58DF4629A092@licenser.net>

Hi,

I have the feeling that I broke part of the network stack on a server of mine. The list of commands I executed is below; right now no dladm show-* command returns any output (the return code is still 0). I suspect this was caused by adding a vnic with the same name twice, but I am not sure, and I don't want to reboot at this point so that no evidence is destroyed.

Cheers,
Heinz

  397  dladm show-vnic
  398  dladm show-vnic -v
  399  dladm show-phys
  401  man dladm
  402  dladm create-vnic -l bge0 net0
  403  dladm show-phys
  404  dladm show-vnic
  405  dladm destroy-vnic -l bge0 net0
  406  dladm delete-vnic -l bge0 net0
  407  dladm delete-vnic  net0
  408  dladm create-vnic -l bge0 net0 -p zone=2398fe7c-032f-11e5-abb0-b33f9f953915
  412  dladm create-vnic -l bge0 net0
  416  dladm show-vnic
  417  dladm create-vnic -l bge0 net0
  419  dladm show-vnic
  490  dladm show-vnic
  491  dladm show-phys
  492  dladm
  493  dladm show-phys
  494  dladm show-phys -v
  495  dladm
  496  dladm show-link
  497  dladm show-link
  499  dladm show-link
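
The symptom (exit code 0 but no output) can be checked for mechanically. The following is only a sketch in Python: it runs the three show-* subcommands that appear in the history above and reports the ones that succeed while printing nothing. The helper names run and silent_successes are invented for this example, and the runner is injectable so the logic can be exercised without dladm installed.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# Only subcommands that appear in the shell history above.
SHOW_COMMANDS = (
    ("dladm", "show-link"),
    ("dladm", "show-phys"),
    ("dladm", "show-vnic"),
)


def run(argv: Sequence[str]) -> Tuple[int, str]:
    """Run a command and return (exit code, stdout)."""
    proc = subprocess.run(list(argv), capture_output=True, text=True)
    return proc.returncode, proc.stdout


def silent_successes(
    runner: Callable[[Sequence[str]], Tuple[int, str]] = run,
    commands: Sequence[Sequence[str]] = SHOW_COMMANDS,
) -> List[str]:
    """Return the commands that exited 0 but produced no output."""
    silent = []
    for argv in commands:
        returncode, stdout = runner(argv)
        if returncode == 0 and not stdout.strip():
            silent.append(" ".join(argv))
    return silent


if __name__ == "__main__":
    # NOTE: an empty `dladm show-vnic` is normal when no VNICs exist.
    # An empty `dladm show-phys` on a machine with a physical NIC
    # (bge0 here) is the suspicious case.
    for command in silent_successes():
        print("exit 0 but no output:", command)
```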

---
Cheers,
Heinz Nikolaus Gies
heinz at licenser.net




From heinz at licenser.net  Sat May 30 18:57:49 2015
From: heinz at licenser.net (Heinz Nikolaus Gies)
Date: Sat, 30 May 2015 20:57:49 +0200
Subject: [OmniOS-discuss] I think I broke part of the network stack
In-Reply-To: <098DC8FA-29CD-4DA0-A87F-58DF4629A092@licenser.net>
References: <098DC8FA-29CD-4DA0-A87F-58DF4629A092@licenser.net>
Message-ID: <9486DC7C-75A1-45EF-A57B-9660444495FD@licenser.net>

Adding to that, the only content of the /etc/dladm/*.conf files is

bge0    class=int,1;media=int,4;phyinst=int,1;phymaj=int,120;devname=string,bge0;
net0    class=int,8;media=int,4;linkover=string,bge0;maddrtype=int,1;vrid=int,0;vraf=int,0;macaddr=string,2:8:20:c0:a7:c0;

in /etc/dladm/datalink.conf (comments excluded)
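
For comparing such entries before and after a change, the lines can be turned into dictionaries. This is a sketch in Python that assumes nothing beyond the layout visible in the two lines above: a link name, whitespace, then semicolon-terminated key=type,value items where the type is int or string. The function name parse_datalink_line is made up for this example, and the meaning of the individual keys is not interpreted.

```python
from typing import Dict, Tuple, Union

Value = Union[int, str]


def parse_datalink_line(line: str) -> Tuple[str, Dict[str, Value]]:
    """Split one datalink.conf entry into (link name, properties)."""
    # The link name is the first whitespace-delimited token.
    name, rest = line.split(None, 1)
    props: Dict[str, Value] = {}
    for item in rest.strip().split(";"):
        if not item:
            continue  # the trailing ';' yields an empty item
        key, _, typed = item.partition("=")
        # Split on the first comma only; values such as MAC addresses
        # contain colons but are kept intact.
        kind, _, raw = typed.partition(",")
        props[key] = int(raw) if kind == "int" else raw
    return name, props
```

Applied to the net0 line, this yields the name net0 with linkover set to bge0, which makes it easy to see which physical link a VNIC entry sits on.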
---
Cheers,
Heinz Nikolaus Gies
heinz at licenser.net





From heinz at licenser.net  Sat May 30 20:55:55 2015
From: heinz at licenser.net (Heinz Nikolaus Gies)
Date: Sat, 30 May 2015 22:55:55 +0200
Subject: [OmniOS-discuss] I think I broke part of the network stack
In-Reply-To: <9486DC7C-75A1-45EF-A57B-9660444495FD@licenser.net>
References: <098DC8FA-29CD-4DA0-A87F-58DF4629A092@licenser.net>
	<9486DC7C-75A1-45EF-A57B-9660444495FD@licenser.net>
Message-ID: <0B916418-7689-465E-AB6E-0BC8C2F4C83F@licenser.net>

I had to restart to get the system back into working condition. That solved the issue, but it probably lost the state that caused it. Sorry.
---
Cheers,
Heinz Nikolaus Gies
heinz at licenser.net



