From yavoritomov at gmail.com Fri May 1 16:42:29 2015
From: yavoritomov at gmail.com (Yavor Tomov)
Date: Fri, 1 May 2015 11:42:29 -0500
Subject: [OmniOS-discuss] ZFS ACL Solaris CIFS and Windows client
In-Reply-To: <5541D215.2020603@gmx.net>
References: <5541D215.2020603@gmx.net>
Message-ID:

This is an old guide I made a long time ago; it should help you connect and
set permissions.

On Thu, Apr 30, 2015 at 1:56 AM, Sebastian Gabler wrote:

> On 29.04.2015 at 20:07, omnios-discuss-request at lists.omniti.com wrote:
>
>> Message: 3
>> Date: Tue, 28 Apr 2015 19:22:34 +0200
>> From: Günther Alka
>> To: omnios-discuss
>> Subject: Re: [OmniOS-discuss] ZFS ACL Solaris CIFS and Windows client
>> Message-ID: <9D064AA0-0C34-444F-9FF0-900F32EFF5B9 at hfg-gmuend.de>
>> Content-Type: text/plain; charset=utf-8
>>
>> Let's begin with ZFS properties
>> - aclinherit: passthrough
>>
> Thanks. It was on "restricted". I applied the change, but that makes no
> difference to my original problem.
>
>> - aclmode: does not matter for CIFS
>>
> Thanks. Do you have any sources for that for further study?
>
>> Next, set idmappings
>> - in Workgroup mode: do not set any user mappings (only group mappings)
>> - in Domain mode: set domainadmins => root
>>
> That's already the case. On that occasion: how would one delegate operator
> permissions for ACL assignment to other users? I.e., if I want certain
> Domain Users, who are not members of the domain admin group, to change
> ACLs, permissions, and privileges on shares of the illumos machine?
>
>> Next: join AD Domain (for domain mode)
>>
>> Next: SMB connect
>> - use root (requires a passwd root to generate an SMB password) or
>> - use a Domain Admin account (requires the idmapping to root)
>>
> I am using the domain admin account. Note: what specifically is not
> working is to set ownership on behalf of a different domain user.
>
>> Windows version:
>> - you need Windows Pro or Windows Server (no Home edition)
>>
> Known.
>
>> Now you should be able to set ownership and ACL on files and folders.
>>
>> If you want to set ACL on shares, you must
>> - SMB connect as a user that is a member of the Administrators group
>> - use Computer Management on Windows and connect OmniOS
>>
> Trying the latter ends up in "access denied".
> Maybe there is something broken with the user mapping. (I.e., the domain
> admin > root mapping was done, but how do I check if it is in effect, and
> how do I check if root (who is in my understanding the provider of the
> permissions to domain admin, right?) has the required privs?)
>
>> Gea
>>
>> On 28.04.2015 at 14:09, Sebastian Gabler wrote:
>>>
>>> Hi,
>>>
>>> I am a bit stuck in getting my ACL management straight for the CIFS
>>> shares I run. What I would like to do is to set all the ACLs from
>>> Windows. What does not work right now is to assign ownership of a
>>> sharepoint or an object below it to a different user, i.e. to set
>>> ownership as the Domain Administrator to a specific user. I get an
>>> error message that a "Restore" privilege would be missing, but the
>>> error message is unclear if that applies to the current context
>>> (Domain Administrator), or the prospective owner. I can set full
>>> control for that user, however.
>>> Specifically,
>>> 1. I am wondering how to get, from my illumos machine, the privileges
>>> applicable on an object for a certain user.
>>> 2. finding out what is required to take/provide ownership, specifically
>>> of a sharepoint, from Windows (ACLs, idmap, ZFS acl modes and
>>> inheritance modes, etc.), and in what hierarchy things apply.
>>> I am aware that this may be a FAQ, but I didn't find comprehensive
>>> documentation on the matter. The Oracle docs are focussed on explaining
>>> how things work from the Solaris side; most HowTos that include the
>>> Windows side are not deep enough.
>>>
>>> Thanks for any hints.
>>>
>>> With best regards,
>>>
>>> Sebastian
>>> _______________________________________________
>>> OmniOS-discuss mailing list
>>> OmniOS-discuss at lists.omniti.com
>>> http://lists.omniti.com/mailman/listinfo/omnios-discuss

-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenIndiana Windows 2008 R2 AD.pdf
Type: application/pdf
Size: 526095 bytes
Desc: not available
URL:

From alka at hfg-gmuend.de Fri May 1 17:32:49 2015
From: alka at hfg-gmuend.de (Günther Alka)
Date: Fri, 1 May 2015 19:32:49 +0200
Subject: [OmniOS-discuss] ZFS ACL Solaris CIFS and Windows client
In-Reply-To: <5541D215.2020603@gmx.net>
References: <5541D215.2020603@gmx.net>
Message-ID: <725A130D-701C-4DAE-95A0-928089C1CA56@hfg-gmuend.de>

For the ZFS properties, see the Oracle docs, e.g.
http://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/gbaaz/index.html

If you want full permissions on files on an SMB share, you must either
connect as user root or as an AD user that is idmapped to Unix root.

Adding a user to the SMB group administrators is needed for some
administration tasks (e.g. remote computer management), but root permission
is the key for any file permission problems.

Gea

From doug at will.to Fri May 1 21:40:04 2015
From: doug at will.to (Doug Hughes)
Date: Fri, 1 May 2015 17:40:04 -0400
Subject: [OmniOS-discuss] r151012 fresh install to r151014
Message-ID:

I haven't had time to upgrade all my kayak stuff to r151014 yet, so I am
going to r151012 and then doing the pkg upgrade with an alternate BE, but I
am running into a consistent problem.

After the install and making sure I have all of the latest r151012 there, I
do the unset publisher, the set publisher, and then the upgrade (as normal,
and as on the web page) and get an error. It's a very persistent error and
the same every time.
I know it's not an actual problem connecting to the website because I
could upgrade the r151012 just fine (several times).

# pkg unset-publisher omnios
# /usr/bin/pkg set-publisher -P --set-property signature-policy=require-signatures -g http://pkg.omniti.com/omnios/r151014/ omnios
# /usr/bin/pkg update --be-name=omnios-r151014 entire@11,5.11-0.151014
            Packages to install:   4
             Packages to update: 391
            Mediators to change:   1
        Create boot environment: Yes
 Create backup boot environment:  No

DOWNLOAD                      PKGS       FILES    XFER (MB)
library/python-2/lxml-26    14/395    85/12717    2.7/260.4

Errors were encountered while attempting to retrieve package or file data
for the requested operation.
Details follow:

Framework error: code: 56 reason: Recv failure: Connection reset by peer
URL: 'http://pkg.omniti.com/omnios/r151014/omnios/file/1/e11e44f204ee81611903399694ce0ed20d6ade9c'.
(happened 4 times)

From jdg117 at elvis.arl.psu.edu Fri May 1 21:55:57 2015
From: jdg117 at elvis.arl.psu.edu (John D Groenveld)
Date: Fri, 01 May 2015 17:55:57 -0400
Subject: [OmniOS-discuss] r151012 fresh install to r151014
In-Reply-To: Your message of "Fri, 01 May 2015 17:40:04 EDT."
References:
Message-ID: <201505012155.t41LtvFF028687@elvis.arl.psu.edu>

In message , Doug Hughes writes:
>DOWNLOAD                      PKGS       FILES    XFER (MB)
>library/python-2/lxml-26    14/395    85/12717    2.7/260.4
>
>Errors were encountered while attempting to retrieve package or file data
>for the requested operation.
>Details follow:
>
>Framework error: code: 56 reason: Recv failure: Connection reset by peer
>URL: 'http://pkg.omniti.com/omnios/r151014/omnios/file/1/e11e44f204ee81611903399694ce0ed20d6ade9c'.
>(happened 4 times)

Between your host and pkg.omniti.com exists a transparent web
proxy with the hash of that file in its malware signatures
database.
Sneakernet that file and drop it in /var/pkg [subdirectory
I can't remember but you'll easily find(1) -name e1]

John
groenveld at acm.org

From eric.sproul at circonus.com Fri May 1 22:05:42 2015
From: eric.sproul at circonus.com (Eric Sproul)
Date: Fri, 1 May 2015 18:05:42 -0400
Subject: [OmniOS-discuss] r151012 fresh install to r151014
In-Reply-To: <201505012155.t41LtvFF028687@elvis.arl.psu.edu>
References: <201505012155.t41LtvFF028687@elvis.arl.psu.edu>
Message-ID:

On Fri, May 1, 2015 at 5:55 PM, John D Groenveld wrote:
> In message , Doug Hughes writes:
>>DOWNLOAD                      PKGS       FILES    XFER (MB)
>>library/python-2/lxml-26    14/395    85/12717    2.7/260.4
>>
>>Errors were encountered while attempting to retrieve package or file data
>>for the requested operation.
>>Details follow:
>>
>>Framework error: code: 56 reason: Recv failure: Connection reset by peer
>>URL: 'http://pkg.omniti.com/omnios/r151014/omnios/file/1/e11e44f204ee81611903399694ce0ed20d6ade9c'.
>>(happened 4 times)
>
> Between your host and pkg.omniti.com exists a transparent web
> proxy with the hash of that file in its malware signatures
> database.

Wow. That's, um, fun.

For those playing along at home, the "offending" file is:
/usr/lib/python2.6/vendor-packages/lxml/html/clean.py
http://lxml.de/api/lxml.html.clean.Cleaner-class.html

I'm guessing that's because it is often bundled with malware? Talk
about collateral damage.

From doug at will.to Mon May 4 03:39:54 2015
From: doug at will.to (Doug Hughes)
Date: Sun, 03 May 2015 23:39:54 -0400
Subject: [OmniOS-discuss] r151012 fresh install to r151014
In-Reply-To: <201505012155.t41LtvFF028687@elvis.arl.psu.edu>
References: <201505012155.t41LtvFF028687@elvis.arl.psu.edu>
Message-ID: <5546EA0A.2030308@will.to>

Good catch! I take it you've run into this before.
Luckily, I also admin the firewall so I added an exception for the
outbound threat trigger.
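If the file is carried over by hand as suggested above, it may be worth
checking that the copy really is the payload the repository names in that
URL before dropping it under /var/pkg. The sketch below is only an
illustration and rests on an assumption that is not stated in this thread:
that the 40-character name in the URL is the SHA-1 of the uncompressed file
content and that the repository hands the content out gzip-compressed. The
function names are made up for this example.

```python
import gzip
import hashlib
import zlib


def payload_sha1(data: bytes) -> str:
    """Return the SHA-1 hex digest of a payload, gunzipping it first if it
    looks like gzip data (assumption: names refer to uncompressed content)."""
    if data[:2] == b"\x1f\x8b":  # gzip magic number
        try:
            data = gzip.decompress(data)
        except (OSError, EOFError, zlib.error):
            pass  # not valid gzip after all; hash the raw bytes instead
    return hashlib.sha1(data).hexdigest()


def matches_name(data: bytes, expected_hash: str) -> bool:
    """True if the payload hashes to the name it was published under."""
    return payload_sha1(data) == expected_hash.strip().lower()


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_payload.py <file> <hash-from-URL>
    if len(sys.argv) == 3:
        with open(sys.argv[1], "rb") as fh:
            ok = matches_name(fh.read(), sys.argv[2])
        print("hash matches" if ok else "hash does NOT match")
```

A mismatch would suggest that the copy was altered or truncated on the way,
or simply that the assumption about how the name is derived is wrong.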
From davide.poletto at gmail.com Mon May 4 17:10:13 2015
From: davide.poletto at gmail.com (Davide Poletto)
Date: Mon, 4 May 2015 19:10:13 +0200
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and uname
Message-ID:

Just to say I've noticed that uname -v reports "illumos-omnios" on an
OmniOS 151012 system, which reported "omnios-10b9c79" before I updated it
today (packages released on 17.04.2015 at the official repository).

Before the update:

OmniOS 5.11 omnios-10b9c79 September 2014
root@nas:/root#

After the update:

OmniOS 5.11 illumos-omnios April 2015
root@nas:/root#

Is that OK/by design?

Just for reference, on OmniOS 151014, after the same big set of updates
(released the same day, 17.04.2015), the uname -v output changed from
"omnios-a708424" (from its ISO install) to "omnios-170cea2".

Regards, Davide.
From danmcd at omniti.com Mon May 4 17:43:18 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 4 May 2015 13:43:18 -0400
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and uname
In-Reply-To:
References:
Message-ID: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>

> On May 4, 2015, at 1:10 PM, Davide Poletto wrote:
>
> Just to say I've noticed that uname -v reports "illumos-omnios" on a
> OmniOS 151012 which was "omnios-10b9c79" after I updated it today
> (packages released on 17.04.2015 at official repository):
>
> OmniOS 5.11 omnios-10b9c79 September 2014
> root@nas:/root#
>
> OmniOS 5.11 illumos-omnios April 2015
> root@nas:/root#
>
> Is that OK/by Design?

That was my fault during the kernel build. I had the wrong variable set in
my .env file.

> Just for reference on OmniOS 151014, after the same big set of updates
> (released the same day, 17.04.2015), the uname -v changed from
> "omnios-a708424" (from its ISO install) to "omnios-170cea2".

Yes, I believe only r151012 was affected by this. Since 012 is in its last
6 months of support life, I'm not particularly concerned.

Dan

From davide.poletto at gmail.com Mon May 4 19:35:46 2015
From: davide.poletto at gmail.com (Davide Poletto)
Date: Mon, 4 May 2015 21:35:46 +0200
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and uname
In-Reply-To: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
References: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
Message-ID:

Yeah, me too now! Thanks Dan.

From cks at cs.toronto.edu Mon May 4 21:45:27 2015
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Mon, 04 May 2015 17:45:27 -0400
Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained high write loads
Message-ID: <20150504214527.583867A061E@apps0.cs.toronto.edu>

We now have a reproducible setup with OmniOS r151014 where an OmniOS
NFS fileserver will experience memory exhaustion and then hang in the
kernel if it receives sustained NFS write traffic from multiple clients
at a rate faster than its local disks can sustain. The machine will run
okay for a while but with mdb -k's ::memstat showing steadily increasing
'Kernel' memory usage; after a while it tips over the edge, the ZFS ARC
starts shrinking, free RAM reported by 'vmstat' goes basically to nothing
(e.g. 182 MB), and the system locks hard.

(We have not at this point tried to make a crash dump, but past attempts
to do so in similar situations have been failures.)

A fairly reliable signal that the system is about to lock up very
soon is that '::svc_pool nfs' will report a steadily increasing and often
very large number of 'Pending requests' (as well as all configured threads
being active). Our most recent lockup reported over 270,000 pending
requests. Our working hypothesis is that something in the NFS server code
is accepting (too many) incoming requests and filling all memory with them,
which then leads to the hard lock.

(It's possible that lower levels are also involved, e.g. TCP socket
receive buffers.)
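The warning sign described above, a pending-request count that keeps
climbing, can be turned into a simple check once the count is sampled at
regular intervals. The following is a minimal sketch under stated
assumptions: the class name, the window of samples and the floor value are
all invented for illustration, and how the samples are obtained (for
instance by scraping the mdb output) is deliberately left out, since the
exact output format is not shown in this thread.

```python
from collections import deque


class GrowthWatch:
    """Flag a counter that has risen on each of the last `window` intervals
    and has reached at least `floor` (both thresholds are assumptions)."""

    def __init__(self, window: int = 5, floor: int = 1000):
        self.window = window
        self.floor = floor
        # window + 1 samples are needed to observe `window` increases
        self.samples = deque(maxlen=window + 1)

    def add(self, value: int) -> bool:
        """Record one sample; return True if the alarm condition holds."""
        self.samples.append(value)
        if len(self.samples) <= self.window:
            return False  # not enough history yet
        s = list(self.samples)
        rising = all(later > earlier for earlier, later in zip(s, s[1:]))
        return rising and s[-1] >= self.floor


if __name__ == "__main__":
    watch = GrowthWatch(window=3, floor=100)
    for pending in (10, 50, 120, 400, 390, 500):
        print(pending, watch.add(pending))
```

Requiring an unbroken rise keeps short bursts from raising the alarm, at the
cost of missing growth that is steady on average but noisy from sample to
sample.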
Our current simplified test setup: the OmniOS machine has 64 GB RAM
with 2x 1G Ethernet for incoming NFS writes, writing to a single pool of
a mirrored pair of 2 TB WD SE SATA drives. There are six client machines
on one network, 25 on the other, and all client machines are running
multiple processes that are writing files of various sizes (from 50 MB
through several GB); all client machines are Ubuntu Linux. We believe
(but have not tested) that multiple clients and possibly multiple
processes are required to provoke this behavior. All NFS traffic is
NFS v3 over TCP.

Has anyone seen or heard of anything like this before?

Is there any way to limit the number of pending NFS requests that the
system will accept? Allowing 270,000 strikes me as kind of absurd.

(I don't suppose anyone with a test environment wants to take a shot
at reproducing this. For us, this happens within an hour or three of
running at this load, and generally happens faster with a smaller number
of NFS server threads.)

	- cks

From doug at will.to Tue May 5 00:50:23 2015
From: doug at will.to (Doug Hughes)
Date: Mon, 04 May 2015 20:50:23 -0400
Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained high write loads
In-Reply-To: <20150504214527.583867A061E@apps0.cs.toronto.edu>
References: <20150504214527.583867A061E@apps0.cs.toronto.edu>
Message-ID: <554813CF.9070800@will.to>

Yes, absolutely. We've run into this same problem, exactly as you
describe, in Solaris 10 (all versions).

You can catch it with a kernel dump, but you have to be wary and quick.
Keep a vmstat 3 open (or similar), and when free mem drops below 5GB or so,
be ready. As soon as you start seeing PO or DE, that's when to take your
crash dump.

Basically, what happens (from my understanding previously talking with an
Oracle kernel engineer) is that the kernel just allocates tons of NFS
buffers that keep building up and building up, and there's no mechanism for
getting rid of them in sufficient time. There really ought to be a RED or
some sort of back pressure, but it doesn't seem to be there.

You can make this problem less likely to occur by decreasing the
client-side rsize and wsize. Linux CentOS/RHEL 6 (and similar 2.6+ kernels)
exacerbates the problem by using 1MB rsize and wsize, which makes the
server burn through big NFS buffers, but if you force the clients to 32k or
perhaps even smaller, then you can push off the problem a bit.

Do you have a synthetic load test to reproduce it?

From danmcd at omniti.com Tue May 5 01:03:32 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 4 May 2015 21:03:32 -0400
Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained high write loads
In-Reply-To: <20150504214527.583867A061E@apps0.cs.toronto.edu>
References: <20150504214527.583867A061E@apps0.cs.toronto.edu>
Message-ID:

> On May 4, 2015, at 5:45 PM, Chris Siebenmann wrote:
>
> Is there any way to limit the number of pending NFS requests that the
> system will accept? Allowing 270,000 strikes me as kind of absurd.

I swear I've seen someone try to address this before. Maybe it's from my
Nexenta days. I will be querying the illumos developer's list (as I suspect
this affects the other distros as well if they haven't fixed it in their
local illumos children).
Thanks,
Dan

From matej at zunaj.si Tue May 5 07:46:01 2015
From: matej at zunaj.si (Matej Zerovnik)
Date: Tue, 05 May 2015 09:46:01 +0200
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes
Message-ID: <55487539.6030408@zunaj.si>

Hello!

I'm back again with a follow-up to 'iSCSI target hang, no way to restart
but server reboot', where we had trouble with the iSCSI target randomly
freezing and only a reboot helped.

Once we had had enough, we switched to new gear and software:
- new server: IBM xServer 3550 M4 with 265GB memory and SAS HBA LSI Logic
  SAS2308 controller
- installed the latest OmniOS LTS (r151014)
- updated the firmware on the LSI controller to version P19.

We still kept our SATA hard drives in a Supermicro JBOD with a SAS
expander.

After the upgrade, things worked smoothly for about a week with no errors in
the logs. After a week, some clients reported that their iSCSI drive had
failed and been remounted read-only. Weirdly, Nagios on our end did not
report any anomaly. I looked at the OmniOS logs, and there was nothing
connected with iSCSI in them at all. After a while, all clients connected
back, so the iSCSI target did not crash like it used to.

Looking at the clients' logs, it seems like there was a connection error:

Apr 29 10:33:53 317 kernel: connection1:0: detected conn error (1021)
Apr 29 10:33:54 317 iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
Apr 29 10:33:56 317 iscsid: connection1:0 is operational after recovery (1 attempts)
Apr 29 10:36:37 317 kernel: connection1:0: detected conn error (1021)
Apr 29 10:36:37 317 iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
Apr 29 10:36:40 317 iscsid: connection1:0 is operational after recovery (1 attempts)
Apr 29 10:36:50 317 kernel: sd 3:0:0:0: Device offlined - not ready after error recovery

As a test, I set up pings from my workstation to the client's server and to
our iSCSI target, to see if there is a network problem when iSCSI drops. A
week later it happened again. I looked at the ping requests and ping was
going through without a problem, and the Nagios check on the iSCSI port was
also working, yet our traffic graph shows a 100% drop:
http://i59.tinypic.com/59vl10.png

I failed to catch the server in the 'down' state to investigate.

Looking up on the internet the error that the client gets, it looks like
there could be too many commands sent and iSCSI timed out. Our pool is made
out of about 40 drives in one RAIDZ vdev, so we can't do many IOPS. I
therefore suspect that clients send too many IO requests, it takes the
server too long to respond, and iSCSI crashes. Does that sound like a
possible explanation?

Is there a way to measure how many iSCSI commands are sent to the drives,
to see if there is a peak when it crashes? Is there a way to measure how
busy the disks are and whether they really can't return data that fast?

What else should/can I check/monitor to find out what our problem is?
Matej

From dwq at xmweixun.com Tue May 5 07:54:18 2015
From: dwq at xmweixun.com (dwq at xmweixun.com)
Date: Tue, 5 May 2015 15:54:18 +0800
Subject: [OmniOS-discuss] Writeback Cache Auto disabled
Message-ID: <001901d08708$bff6b910$3fe42b30$@xmweixun.com>

Hi All,

When I present an LU to HP-UX or AIX, the LU's writeback cache is
automatically disabled. Why?

LU Name: 600144F00000000000005548DC360005
    Operational Status: Online
    Provider Name     : sbd
    Alias             : /dev/zvol/rdsk/wxnas/hpuxtest03
    View Entry Count  : 1
    Data File         : /dev/zvol/rdsk/wxnas/hpuxtest03
    Meta File         : not set
    Size              : 21474836480
    Block Size        : 512
    Management URL    : not set
    Vendor ID         : SUN
    Product ID        : COMSTAR
    Serial Num        : not set
    Write Protect     : Disabled
    Writeback Cache   : Disabled
    Access State      : Active

Thanks.

Version:
SunOS wxos1 5.11 omnios-b281e50 i86pc i386 i86pc

Deng

From mir at miras.org Tue May 5 09:21:11 2015
From: mir at miras.org (mir at miras.org)
Date: Tue, 05 May 2015 11:21:11 +0200
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes
In-Reply-To: <55487539.6030408@zunaj.si>
References: <55487539.6030408@zunaj.si>
Message-ID:

On 2015-05-05 09:46, Matej Zerovnik wrote:
>
> We still kept our SATA hard drives in Supermicro JBOD with SAS
> expander and SATA drives.
>

Your problem boils down to using SATA disks behind a SAS expander. Search
the omnios user list and you will find numerous reports that using SATA
disks behind a SAS expander causes weird behavior and instability.

The fact is that SATA disks are unsupported behind a SAS expander due to
incompatibility between the command sets of SAS and SATA. As an example,
SATA NCQ is not passed through the SAS expander, which could be the cause
of the strange iSCSI disconnects experienced on the client side.

----

This mail was virus scanned and spam checked before delivery.
This mail is also DKIM signed. See header dkim-signature.
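On the question raised earlier in this thread of how to tell whether the
disks are too busy to answer in time: on illumos, iostat -xn prints per-device
rows with the columns r/s, w/s, kr/s, kw/s, wait, actv, wsvc_t, asvc_t, %w
and %b, so a captured sample can be scanned for devices whose active service
time or busy percentage stands out. The sketch below is one possible way to
do that, not a recommended monitoring setup; the function names and both
thresholds (50 ms and 90 %) are assumptions picked only for illustration.

```python
def parse_iostat_xn(text):
    """Parse captured `iostat -xn` output into a list of per-device dicts."""
    rows, header = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column header ends in "device" and names asvc_t.
        if fields[-1] == "device" and "asvc_t" in fields:
            header = fields
            continue
        if header and len(fields) == len(header):
            try:
                values = [float(f) for f in fields[:-1]]
            except ValueError:
                continue  # banner or other non-data line
            row = dict(zip(header[:-1], values))
            row["device"] = fields[-1]
            rows.append(row)
    return rows


def slow_devices(rows, max_asvc_ms=50.0, max_busy_pct=90.0):
    """Return the sorted names of devices over either (assumed) threshold."""
    return sorted({r["device"] for r in rows
                   if r["asvc_t"] > max_asvc_ms or r["%b"] > max_busy_pct})


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): iostat -xn 10 3 | python iostat_check.py
    for name in slow_devices(parse_iostat_xn(sys.stdin.read())):
        print(name)
```

Because iostat's first report covers the time since boot, feeding in several
interval reports and looking at the later ones is likely to say more about
the moment of a stall than a single report would.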
From narayan.desai at gmail.com Tue May 5 14:32:19 2015 From: narayan.desai at gmail.com (Narayan Desai) Date: Tue, 5 May 2015 09:32:19 -0500 Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes In-Reply-To: References: <55487539.6030408@zunaj.si> Message-ID: And, if you don't have the luxury of discarding hardware and replacing it with a supported configuration, you might look at finding marginal drives, either via error counters displayed in iostat -En, or drives with really high service times (in iostat -xnz output). We found (on a similar setup), that being really aggressive about drive replacement helped a lot. If you have desktop sata drives, then the drive firmware is part of the problem. Desktop drives retry for quite a long time when they encounter errors, which produce really inconsistent performance profiles. When you aggregate into a raid set (including in ZFS) tail latencies really start to matter for performance, and the pool just starts going out to lunch for a long time. If you can figure out and replace the drive is causing the problem (even if it isn't causing any hard errors), the pool performance goes back to normal. -nld On Tue, May 5, 2015 at 4:21 AM, wrote: > On 2015-05-05 09:46, Matej Zerovnik wrote: > >> >> We still kept our SATA hard drives in Supermicro JBOD with SAS >> expander and SATA drives. >> >> Your problem boils down to using SATA disks in a SAS expander. Search > omnios user list and you will find numerous proofs that using SATA disks in > a SAS expander causes weird behaviors and instability. > > The fact is that SATA disks is unsupported in a SAS expander due to > incompatibility between command sets in SAS and SATA. As an example SATA > NCQ is not passed through the SAS expander which might could be the cause > of your strange iSCSI disconnects experienced on the client side. > > ---- > > This mail was virus scanned and spam checked before delivery. > This mail is also DKIM signed. 
See header dkim-signature.
>
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From richard.elling at richardelling.com Tue May 5 15:17:02 2015 From: richard.elling at richardelling.com (Richard Elling) Date: Tue, 5 May 2015 08:17:02 -0700 Subject: [OmniOS-discuss] Writeback Cache Auto disabled In-Reply-To: <001901d08708$bff6b910$3fe42b30$@xmweixun.com> References: <001901d08708$bff6b910$3fe42b30$@xmweixun.com> Message-ID: <70165454-5855-455D-BE88-8AB444934C45@RichardElling.com>

> On May 5, 2015, at 12:54 AM, wrote:
>
> Hi All,
> When I present lu to hpux or aix, lu writeback cache auto disabled,why?

In SCSI, initiators can change the write cache policy.
-- richard

>
> LU Name: 600144F00000000000005548DC360005
> Operational Status: Online
> Provider Name : sbd
> Alias : /dev/zvol/rdsk/wxnas/hpuxtest03
> View Entry Count : 1
> Data File : /dev/zvol/rdsk/wxnas/hpuxtest03
> Meta File : not set
> Size : 21474836480
> Block Size : 512
> Management URL : not set
> Vendor ID : SUN
> Product ID : COMSTAR
> Serial Num : not set
> Write Protect : Disabled
> Writeback Cache : Disabled
> Access State : Active
>
>
> Thanks.
>
> Version:
> SunOS wxos1 5.11 omnios-b281e50 i86pc i386 i86pc
> Deng
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

--
Richard.Elling at RichardElling.com
+1-760-896-4422

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From ci4 at outlook.com Tue May 5 15:20:18 2015 From: ci4 at outlook.com (Chavdar Ivanov) Date: Tue, 5 May 2015 15:20:18 +0000 Subject: [OmniOS-discuss] Can't update bloody Message-ID:

Hi,

I tried updating one of my VMs running omnios bloody.
Full refresh goes through, update fails because it can't find the manifest for package/pkg - confirmed via the web view.

Is there any present problem with the bloody repo?

Chavdar Ivanov

Sent from Windows Mail

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From vab at bb-c.de Tue May 5 15:35:48 2015 From: vab at bb-c.de (Volker A. Brandt) Date: Tue, 5 May 2015 17:35:48 +0200 Subject: [OmniOS-discuss] Can't update bloody In-Reply-To: References: Message-ID: <21832.58196.941714.304987@glaurung.bb-c.de>

> I tried updating one of my VMs running omnios bloody. Full refresh
> goes through, update fails because it can't find the manifest for
> package/pkg - confirmed via the web view.

FWIW, I update my local copy of the bloody repo each morning (the dead of night in the US :-), and here is what I have been seeing for a few days now:

Processing packages for publisher omnios ...
Retrieving and evaluating 2035 package(s)...
Download Manifests (1087/2035) \pkgrecv: http protocol error: code: 404 reason: Not Found
URL: 'http://pkg.omniti.com/omnios/bloody/omnios/manifest/0/package%2Fpkg at 0.5.11%2C5.11-0.151015%3A20150422T144502Z' (happened 4 times)

So I can confirm that the manifest file for package/pkg is physically missing.

Regards -- Volker
--
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY          Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A.
Brandt "When logic and proportion have fallen sloppy dead" From danmcd at omniti.com Tue May 5 15:41:53 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 5 May 2015 11:41:53 -0400 Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained high write loads In-Reply-To: References: <20150504214527.583867A061E@apps0.cs.toronto.edu> Message-ID: <704EC1DD-4C8D-4B17-A535-03BF737DB12F@omniti.com> > On May 4, 2015, at 9:03 PM, Dan McDonald wrote: > > I swear I've seen someone try to address this before. Maybe it's from my Nexenta days. I will be querying the illumos developer's list (as I suspect this affects the other distros as well if they haven't fixed it in their local illumos children). Folks who aren't Chris, see here: http://www.listbox.com/member/archive/182179/2015/05/sort/time_rev/page/1/entry/4:58/20150505065805:98F5E7D6-F315-11E4-87B6-F3BD9F3176C1/ The hard part will be testing this. I'm not sure I have the HW in-house to do it. I may need illumos community help. FYI, Dan From cks at cs.toronto.edu Tue May 5 15:48:05 2015 From: cks at cs.toronto.edu (Chris Siebenmann) Date: Tue, 05 May 2015 11:48:05 -0400 Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained high write loads In-Reply-To: danmcd's message of Tue, 05 May 2015 11:41:53 -0400. <704EC1DD-4C8D-4B17-A535-03BF737DB12F@omniti.com> Message-ID: <20150505154805.BC7367A05A8@apps0.cs.toronto.edu> > > On May 4, 2015, at 9:03 PM, Dan McDonald wrote: > > I swear I've seen someone try to address this before. Maybe it's from = > my Nexenta days. I will be querying the illumos developer's list (as I = > suspect this affects the other distros as well if they haven't fixed it = > in their local illumos children). > > Folks who aren't Chris, see here: > > http://www.listbox.com/member/archive/182179/2015/05/sort/time_rev/page/1/entry/4:58/20150505065805:98F5E7D6-F315-11E4-87B6-F3BD9F3176C1/ > > The hard part will be testing this. I'm not sure I have the HW in-house > to do it. 
I may need illumos community help. Since we have a test environment where we can reproduce this and a high interest in seeing it fixed, we can test new kernel packages and so on. (If given specific howto instructions we can probably build test kernels from source, but we've never tried to do any OmniOS source building before so it may take us some time to get up to speed on that. It'd be much easier to take a prebuilt test kernel, drop it in, and go.) - cks From danmcd at omniti.com Tue May 5 15:55:31 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 5 May 2015 11:55:31 -0400 Subject: [OmniOS-discuss] Small update - new lint libraries for some userland packages Message-ID: <47BB3308-83A6-4742-8B03-9E59DA2B8D72@omniti.com> I've just updated the openssl, zlib, trousers, and libxml2 packages to include lint libraries. This is a non-reboot update, but you may need to restart services requiring any of the aforementioned packages that are non-system-related. I'm making this change because later today, I'll be pushing back changes in illumos-gate that allow people to build stock illumos-gate on OmniOS r151014 or later. Technically, you can do it on 012 as well, but with a ton of lint. 014 and later will be able to build stock illumos-gate, which will make OmniOS a more attractive platform to illumos developers. This list will be Cc:ed on some of those illumos announcements. Thank you OmniOS community! Dan From danmcd at omniti.com Tue May 5 15:58:12 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 5 May 2015 11:58:12 -0400 Subject: [OmniOS-discuss] Can't update bloody In-Reply-To: <21832.58196.941714.304987@glaurung.bb-c.de> References: <21832.58196.941714.304987@glaurung.bb-c.de> Message-ID: <5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com> Hmmm. I'll be updating the whole wad of bloody later this week. Can y'all wait a couple of days? I want to include some illumos updates that I'm about to push this afternoon. 
Thanks, Dan From danmcd at omniti.com Tue May 5 16:02:54 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 5 May 2015 12:02:54 -0400 Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained high write loads In-Reply-To: <20150505154805.BC7367A05A8@apps0.cs.toronto.edu> References: <20150505154805.BC7367A05A8@apps0.cs.toronto.edu> Message-ID: > On May 5, 2015, at 11:48 AM, Chris Siebenmann wrote: > >>> On May 4, 2015, at 9:03 PM, Dan McDonald wrote: >>> I swear I've seen someone try to address this before. Maybe it's from = >> my Nexenta days. I will be querying the illumos developer's list (as I = >> suspect this affects the other distros as well if they haven't fixed it = >> in their local illumos children). >> >> Folks who aren't Chris, see here: >> >> http://www.listbox.com/member/archive/182179/2015/05/sort/time_rev/page/1/entry/4:58/20150505065805:98F5E7D6-F315-11E4-87B6-F3BD9F3176C1/ >> >> The hard part will be testing this. I'm not sure I have the HW in-house >> to do it. I may need illumos community help. > > Since we have a test environment where we can reproduce this and a high > interest in seeing it fixed, we can test new kernel packages and so on. > > (If given specific howto instructions we can probably build test kernels > from source, but we've never tried to do any OmniOS source building > before so it may take us some time to get up to speed on that. It'd be > much easier to take a prebuilt test kernel, drop it in, and go.) I can turn around the whole world in an hour or less and provide ONU images if your'e on 012 or 014. What revision are you running currently? I can also help you get a build-illumos-omnios up and running as well. Pick your favorite. I won't be able to do this until later this afternoon, however. I've some pressing illumos things first. 
Thanks, Dan From cks at cs.toronto.edu Tue May 5 16:15:54 2015 From: cks at cs.toronto.edu (Chris Siebenmann) Date: Tue, 05 May 2015 12:15:54 -0400 Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained high write loads In-Reply-To: danmcd's message of Tue, 05 May 2015 12:02:54 -0400. Message-ID: <20150505161554.CFA547A05A8@apps0.cs.toronto.edu> > >> The hard part will be testing this. I'm not sure I have the HW > >> in-house to do it. I may need illumos community help. > > > > Since we have a test environment where we can reproduce this and a > > high interest in seeing it fixed, we can test new kernel packages > > and so on. > > > > (If given specific howto instructions we can probably build test > > kernels from source, but we've never tried to do any OmniOS source > > building before so it may take us some time to get up to speed on > > that. It'd be much easier to take a prebuilt test kernel, drop it > > in, and go.) > > I can turn around the whole world in an hour or less and provide > ONU images if your'e on 012 or 014. What revision are you running > currently? I can also help you get a build-illumos-omnios up and > running as well. Pick your favorite. For now, the simplest thing is installable kernel images (I assume that's ONU images) for r151014, which is what our test environment is using now and what we'd wind up on with all of our production fileservers[*]. I won't be able to start any testing with the images until this afternoon at the earliest, so I don't think it's urgent to build them right away. Thanks for all of this! - cks [*: our production fileservers are currently at r151010 but we're already looking at an r151014 upgrade. having this fix as part of r151014 would make that upgrade definite, and there's other things in 14 that we want, eg >16 group support over NFS. ] From vab at bb-c.de Tue May 5 16:35:36 2015 From: vab at bb-c.de (Volker A. 
Brandt) Date: Tue, 5 May 2015 18:35:36 +0200 Subject: [OmniOS-discuss] Can't update bloody In-Reply-To: <5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com> References: <21832.58196.941714.304987@glaurung.bb-c.de> <5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com> Message-ID: <21832.61784.40290.77774@glaurung.bb-c.de>

> I'll be updating the whole wad of bloody later this week. Can y'all
> wait a couple of days? I want to include some illumos updates that
> I'm about to push this afternoon.

Sure. Thanks for all your good work!!

Regards -- Volker
--
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY          Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From matej at zunaj.si Tue May 5 16:48:28 2015 From: matej at zunaj.si (Matej Zerovnik) Date: Tue, 5 May 2015 18:48:28 +0200 Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes In-Reply-To: References: <55487539.6030408@zunaj.si> Message-ID: <201505051648.t45GmpA4025308@lists-il.int.omniti.net>

I will replace the hardware in about 4 months with all SAS drives, but I would love to have a working setup for the time being as well ;)

I looked at the SMART stats and there don't seem to be any errors. Also, no hard/soft/transfer errors are reported by any drive. I will take a look at service times tomorrow, and maybe put the drives into Graphite and look at them over a longer period.

I looked at iostat -x output today, and the stats for the pool itself reported 100% busy most of the time, 98-100% wait, 500-1300 transactions in the queue, around 500 active, ... The first line, which is the average since boot, says the average service time is around 1600 ms, which seems like a lot. Can it be due to a really big queue?
Would it help to create 5 10drives raidz pools instead of one with 50 drives? Matej -----Original Message----- From: "Narayan Desai" Sent: ?5.?5.?2015 16:32 To: "Michael Rasmussen" Cc: "Matej Zerovnik" ; "omnios-discuss" Subject: Re: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes And, if you don't have the luxury of discarding hardware and replacing it with a supported configuration, you might look at finding marginal drives, either via error counters displayed in iostat -En, or drives with really high service times (in iostat -xnz output). We found (on a similar setup), that being really aggressive about drive replacement helped a lot. If you have desktop sata drives, then the drive firmware is part of the problem. Desktop drives retry for quite a long time when they encounter errors, which produce really inconsistent performance profiles. When you aggregate into a raid set (including in ZFS) tail latencies really start to matter for performance, and the pool just starts going out to lunch for a long time. If you can figure out and replace the drive is causing the problem (even if it isn't causing any hard errors), the pool performance goes back to normal. -nld On Tue, May 5, 2015 at 4:21 AM, wrote: On 2015-05-05 09:46, Matej Zerovnik wrote: We still kept our SATA hard drives in Supermicro JBOD with SAS expander and SATA drives. Your problem boils down to using SATA disks in a SAS expander. Search omnios user list and you will find numerous proofs that using SATA disks in a SAS expander causes weird behaviors and instability. The fact is that SATA disks is unsupported in a SAS expander due to incompatibility between command sets in SAS and SATA. As an example SATA NCQ is not passed through the SAS expander which might could be the cause of your strange iSCSI disconnects experienced on the client side. ---- This mail was virus scanned and spam checked before delivery. This mail is also DKIM signed. See header dkim-signature. 
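The suggestion quoted above, looking for drives with really high service times in iostat -xnz output, could be sketched roughly as follows. This is only an illustration: the function name slow_disks and the default threshold are made up here, and it assumes the usual illumos iostat -xn column layout (r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device), with asvc_t in the eighth column.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read `iostat -xn` style output
# on stdin and print "device asvc_t" for every device whose average active
# service time (column 8, in milliseconds) exceeds the given threshold.
slow_disks() {
    threshold_ms="${1:-100}"
    # Data rows have exactly 11 fields; on the header row column 8 is the
    # text "asvc_t", which evaluates to 0 and is therefore never printed.
    awk -v t="$threshold_ms" 'NF == 11 && $8 + 0 > t { print $11, $8 }'
}

# Possible usage on the storage host (not executed here):
#   iostat -xnz 10 2 | slow_disks 200
```

Note that the first report iostat prints is the average since boot, so with an interval and count as in the usage comment, only the later report reflects current behaviour.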
_______________________________________________ OmniOS-discuss mailing list OmniOS-discuss at lists.omniti.com http://lists.omniti.com/mailman/listinfo/omnios-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From narayan.desai at gmail.com Tue May 5 17:24:05 2015 From: narayan.desai at gmail.com (Narayan Desai) Date: Tue, 5 May 2015 12:24:05 -0500 Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes In-Reply-To: <5548f474.24cab40a.12a2.7329SMTPIN_ADDED_MISSING@mx.google.com> References: <55487539.6030408@zunaj.si> <5548f474.24cab40a.12a2.7329SMTPIN_ADDED_MISSING@mx.google.com> Message-ID: If the theory is that you have a small number of drives causing trouble, then smaller raid sets would probably help, depending on the number of marginal devices you have. I bet that you see a few drives pegged when you start looking at device level service times. -nld On Tue, May 5, 2015 at 11:48 AM, Matej Zerovnik wrote: > I will replace the hardwarw in about 4 months with all SAS drives, but I > would love to have a working setup for the time being as well;) > > I looked at smart stats and there doesnt seem to be any errors. Also, no > hard/soft/transfer error reported by any drive. Will take a look at service > time tomorrow, maybe put the drives to graphite and look at them over a > longer period. > > I looked at iostat -x status today and stats for pool itself reported 100% > busy most of the time, 98-100% wait, 500-1300 transactions in queue, around > 500 active,... First line, that is average from boot, says avg service > time.is around 1600ms which seems like aaaalot. Can it be due to really > big queue? > > Would it help to create 5 10drives raidz pools instead of one with 50 > drives? 
> > Matej > ------------------------------ > From: Narayan Desai > Sent: ?5.?5.?2015 16:32 > To: Michael Rasmussen > Cc: Matej Zerovnik ; omnios-discuss > > Subject: Re: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and > then resumes > > And, if you don't have the luxury of discarding hardware and replacing it > with a supported configuration, you might look at finding marginal drives, > either via error counters displayed in iostat -En, or drives with really > high service times (in iostat -xnz output). We found (on a similar setup), > that being really aggressive about drive replacement helped a lot. > > If you have desktop sata drives, then the drive firmware is part of the > problem. Desktop drives retry for quite a long time when they encounter > errors, which produce really inconsistent performance profiles. When you > aggregate into a raid set (including in ZFS) tail latencies really start to > matter for performance, and the pool just starts going out to lunch for a > long time. If you can figure out and replace the drive is causing the > problem (even if it isn't causing any hard errors), the pool performance > goes back to normal. > -nld > > On Tue, May 5, 2015 at 4:21 AM, wrote: > >> On 2015-05-05 09:46, Matej Zerovnik wrote: >> >>> >>> We still kept our SATA hard drives in Supermicro JBOD with SAS >>> expander and SATA drives. >>> >>> Your problem boils down to using SATA disks in a SAS expander. Search >> omnios user list and you will find numerous proofs that using SATA disks in >> a SAS expander causes weird behaviors and instability. >> >> The fact is that SATA disks is unsupported in a SAS expander due to >> incompatibility between command sets in SAS and SATA. As an example SATA >> NCQ is not passed through the SAS expander which might could be the cause >> of your strange iSCSI disconnects experienced on the client side. >> >> ---- >> >> This mail was virus scanned and spam checked before delivery. >> This mail is also DKIM signed. 
See header dkim-signature. >> >> >> _______________________________________________ >> OmniOS-discuss mailing list >> OmniOS-discuss at lists.omniti.com >> http://lists.omniti.com/mailman/listinfo/omnios-discuss >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danmcd at omniti.com Tue May 5 17:32:17 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 5 May 2015 13:32:17 -0400 Subject: [OmniOS-discuss] FLAG DAY - 4719 affects nightly, package, and poold Message-ID: <1BAFD21A-EF91-4C6E-8A2A-4D2AB691574E@omniti.com> Illumos #4719 introduces a flag day for people who build illumos-gate. Starting now, you will need a Java Developers Kit (JDK) 7 or later. OpenIndiana 151a9 does NOT have this by default. Builders must either set JAVA_ROOT to a source of JDK7, or must have /usr/java populated with JDK7. Users still on JDK6 will see build errors in the packaging portions like such: ==== package build errors (non-DEBUG) ==== dmake: Warning: Command failed for target `packages.i386/developer-dtrace.dep' dmake: Warning: Command failed for target `packages.i386/service-network-dns-mdns.dep' dmake: Warning: Target `install' not remade because of errors . . . These are due to javadoc changes between 6 and 7. The dtrace and mdns packages generate javadoc, so their packaging manifests are updated to the 7 versions. ALSO, because poold defines JAVA_ROOT in its binaries, you must set JAVA_ROOT when building poold to match the runtime java on your ONU or otherwised-packaged target. For example, on my OI 151a9 test builder, I untarred an openjdk7 in /usr/jdk/instances/openjdk7/ and set JAVA_ROOT=/usr/jdk/instances/openjdk7/. IMPORTANT --> If you are an OI 151a9 user, and wish to use poold, installing openjdk7 in instances is not sufficient. You will need to set /usr/java to point to the openjdk7 instance as well. Illumos bug 5851 tracks this. This change is the last of several steps that will allow other platforms (like OmniOS, e.g.) 
to build stock illumos-gate. I will post a separate note on building illumos-gate on OmniOS.

Thanks to Richard PALO for creating these diffs in the first place.

Thanks!
Dan

From danmcd at omniti.com Tue May 5 17:32:21 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 5 May 2015 13:32:21 -0400 Subject: [OmniOS-discuss] HEADS UP -- illumos-gate can now be built on OmniOS r151014 or later Message-ID:

With the pushes of 4719, 5878, and 5879, one may now build stock illumos-gate on OmniOS, revisions r151014 or later. An OmniOS .env file will need certain variables set. I'm attaching a sample one I use, but I will go over the critical variables here.

To build on OmniOS, you must:

1.) Use "gcc only" build.

    # GCC-only, REQUIRED for building on OmniOS.
    __GNUC=""; export __GNUC
    CW_NO_SHADOW=1; export CW_NO_SHADOW

2.) Use ONLY_LINT_DEFS along with the sunstudio12.1 version of lint you can get as a binary with OmniOS:

    # Lint if you have the OmniOS-supplied usable-for-linting-only sunstudio12.1.
    ONLY_LINT_DEFS=-I${SPRO_ROOT}/sunstudio12.1/prod/include/lint; export ONLY_LINT_DEFS

3.) Change the GCC_ROOT to OmniOS's. You have to do this for illumos-omnios as well, so this shouldn't be shocking:

    GCC_ROOT=/opt/gcc-4.4.4/; export GCC_ROOT

4.) Set the PERL_* variables to cope with OmniOS using perl 5.16.1:

    # These are required for building on OmniOS.
    export PERL_VERSION=5.16.1
    export PERL_PKGVERS=-5161
    export PERL_ARCH=i86pc-solaris-thread-multi-64int

5.) Like with illumos-omnios, set ONNV_BUILDNUM to THE SAME release as you wish to ONU from. So if you're building mid-2015's bloody, set it to 151015; if you're ONUing the current stable, use 151014:

    # SET ONNV_BUILDNUM appropriately - to ONU r151014, set this to 151014.
    export ONNV_BUILDNUM=151014

Please note that if you build illumos-gate on OmniOS, you cannot ONU a non-OmniOS machine with the generated packages. You CAN ONU an OmniOS machine, however (just make sure ONNV_BUILDNUM matches the release you wish to ONU from).
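As a convenience, the variables listed in steps 1 to 5 could be checked before starting a build. The following is a minimal sketch only: check_omnios_env is a hypothetical name, it is not part of illumos-gate or of the attached env file, and it merely tests whether each variable is defined in the current shell (an empty value, as with __GNUC, counts as defined).

```shell
#!/bin/sh
# Hypothetical pre-flight check: report which of the variables discussed
# above are unset in the current shell. It does not validate their values.
check_omnios_env() {
    missing=""
    for v in __GNUC CW_NO_SHADOW ONLY_LINT_DEFS GCC_ROOT \
             PERL_VERSION PERL_PKGVERS PERL_ARCH ONNV_BUILDNUM; do
        # ${VAR+set} expands to "set" only when VAR is defined,
        # even if it is defined as the empty string.
        if eval "[ -z \"\${$v+set}\" ]"; then
            missing="$missing $v"
        fi
    done
    if [ -n "$missing" ]; then
        echo "unset:$missing"
        return 1
    fi
    echo "all set"
}

# Possible usage after sourcing an env file (file name is up to you):
#   . ./illumos-gate-omnios.env.sh && check_omnios_env
```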
Thanks! Dan -------------- next part -------------- A non-text attachment was scrubbed... Name: illumos-gate-omnios.env.sh Type: application/octet-stream Size: 8379 bytes Desc: not available URL: From doug at will.to Tue May 5 20:28:27 2015 From: doug at will.to (Doug Hughes) Date: Tue, 5 May 2015 16:28:27 -0400 Subject: [OmniOS-discuss] OmniOS NFS fileserver hanging under sustained high write loads In-Reply-To: <20150505161554.CFA547A05A8@apps0.cs.toronto.edu> References: <20150505161554.CFA547A05A8@apps0.cs.toronto.edu> Message-ID: I managed to get my system in a state with dd test across a bunch of client nodes (4k writes, many nodes in parallel, all to the same file -- by mistake, I meant to do many files), that all of the ttys except for /dev/console are stuck. It was showing signs of desparation swapping a few times, but it seems to have recovered from that. I have killed all of the write-intensive I/O and the host is mostly fine. Load has fallen, no residual I/O to disks, but the ttys that are not console are still stuck. I had quite a few pauses in my vmstat output while the memory exhaustion from write load took place. In contrast, just can't bring the machine down with read load, as you might expect. The arc does an admiral job with the 72GB ram and can totally fill up the 10g pipes outbound. It didn't lock up completely, but it came close, and there's some residual damage lingering with respect to the ttys. (config = 2xquad core Intel Sandybridge CPU in Sun X4275 with 72GB ram and 12x4TB disks) On Tue, May 5, 2015 at 12:15 PM, Chris Siebenmann wrote: > > >> The hard part will be testing this. I'm not sure I have the HW > > >> in-house to do it. I may need illumos community help. > > > > > > Since we have a test environment where we can reproduce this and a > > > high interest in seeing it fixed, we can test new kernel packages > > > and so on. 
> > > > > > (If given specific howto instructions we can probably build test > > > kernels from source, but we've never tried to do any OmniOS source > > > building before so it may take us some time to get up to speed on > > > that. It'd be much easier to take a prebuilt test kernel, drop it > > > in, and go.) > > > > I can turn around the whole world in an hour or less and provide > > ONU images if your'e on 012 or 014. What revision are you running > > currently? I can also help you get a build-illumos-omnios up and > > running as well. Pick your favorite. > > For now, the simplest thing is installable kernel images (I assume > that's ONU images) for r151014, which is what our test environment > is using now and what we'd wind up on with all of our production > fileservers[*]. I won't be able to start any testing with the images > until this afternoon at the earliest, so I don't think it's urgent to > build them right away. > > Thanks for all of this! > > - cks > [*: our production fileservers are currently at r151010 but we're > already looking at an r151014 upgrade. having this fix as part > of r151014 would make that upgrade definite, and there's other > things in 14 that we want, eg >16 group support over NFS. > ] > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From dwq at xmweixun.com Wed May 6 07:49:48 2015 From: dwq at xmweixun.com (dwq at xmweixun.com) Date: Wed, 6 May 2015 15:49:48 +0800 Subject: [OmniOS-discuss] Re: Writeback Cache Auto disabled In-Reply-To: <70165454-5855-455D-BE88-8AB444934C45@RichardElling.com> References: <001901d08708$bff6b910$3fe42b30$@xmweixun.com> <70165454-5855-455D-BE88-8AB444934C45@RichardElling.com> Message-ID: <002801d087d1$3a8454d0$af8cfe70$@xmweixun.com>

Hi Richard,

I use stmfadm modify-lu -p wcd=false LU Name to enable the write cache, but when a client reads from or writes to the LU, the LU status (writeback cache) changes back to disabled.

Best Regards,

Deng Wei Quan
Mob: +86 13906055059
Mail: dwq at xmweixun.com

From: dwq+auto_=dengweiquan=139.com at xmweixun.com [mailto:dwq+auto_=dengweiquan=139.com at xmweixun.com] On Behalf Of Richard Elling
Sent: 5 May 2015 23:17
To: dwq at xmweixun.com
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] Writeback Cache Auto disabled

On May 5, 2015, at 12:54 AM, wrote:

Hi All,
When I present lu to hpux or aix, lu writeback cache auto disabled,why?

In SCSI, initiators can change the write cache policy.
-- richard

LU Name: 600144F00000000000005548DC360005
Operational Status: Online
Provider Name : sbd
Alias : /dev/zvol/rdsk/wxnas/hpuxtest03
View Entry Count : 1
Data File : /dev/zvol/rdsk/wxnas/hpuxtest03
Meta File : not set
Size : 21474836480
Block Size : 512
Management URL : not set
Vendor ID : SUN
Product ID : COMSTAR
Serial Num : not set
Write Protect : Disabled
Writeback Cache : Disabled
Access State : Active

Thanks.

Version:
SunOS wxos1 5.11 omnios-b281e50 i86pc i386 i86pc
Deng

_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss at lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss

--
Richard.Elling at RichardElling.com
+1-760-896-4422

-------------- next part -------------- An HTML attachment was scrubbed...
URL:

From danmcd at omniti.com Wed May 6 14:13:39 2015 From: danmcd at omniti.com (Dan McDonald) Date: Wed, 6 May 2015 10:13:39 -0400 Subject: [OmniOS-discuss] New OmniOS bloody update Message-ID:

Based on omnios-build commit 69a5016 and illumos-omnios commit 385735e.

This is another whole-wad update, even package/pkg. I've fixed the repo to address the corruption a few of you were seeing. I think this is the result of me trying to avoid a "pkg update pkg" pre-step last time, but forgetting to rebuild the repository index afterwards. Anyway, you shouldn't see that now.

Since last time:

- You may now build stock illumos-gate on bloody thanks to the inclusion of lint libraries for some userland packages. (Same ones in r151014.)

- curl is now at 7.42.1.

- ISC DHCP is now at 4.3.2

- Upstream illumos-gate now has a few things we've had in OmniOS for a while. These also enable the building of stock illumos-gate on OmniOS.

- A few networking bugfixes are now upstreamed compliments of Joyent.

- A longstanding tar(1) bug where "tar -xzf" can fail has been fixed.

- You can now host an SMB/CIFS server in a non-global zone. (NOTE: sharemgr(1M) isn't zone-aware, so you will have to do it the old-fashioned way, see http://www.listbox.com/member/archive/182179/2015/04/sort/time_rev/page/4/entry/7:552/20150428134823:C190ED2C-EDCE-11E4-98D2-8987C5A0D07F/ for details.)

- Some miscellaneous bugfixes.

Happy updating!
Dan

From dain.bentley at gmail.com Wed May 6 17:12:09 2015 From: dain.bentley at gmail.com (Dain Bentley) Date: Wed, 6 May 2015 13:12:09 -0400 Subject: [OmniOS-discuss] restarting ssh on omnios doesn't load new parameters Message-ID:

So I enabled root on SSH to do some work and then disabled root login with PermitRootLogin no in my sshd_config and used the following:

svcadm restart network/ssh:default

Thing is root can still log in...is this a bug?

-------------- next part -------------- An HTML attachment was scrubbed...
URL: From jdg117 at elvis.arl.psu.edu Wed May 6 17:50:43 2015 From: jdg117 at elvis.arl.psu.edu (John D Groenveld) Date: Wed, 06 May 2015 13:50:43 -0400 Subject: [OmniOS-discuss] restarting ssh on omnios doesn't load new parameters In-Reply-To: Your message of "Wed, 06 May 2015 13:12:09 EDT." References: Message-ID: <201505061750.t46Hoh3m002331@elvis.arl.psu.edu> In message , Dain Bentley writes: >So I enabled root on SSH to do some work and then disabled root login with >PermitRootLogin no in my sshd_config and used the following: >svcadm restart network/ssh:default > >Thing is root can still log in...is this a bug? Did you SIGHUP the right PID? John groenveld at acm.org From dain.bentley at gmail.com Wed May 6 20:29:52 2015 From: dain.bentley at gmail.com (Dain Bentley) Date: Wed, 6 May 2015 16:29:52 -0400 Subject: [OmniOS-discuss] restarting ssh on omnios doesn't load new parameters In-Reply-To: <201505061750.t46Hoh3m002331@elvis.arl.psu.edu> References: <201505061750.t46Hoh3m002331@elvis.arl.psu.edu> Message-ID: I used svcadm On Wed, May 6, 2015 at 1:50 PM, John D Groenveld wrote: > In message < > CALthgeddT8zEskGqKL+VYWU6_mjBQCCKZgeO6J4xJjLhjniG9g at mail.gmail.com>, Dain > Bentley writes: > >So I enabled root on SSH to do some work and then disabled root login with > >PermitRootLogin no in my sshd_config and used the following: > >svcadm restart network/ssh:default > > > >Thing is root can still log in...is this a bug? > > Did you SIGHUP the right PID? > > John > groenveld at acm.org > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -------------- next part -------------- An HTML attachment was scrubbed... 
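One thing that may be worth ruling out in the PermitRootLogin question above is a second, uncommented PermitRootLogin line elsewhere in the file, since OpenSSH-derived servers generally honour the first value they read for a keyword. The sketch below only lists the candidate lines; root_login_lines is a hypothetical helper name, the default path /etc/ssh/sshd_config is an assumption, and whether a duplicate line is actually the cause here is not known from the thread.

```shell
#!/bin/sh
# Hypothetical helper: print every uncommented PermitRootLogin line in an
# sshd_config-style file, prefixed with its line number, so that duplicate
# or conflicting settings stand out.
root_login_lines() {
    conf="${1:-/etc/ssh/sshd_config}"
    grep -inE '^[[:space:]]*PermitRootLogin[[:space:]]' "$conf"
}

# Possible usage on the server (not executed here):
#   root_login_lines /etc/ssh/sshd_config
#   svcs -p network/ssh:default   # state, start time and PIDs of the service
```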
URL:

From jimklimov at cos.ru Wed May 6 21:35:52 2015
From: jimklimov at cos.ru (Jim Klimov)
Date: Wed, 06 May 2015 23:35:52 +0200
Subject: [OmniOS-discuss] [developer] FLAG DAY - 4719 affects nightly, package, and poold
In-Reply-To: <1BAFD21A-EF91-4C6E-8A2A-4D2AB691574E@omniti.com>
References: <1BAFD21A-EF91-4C6E-8A2A-4D2AB691574E@omniti.com>
Message-ID: <99D121CC-C484-4A74-8453-E8B42F5BF57E@cos.ru>

On 5 May 2015 at 19:32:17 CEST, Dan McDonald wrote:
>Illumos #4719 introduces a flag day for people who build illumos-gate.
>Starting now, you will need a Java Developers Kit (JDK) 7 or later.
>OpenIndiana 151a9 does NOT have this by default. Builders must either set
>JAVA_ROOT to a source of JDK7, or must have /usr/java populated with JDK7.
>
>Users still on JDK6 will see build errors in the packaging portions like
>such:
>
>==== package build errors (non-DEBUG) ====
>
>dmake: Warning: Command failed for target `packages.i386/developer-dtrace.dep'
>dmake: Warning: Command failed for target `packages.i386/service-network-dns-mdns.dep'
>dmake: Warning: Target `install' not remade because of errors
>
>. . .
>
>
>These are due to javadoc changes between 6 and 7. The dtrace and mdns
>packages generate javadoc, so their packaging manifests are updated to the 7
>versions.
>
>
>ALSO, because poold defines JAVA_ROOT in its binaries, you must set JAVA_ROOT
>when building poold to match the runtime java on your ONU or
>otherwise-packaged target. For example, on my OI 151a9 test builder, I
>untarred an openjdk7 in /usr/jdk/instances/openjdk7/ and set
>JAVA_ROOT=/usr/jdk/instances/openjdk7/.
>
>
>IMPORTANT --> If you are an OI 151a9 user, and wish to use poold, installing
>openjdk7 in instances is not sufficient. You will need to set /usr/java to
>point to the openjdk7 instance as well. Illumos bug 5851 tracks this.
>
>
>This change is the last of several steps that will allow other platforms
>(like OmniOS, e.g.) to build stock illumos-gate.
>I will post a separate note on building illumos-gate on OmniOS.
>
>
>Thanks to Richard PALO for creating these diffs in the first place.
>
>Thanks!
>Dan
>
>
>
>-------------------------------------------
>illumos-developer
>Archives: https://www.listbox.com/member/archive/182179/=now
>RSS Feed: https://www.listbox.com/member/archive/rss/182179/22416750-c03c8c44
>Modify Your Subscription: https://www.listbox.com/member/?member_id=22416750&id_secret=22416750-eb7e3ed7
>Powered by Listbox: http://www.listbox.com

Out of curiosity: Did you happen to check if the newer JDK magically solves the problem with Sun DHCP builds not producing functional bits of software for the past year or two?

Jim
--
Typos courtesy of K-9 Mail on my Samsung Android

From doug at will.to Thu May 7 01:53:48 2015
From: doug at will.to (Doug Hughes)
Date: Wed, 06 May 2015 21:53:48 -0400
Subject: [OmniOS-discuss] strange local repository corruption
Message-ID: <554AC5AC.2080409@will.to>

this is a relatively fresh install and not much going on on the machine, and I see somehow that the pkg repo got corrupted relatively recently on r14

t at x4275-3-15-20:/usr/local/orca-r535# pkg refresh

An error was encountered while attempting to read image state information
to perform the requested operation. Details follow:

Catalog file '/var/pkg/state/installed/catalog.attrs' is invalid.
Use 'pkgrepo rebuild' to create a new package catalog.
root at x4275-3-15-20:/usr/local/orca-r535#
ot at x4275-3-15-20:/usr/local/orca-r535# pkgrepo rebuild
pkgrepo rebuild: A package repository location must be provided using -s.
Try `pkgrepo --help or -?' for more information.
root at x4275-3-15-20:/usr/local/orca-r535# ls -l /var/pkg/state/known/
total 2
-rw-r--r-- 1 root root 0 May 6 01:50 catalog.attrs
-rw-r--r-- 1 root root 0 May 6 01:50 catalog.base.C
-rw-r--r-- 1 root root 0 May 6 01:50 catalog.dependency.C
-rw-r--r-- 1 root root 0 May 6 01:50 catalog.summary.C
root at x4275-3-15-20:/usr/local/orca-r535#

weird huh?
any easy way to recover from this?

From danmcd at omniti.com Thu May 7 03:00:39 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 6 May 2015 23:00:39 -0400
Subject: [OmniOS-discuss] strange local repository corruption
In-Reply-To: <554AC5AC.2080409@will.to>
References: <554AC5AC.2080409@will.to>
Message-ID:

pkgrepo rebuild -s

Dan

Sent from my iPhone (typos, autocorrect, and all)

> On May 6, 2015, at 9:53 PM, Doug Hughes wrote:
>
> this is a relatively fresh install and not much going on on the machine, and I see somehow that the pkg repo got corrupted relatively recently on r14
>
> t at x4275-3-15-20:/usr/local/orca-r535# pkg refresh
>
> An error was encountered while attempting to read image state information
> to perform the requested operation. Details follow:
>
> Catalog file '/var/pkg/state/installed/catalog.attrs' is invalid.
> Use 'pkgrepo rebuild' to create a new package catalog.
> root at x4275-3-15-20:/usr/local/orca-r535#
> ot at x4275-3-15-20:/usr/local/orca-r535# pkgrepo rebuild
> pkgrepo rebuild: A package repository location must be provided using -s.
> Try `pkgrepo --help or -?' for more information.
> root at x4275-3-15-20:/usr/local/orca-r535# ls -l /var/pkg/state/known/
> total 2
> -rw-r--r-- 1 root root 0 May 6 01:50 catalog.attrs
> -rw-r--r-- 1 root root 0 May 6 01:50 catalog.base.C
> -rw-r--r-- 1 root root 0 May 6 01:50 catalog.dependency.C
> -rw-r--r-- 1 root root 0 May 6 01:50 catalog.summary.C
> root at x4275-3-15-20:/usr/local/orca-r535#
>
> weird huh? any easy way to recover from this?
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

From doug at will.to Thu May 7 03:08:31 2015
From: doug at will.to (Doug Hughes)
Date: Wed, 06 May 2015 23:08:31 -0400
Subject: [OmniOS-discuss] strange local repository corruption
In-Reply-To:
References: <554AC5AC.2080409@will.to>
Message-ID: <554AD72F.8090503@will.to>

Didn't work. still 0. I ended up copying the repo from another machine built at the same time and then doing a refresh.

On 5/6/2015 11:00 PM, Dan McDonald wrote:
> pkgrepo rebuild -s
>
> Dan
>
> Sent from my iPhone (typos, autocorrect, and all)
>
>> On May 6, 2015, at 9:53 PM, Doug Hughes wrote:
>>
>> this is a relatively fresh install and not much going on on the machine, and I see somehow that the pkg repo got corrupted relatively recently on r14
>>
>> t at x4275-3-15-20:/usr/local/orca-r535# pkg refresh
>>
>> An error was encountered while attempting to read image state information
>> to perform the requested operation. Details follow:
>>
>> Catalog file '/var/pkg/state/installed/catalog.attrs' is invalid.
>> Use 'pkgrepo rebuild' to create a new package catalog.
>> root at x4275-3-15-20:/usr/local/orca-r535#
>> ot at x4275-3-15-20:/usr/local/orca-r535# pkgrepo rebuild
>> pkgrepo rebuild: A package repository location must be provided using -s.
>> Try `pkgrepo --help or -?' for more information.
>> root at x4275-3-15-20:/usr/local/orca-r535# ls -l /var/pkg/state/known/
>> total 2
>> -rw-r--r-- 1 root root 0 May 6 01:50 catalog.attrs
>> -rw-r--r-- 1 root root 0 May 6 01:50 catalog.base.C
>> -rw-r--r-- 1 root root 0 May 6 01:50 catalog.dependency.C
>> -rw-r--r-- 1 root root 0 May 6 01:50 catalog.summary.C
>> root at x4275-3-15-20:/usr/local/orca-r535#
>>
>> weird huh? any easy way to recover from this?
>>
>> _______________________________________________
>> OmniOS-discuss mailing list
>> OmniOS-discuss at lists.omniti.com
>> http://lists.omniti.com/mailman/listinfo/omnios-discuss

From paladinemishakal at gmail.com Thu May 7 10:36:29 2015
From: paladinemishakal at gmail.com (Lawrence Giam)
Date: Thu, 7 May 2015 18:36:29 +0800
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and uname
In-Reply-To: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
References: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
Message-ID:

With the current R151014 having NFS issue, I may decide to stay on R151012 for a while but people are forgetful, so is there a way to update a file to reflect the build version?

On Tue, May 5, 2015 at 1:43 AM, Dan McDonald wrote:

>
> > On May 4, 2015, at 1:10 PM, Davide Poletto
> wrote:
> >
> > Just to say I've noticed that uname -v reports "illumos-omnios" on a
> > OmniOS 151012 which was "omnios-10b9c79" after I updated it today
> > (packages released on 17.04.2015 at official repository):
> >
> > OmniOS 5.11 omnios-10b9c79 September 2014
> > root at nas:/root#
> >
> > OmniOS 5.11 illumos-omnios April 2015
> > root at nas:/root#
> >
> > Is that OK/by Design?
>
> That was my fault during the kernel build. I had the wrong variable set
> in my .env file.
>
> > Just for reference on OmniOS 151014, after the same big set of updates
> > (released the same day, 17.04.2015), the uname -v changed from
> > "omnios-a708424" (from its ISO install) to "omnios-170cea2".
>
> Yes, I believe only r151012 was affected poorly by this. Since 012 is in
> its last 6 months of support life, I'm not particularly concerned.
>
> Dan
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From danmcd at omniti.com Thu May 7 14:00:22 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 7 May 2015 10:00:22 -0400
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and uname
In-Reply-To:
References: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com>
Message-ID: <565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com>

> On May 7, 2015, at 6:36 AM, Lawrence Giam wrote:
>
> With the current R151014 having NFS issue,

What NFS issue? Are you confusing '012 and '014? '012 had some lock manager corner-cases, but '014 has fixed that.

Dan

From danmcd at omniti.com Thu May 7 14:01:13 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 7 May 2015 10:01:13 -0400
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and uname
In-Reply-To: <565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com>
References: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com> <565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com>
Message-ID: <598573DD-D37C-45C2-873B-6252CD729775@omniti.com>

> On May 7, 2015, at 10:00 AM, Dan McDonald wrote:
>
>
>> On May 7, 2015, at 6:36 AM, Lawrence Giam wrote:
>>
>> With the current R151014 having NFS issue,
>
> What NFS issue? Are you confusing '012 and '014? '012 had some lock manager corner-cases, but '014 has fixed that.

And if you're talking about the one Chris S. has reported --> it's also present in 010 and 012, and likely earlier as well.

Dan

From john.barfield at bissinc.com Thu May 7 19:22:56 2015
From: john.barfield at bissinc.com (John Barfield)
Date: Thu, 7 May 2015 19:22:56 +0000
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and uname
In-Reply-To: <598573DD-D37C-45C2-873B-6252CD729775@omniti.com>
References: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com> <565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com> <598573DD-D37C-45C2-873B-6252CD729775@omniti.com>
Message-ID: <05A50C77-7549-4DD0-9886-BF30D5904131@bissinc.com>

Hey Dan has the statd NFS bug been resolved in this release?
https://www.illumos.org/issues/4518

If not, I already built the binary if it would be helpful to anyone.

John Barfield / Sr Principal Engineer
+1 (214) 425-0783/ john.barfield at bissinc.com
BISS, Inc. Office: +1 (214) 506-8354

4925 Greenville Ave Suite 900
Dallas, TX 75206
support.bissinc.com
This e-mail message may contain confidential or legally privileged information and is intended only for the use of the intended recipient(s). Any unauthorized disclosure, dissemination, distribution, copying or the taking of any action in reliance on the information herein is prohibited. E-mails are not secure and cannot be guaranteed to be error free as they can be intercepted, amended, or contain viruses. Anyone who communicates with us by e-mail is deemed to have accepted these risks. Company Name is not responsible for errors or omissions in this message and denies any responsibility for any damage arising from the use of e-mail. Any opinion and other statement contained in this message and any attachment are solely those of the author and do not necessarily represent those of the company.

On 5/7/15, 9:01 AM, "Dan McDonald" wrote:

>
>> On May 7, 2015, at 10:00 AM, Dan McDonald wrote:
>>
>>
>>> On May 7, 2015, at 6:36 AM, Lawrence Giam wrote:
>>>
>>> With the current R151014 having NFS issue,
>>
>> What NFS issue? Are you confusing '012 and '014? '012 had some lock
>> manager corner-cases, but '014 has fixed that.
>
>And if you're talking about the one Chris S. has reported --> it's also
>present in 010 and 012, and likely earlier as well.
>
>Dan
>
>_______________________________________________
>OmniOS-discuss mailing list
>OmniOS-discuss at lists.omniti.com
>http://lists.omniti.com/mailman/listinfo/omnios-discuss

From danmcd at omniti.com Thu May 7 19:26:04 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 7 May 2015 15:26:04 -0400
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and uname
In-Reply-To: <05A50C77-7549-4DD0-9886-BF30D5904131@bissinc.com>
References: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com> <565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com> <598573DD-D37C-45C2-873B-6252CD729775@omniti.com> <05A50C77-7549-4DD0-9886-BF30D5904131@bissinc.com>
Message-ID: <91E2CF48-BD99-4D1F-AA17-D21C48BE5F2A@omniti.com>

> On May 7, 2015, at 3:22 PM, John Barfield wrote:
>
> Hey Dan has the statd NFS bug been resolved in this release?
>
> https://www.illumos.org/issues/4518

4518 is in OmniOS r151014:

https://github.com/omniti-labs/illumos-omnios/commit/98573c1925f3692d1e8ea9eb018cb915fc0becc5

And:

bloody(~/ws/illumos-omnios)[0]% git branch -r --contains 98573c1925f3692d1e8ea9eb018cb915fc0becc5
  origin/HEAD -> origin/master
  origin/master
  origin/r151014
  origin/upstream
bloody(~/ws/illumos-omnios)[0]%

Dan

From john.barfield at bissinc.com Thu May 7 19:26:53 2015
From: john.barfield at bissinc.com (John Barfield)
Date: Thu, 7 May 2015 19:26:53 +0000
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and uname
In-Reply-To: <91E2CF48-BD99-4D1F-AA17-D21C48BE5F2A@omniti.com>
References: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com> <565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com> <598573DD-D37C-45C2-873B-6252CD729775@omniti.com> <05A50C77-7549-4DD0-9886-BF30D5904131@bissinc.com> <91E2CF48-BD99-4D1F-AA17-D21C48BE5F2A@omniti.com>
Message-ID: <99021AD4-5AE3-41D0-BB68-494DD57059FD@bissinc.com>

Okay awesome! I've been holding back on upgrading but now I probably will.

John Barfield / Sr Principal Engineer
+1 (214) 425-0783/ john.barfield at bissinc.com
BISS, Inc.
Office: +1 (214) 506-8354

4925 Greenville Ave Suite 900
Dallas, TX 75206
support.bissinc.com

On 5/7/15, 2:26 PM, "Dan McDonald" wrote:

>
>> On May 7, 2015, at 3:22 PM, John Barfield wrote:
>>
>> Hey Dan has the statd NFS bug been resolved in this release?
>>
>> https://www.illumos.org/issues/4518
>
>4518 is in OmniOS r151014:
>
>https://github.com/omniti-labs/illumos-omnios/commit/98573c1925f3692d1e8ea9eb018cb915fc0becc5
>
>And:
>
>bloody(~/ws/illumos-omnios)[0]% git branch -r --contains 98573c1925f3692d1e8ea9eb018cb915fc0becc5
> origin/HEAD -> origin/master
> origin/master
> origin/r151014
> origin/upstream
>bloody(~/ws/illumos-omnios)[0]%
>
>
>Dan
>
>

From doug at will.to Thu May 7 20:33:30 2015
From: doug at will.to (Doug Hughes)
Date: Thu, 7 May 2015 16:33:30 -0400
Subject: [OmniOS-discuss] OmniOS 151012 latest 17.04.2015 updates and uname
In-Reply-To: <99021AD4-5AE3-41D0-BB68-494DD57059FD@bissinc.com>
References: <4D8E4333-31F3-4B55-B6D3-98901CB57F0D@omniti.com> <565EE100-E1CF-4F07-AF0E-A6101F20EFDF@omniti.com> <598573DD-D37C-45C2-873B-6252CD729775@omniti.com> <05A50C77-7549-4DD0-9886-BF30D5904131@bissinc.com> <91E2CF48-BD99-4D1F-AA17-D21C48BE5F2A@omniti.com> <99021AD4-5AE3-41D0-BB68-494DD57059FD@bissinc.com>
Message-ID:

I can confirm that no more nlockmgr's for us since going with r14!

On Thu, May 7, 2015 at 3:26 PM, John Barfield wrote:

> Okay awesome! I've been holding back on upgrading but now I probably will.
>
>
> John Barfield / Sr Principal Engineer
> +1 (214) 425-0783/ john.barfield at bissinc.com
> BISS, Inc. Office: +1 (214) 506-8354
>
> 4925 Greenville Ave Suite 900
> Dallas, TX 75206
> support.bissinc.com
>
> On 5/7/15, 2:26 PM, "Dan McDonald" wrote:
>
> >
> >> On May 7, 2015, at 3:22 PM, John Barfield wrote:
> >>
> >> Hey Dan has the statd NFS bug been resolved in this release?
> >>
> >> https://www.illumos.org/issues/4518
> >
> >4518 is in OmniOS r151014:
> >
> >https://github.com/omniti-labs/illumos-omnios/commit/98573c1925f3692d1e8ea9eb018cb915fc0becc5
> >
> >And:
> >
> >bloody(~/ws/illumos-omnios)[0]% git branch -r --contains 98573c1925f3692d1e8ea9eb018cb915fc0becc5
> > origin/HEAD -> origin/master
> > origin/master
> > origin/r151014
> > origin/upstream
> >bloody(~/ws/illumos-omnios)[0]%
> >
> >
> >Dan
> >
> >
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From skiselkov.ml at gmail.com Fri May 8 16:48:26 2015
From: skiselkov.ml at gmail.com (Saso Kiselkov)
Date: Fri, 08 May 2015 18:48:26 +0200
Subject: [OmniOS-discuss] What repos do people use to build a *AMP server?
Message-ID: <554CE8DA.6010504@gmail.com>

I've decided to try and update my r151006 box to something newer, seeing
as r151014 just came out and it's supposed to be LTS. Trouble is, I'm
trying to build a *AMP box and I can't find any prebuilt packages for it
in any of these repos:
http://omnios.omniti.com/wiki.php/Packaging
What do you guys use for getting pre-built software? Do all people here
just roll their own?

Also, allow me to say, I *hate* consolidations and the way they lock
accessible package versions.
Where are the days when OSes used to be backwards-compatible?

Cheers,
--
Saso

From chip at innovates.com Fri May 8 16:56:30 2015
From: chip at innovates.com (Schweiss, Chip)
Date: Fri, 8 May 2015 11:56:30 -0500
Subject: [OmniOS-discuss] What repos do people use to build a *AMP server?
In-Reply-To: <554CE8DA.6010504@gmail.com>
References: <554CE8DA.6010504@gmail.com>
Message-ID:

I've done really well with the OpenCSW packages on OmniOS.

-Chip
On May 8, 2015 11:50 AM, "Saso Kiselkov" wrote:

> I've decided to try and update my r151006 box to something newer, seeing
> as r151014 just came out and it's supposed to be LTS. Trouble is, I'm
> trying to build a *AMP box and I can't find any prebuilt packages for it
> in any of these repos:
> http://omnios.omniti.com/wiki.php/Packaging
> What do you guys use for getting pre-built software? Do all people here
> just roll their own?
>
> Also, allow me to say, I *hate* consolidations and the way they lock
> accessible package versions. Where are the days when OSes used to be
> backwards-compatible?
>
> Cheers,
> --
> Saso
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mir at miras.org Fri May 8 17:11:02 2015
From: mir at miras.org (Michael Rasmussen)
Date: Fri, 8 May 2015 19:11:02 +0200
Subject: [OmniOS-discuss] What repos do people use to build a *AMP server?
In-Reply-To:
References: <554CE8DA.6010504@gmail.com>
Message-ID: <20150508191102.3e6089fc@sleipner.datanom.net>

On Fri, 8 May 2015 11:56:30 -0500 "Schweiss, Chip" wrote:

> I've done really well with the OpenCSW packages on OmniOS.
>
There is also a fine repository here:
http://pkg.niksula.hut.fi/en/index.shtml

--
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael rasmussen cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir datanom net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir miras org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
/usr/games/fortune -es says:
It's hard to argue that God hated Oklahoma. If He didn't, why is it so
close to Texas?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL:

From skiselkov.ml at gmail.com Fri May 8 17:14:01 2015
From: skiselkov.ml at gmail.com (Saso Kiselkov)
Date: Fri, 08 May 2015 19:14:01 +0200
Subject: [OmniOS-discuss] What repos do people use to build a *AMP server?
In-Reply-To:
References: <554CE8DA.6010504@gmail.com>
Message-ID: <554CEED9.4020906@gmail.com>

On 5/8/15 6:56 PM, Schweiss, Chip wrote:
> I've done really well with the OpenCSW packages on OmniOS.

Thanks, seems to be working pretty well. Still, lamentable that there's
no IPS mirrors around (although, given how IPS can be obnoxious, I'm not
surprised).

Cheers,
--
Saso

From alka at hfg-gmuend.de Fri May 8 20:27:33 2015
From: alka at hfg-gmuend.de (Günther Alka)
Date: Fri, 8 May 2015 22:27:33 +0200
Subject: [OmniOS-discuss] What repos do people use to build a *AMP server?
In-Reply-To: <554CE8DA.6010504@gmail.com>
References: <554CE8DA.6010504@gmail.com>
Message-ID: <8D2EC53B-E904-402A-9E2B-974382467069@hfg-gmuend.de>

You can use the pkgin repo from SmartOS as it is the most complete source.
It is used by the amp setup script provided as a community add-on for napp-it (you do not need napp-it to use the script)
http://napp-it.org/extensions/amp_en.html

Gea

> On 08.05.2015 at 18:48, Saso Kiselkov wrote:
>
> I've decided to try and update my r151006 box to something newer, seeing
> as r151014 just came out and it's supposed to be LTS. Trouble is, I'm
> trying to build a *AMP box and I can't find any prebuilt packages for it
> in any of these repos:
> http://omnios.omniti.com/wiki.php/Packaging
> What do you guys use for getting pre-built software? Do all people here
> just roll their own?
>
> Also, allow me to say, I *hate* consolidations and the way they lock
> accessible package versions. Where are the days when OSes used to be
> backwards-compatible?
>
> Cheers,
> --
> Saso
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From richard.elling at richardelling.com Fri May 8 20:52:39 2015
From: richard.elling at richardelling.com (Richard Elling)
Date: Fri, 8 May 2015 13:52:39 -0700
Subject: [OmniOS-discuss] Writeback Cache Auto disabled
In-Reply-To: <002801d087d1$3a8454d0$af8cfe70$@xmweixun.com>
References: <001901d08708$bff6b910$3fe42b30$@xmweixun.com> <70165454-5855-455D-BE88-8AB444934C45@RichardElling.com> <002801d087d1$3a8454d0$af8cfe70$@xmweixun.com>
Message-ID:

> On May 6, 2015, at 12:49 AM, dwq at xmweixun.com wrote:
>
> Hi Richard
> I used stmfadm modify-lu -p wcd=false LU Name to enable the write cache, but when a client reads or writes I/O on the LU, the LU status (Writeback Cache) changes to Disabled again.

This is correct. Initiators can override the target's default.
 -- richard

>
>
>
> Best Regards,
> Deng Wei Quan
> Mob: +86 13906055059
> Mail: dwq at xmweixun.com
>
> From: dwq+auto_=dengweiquan=139.com at xmweixun.com [mailto:dwq+auto_=dengweiquan=139.com at xmweixun.com] On Behalf Of Richard Elling
> Sent: 5 May 2015 23:17
> To: dwq at xmweixun.com
> Cc: omnios-discuss at lists.omniti.com
> Subject: Re: [OmniOS-discuss] Writeback Cache Auto disabled
>
>
>> On May 5, 2015, at 12:54 AM, wrote:
>>
>> Hi All,
>> When I present an LU to HP-UX or AIX, the LU writeback cache is automatically disabled. Why?
>
> In SCSI, initiators can change the write cache policy.
> -- richard
>
>
>>
>> LU Name: 600144F00000000000005548DC360005
>> Operational Status: Online
>> Provider Name : sbd
>> Alias : /dev/zvol/rdsk/wxnas/hpuxtest03
>> View Entry Count : 1
>> Data File : /dev/zvol/rdsk/wxnas/hpuxtest03
>> Meta File : not set
>> Size : 21474836480
>> Block Size : 512
>> Management URL : not set
>> Vendor ID : SUN
>> Product ID : COMSTAR
>> Serial Num : not set
>> Write Protect : Disabled
>> Writeback Cache : Disabled
>> Access State : Active
>>
>>
>> Thanks.
>>
>> Version:
>> SunOS wxos1 5.11 omnios-b281e50 i86pc i386 i86pc
>> Deng
>>
>> _______________________________________________
>> OmniOS-discuss mailing list
>> OmniOS-discuss at lists.omniti.com
>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
> --
>
> Richard.Elling at RichardElling.com
> +1-760-896-4422
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From richard.elling at richardelling.com Sat May 9 00:49:52 2015
From: richard.elling at richardelling.com (Richard Elling)
Date: Fri, 8 May 2015 17:49:52 -0700
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes
In-Reply-To: <201505051648.t45GmpA4025308@lists-il.int.omniti.net>
References: <55487539.6030408@zunaj.si> <201505051648.t45GmpA4025308@lists-il.int.omniti.net>
Message-ID: <40C78E86-F32D-4588-AF98-EB9820019960@richardelling.com>

> On May 5, 2015, at 9:48 AM, Matej Zerovnik wrote:
>
> I will replace the hardware in about 4 months with all SAS drives, but I would love to have a working setup for the time being as well;)
>
> I looked at smart stats and there doesn't seem to be any errors. Also, no hard/soft/transfer error reported by any drive. Will take a look at service time tomorrow, maybe put the drives to graphite and look at them over a longer period.
>
> I looked at iostat -x status today and stats for pool itself reported 100% busy most of the time, 98-100% wait, 500-1300 transactions in queue, around 500 active,... First line, that is average from boot, says avg service time is around 1600ms which seems like a lot. Can it be due to really big queue?
>
> Would it help to create 5 10-drive raidz pools instead of one with 50 drives?

It is a bad idea to build a single raidz set with 50 drives. Very bad. Hence the zpool man page says, "The recommended number is between 3 and 9 to help increase performance." But this recommendation applies to reliability, too.
 -- richard

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dave-oo at pooserville.com Sat May 9 17:38:25 2015
From: dave-oo at pooserville.com (Dave Pooser)
Date: Sat, 09 May 2015 12:38:25 -0500
Subject: [OmniOS-discuss] r151012 is coming...
In-Reply-To:
References:
Message-ID:

On 9/2/14, 1:22 PM, "Dan McDonald" wrote:

>This includes HW goodies like LSI 3008-based 12G SAS (albeit not at
>optimal performance yet)

How sub-optimal is the LSI 3008-based support currently (as in 014)? Are
we talking "faster than 6G SAS but not as fast as it should be" or "same
speed as 6G SAS" or something else? The application would be a storage
server running 24-36 hard drives as multiple RAIDz2 devices, used mostly
for archiving large video files, so ridiculous performance isn't necessary
-- mostly I'm looking at SuperMicro boards that already have the 3008
inside and want to know if I need to consider adding a better-supported
HBA instead.
--
Dave Pooser
Cat-Herder-in-Chief, Pooserville.com

From danmcd at omniti.com Sat May 9 17:45:40 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Sat, 9 May 2015 13:45:40 -0400
Subject: [OmniOS-discuss] r151012 is coming...
In-Reply-To:
References:
Message-ID:

I *believe* it's more than 6G, but not quite 12 yet. I didn't have any 3008 boards in house to see, but the illumos community did. You may be better off asking an illumos mailing list that question.

I'd go with a 3008 on the board one, just make sure it has the IT firmware, and the correct (not latest) version. I think 28 is the known good version. Storage types here can confirm/deny that data point.

Dan

Sent from my iPhone (typos, autocorrect, and all)

> On May 9, 2015, at 1:38 PM, Dave Pooser wrote:
>
>> On 9/2/14, 1:22 PM, "Dan McDonald" wrote:
>>
>> This includes HW goodies like LSI 3008-based 12G SAS (albeit not at
>> optimal performance yet)
>
> How sub-optimal is the LSI 3008-based support currently (as in 014)? Are
> we talking "faster than 6G SAS but not as fast as it should be" or "same
> speed as 6G SAS" or something else?
The application would be a storage
> server running 24-36 hard drives as multiple RAIDz2 devices, used mostly
> for archiving large video files, so ridiculous performance isn't necessary
> -- mostly I'm looking at SuperMicro boards that already have the 3008
> inside and want to know if I need to consider adding a better-supported
> HBA instead.
> --
> Dave Pooser
> Cat-Herder-in-Chief, Pooserville.com
>
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

From nagele at wildbit.com Sat May 9 18:06:51 2015
From: nagele at wildbit.com (Chris Nagele)
Date: Sat, 9 May 2015 14:06:51 -0400
Subject: [OmniOS-discuss] High density 2.5" chassis
Message-ID:

Hi all. Continuing on my all SSD discussion, I am looking for some
recommendations on a new Supermicro
chassis for our file servers. So far I have been looking at this
thing:

http://www.supermicro.com/products/chassis/4U/417/SC417E16-R1400LP.cfm

Does anyone have experience with this? If so, what would you recommend
for a motherboard and HBA to support all of the disks? We've
traditionally used the X9DRD-7LN4F-JBOD or the X9DRi-F with a LSI
9211-8i HBA.

Thanks,
Chris

From chip at innovates.com Sat May 9 19:28:36 2015
From: chip at innovates.com (Schweiss, Chip)
Date: Sat, 9 May 2015 14:28:36 -0500
Subject: [OmniOS-discuss] High density 2.5" chassis
In-Reply-To:
References:
Message-ID:

I have an SSD server in one of those chassis. Here's a write-up about it on my blog, there are 3 postings about it.

http://www.bigdatajunkie.com/index.php/9-solaris/zfs/10-short-stroking-consumer-ssds

Not necessarily a build for everyone, but it has been absolutely awesome for our use. After a few bumps at the beginning and giving up on HA on this server, it has been rock solid.

Many will swear against the interposers, but combined with Samsung SSDs they have worked very well.
-Chip On Sat, May 9, 2015 at 1:06 PM, Chris Nagele wrote: > Hi all. Continuing on my all SSD discussion, I am looking for some > recommendations on a new Supermicro > chassis for our file servers. So far I have been looking at this > thing: > > http://www.supermicro.com/products/chassis/4U/417/SC417E16-R1400LP.cfm > > Does anyone have experience with this? If so, what would you recommend > for a motherboard and HBA to support all of the disks? We've > traditionally used the X9DRD-7LN4F-JBOD or the X9DRi-F with a LSI > 9211-8i HBA. > > Thanks, > Chris > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danmcd at omniti.com Mon May 11 15:48:24 2015 From: danmcd at omniti.com (Dan McDonald) Date: Mon, 11 May 2015 11:48:24 -0400 Subject: [OmniOS-discuss] KVM Performance Update Message-ID: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com> I first want to apologize for not recognizing the cause of KVM performance problems (which were DROPPED PACKETS) much sooner. Until recently, our KVM deployments in house have been either on r151006, or nothing else. I've added an OI KVM box to our r151014 build machine, to make sure I have a platform to attempt replications. What happened was that upstream illumos KVM (from Joyent) had a platform flag day during r151012's development --> the VND code. Joyent's illumos child has Virtual Networking Devices (VND) that allow KVM instances to not depend on an actual NIC's Promiscuous Mode to receive packets. They updated their illumos, and subsequently their KVM. Remember that "KVM" has two parts: The kernel KVM driver (from Joyent's illumos-kvm repo), and the "KVM-cmd", which is QEMU (from Joyent's illumos-kvm-cmd repo). 
Other distros do not have VND currently (the illumos community is attempting to fix this, and Joyent is leading here, modulo their own day jobs). The compilation of illumos-kvm-cmd's latest revisions (the QEMU bits) fails without having VND around. We reset illumos-kvm-cmd to the pre-VND revision, but did NOT reset illumos-kvm bits to pre-VND. Since the world compiled and ran in this split state, I moved forward. The PROBLEM was that the amount of internal buffering for promiscuous devices is low, and while VND fixes the problem by reducing the use of promiscuous mode, non-VND illumos (like OmniOS) still needs to increase limits. The up-to-date kernel side eliminated the method for increasing these buffering limits, causing MUCH higher packet drop rates. Quoting Joyent's Robert Mustacchi: > By default the stream high watermark for the promisc mode is quite low. > And for some reason, that I don't recall, there was no great way to do > that ourselves from user land (could be wrong entirely). As a result, if > you don't set it, we're basically going to start dropping mblk_t's > queued on the stream. > > Basically without vnd, you need both of those. With vnd, then you can > get rid of it in both QEMU and KVM. Tobi Oetiker (who deserves a ton of credit for calling this problem out, AND determining it was packet drops) helped me test two solutions to the problem: 1.) Revert illumos-kvm to the pre-VND level as well. 2.) Keep up to date with illumos-kvm and illumos-kvm-cmd, but explicitly revert the VND changes in BOTH. I'm strongly leaning toward committing solution #2. Regardless of which, I will be issuing an update for r151014 later this week that will push KVM performance back to its pre-VND-bump levels. GOING FORWARD, once VND is upstreamed into illumos-gate, I can eliminate the VND backouts (or just catch up the built repos if I use option #1 above). Thank you all for your patience, and again, sorry for not addressing this sooner. 
Dan McDonald -- OmniOS Engineering From nagele at wildbit.com Mon May 11 16:24:33 2015 From: nagele at wildbit.com (Chris Nagele) Date: Mon, 11 May 2015 12:24:33 -0400 Subject: [OmniOS-discuss] High density 2.5" chassis In-Reply-To: References: Message-ID: Thanks Chip. That's a great write-up. I've definitely heard a lot of negative things about interposers, but we've been using them for years as well. I'm not saying they are fine, that's just my experience. If we didn't use interposers, how else would it work with that many drives? Chris Chris Nagele Co-founder, Wildbit Beanstalk, Postmark, dploy.io On Sat, May 9, 2015 at 3:28 PM, Schweiss, Chip wrote: > I have an SSD server in one of those chassis. Here's a write-up about it on > my blog, there are 3 postings about it. > > http://www.bigdatajunkie.com/index.php/9-solaris/zfs/10-short-stroking-consumer-ssds > > Not necessarily a build for everyone, but it has been absolutely awesome for > our use. After a few bumps at the beginning and giving up on HA on this > server, it has been rock solid. Many will swear against the interposers, > but combined with Samsung SSDs they have worked very well. > > -Chip > > > On Sat, May 9, 2015 at 1:06 PM, Chris Nagele wrote: >> >> Hi all. Continuing on my all SSD discussion, I am looking for some >> recommendations on a new Supermicro >> chassis for our file servers. So far I have been looking at this >> thing: >> >> http://www.supermicro.com/products/chassis/4U/417/SC417E16-R1400LP.cfm >> >> Does anyone have experience with this? If so, what would you recommend >> for a motherboard and HBA to support all of the disks? We've >> traditionally used the X9DRD-7LN4F-JBOD or the X9DRi-F with a LSI >> 9211-8i HBA.
>> >> Thanks, >> Chris > > From john.barfield at bissinc.com Mon May 11 17:15:59 2015 From: john.barfield at bissinc.com (John Barfield) Date: Mon, 11 May 2015 17:15:59 +0000 Subject: [OmniOS-discuss] KVM Performance Update In-Reply-To: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com> References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com> Message-ID: <4379D4FD-F566-4A07-AABA-5A7355635B20@bissinc.com> This is great news! Thank you. John Barfield / Sr Principal Engineer +1 (214) 425-0783/ john.barfield at bissinc.com BISS, Inc. Office: +1 (214) 506-8354 4925 Greenville Ave Suite 900 Dallas, TX 75206 support.bissinc.com On 5/11/15, 10:48 AM, "Dan McDonald" wrote: >I first want to apologize for not recognizing the cause of KVM >performance problems (which were DROPPED PACKETS) much sooner. Until >recently, our KVM deployments in house have been either on r151006, or >nothing else. I've added an OI KVM box to our r151014 build machine, to >make sure I have a platform to attempt replications.
> >What happened was that upstream illumos KVM (from Joyent) had a platform >flag day during r151012's development --> the VND code. Joyent's illumos >child has Virtual Networking Devices (VND) that allow KVM instances to >not depend on an actual NIC's Promiscuous Mode to receive packets. They >updated their illumos, and subsequently their KVM. Remember that "KVM" >has two parts: The kernel KVM driver (from Joyent's illumos-kvm repo), >and the "KVM-cmd", which is QEMU (from Joyent's illumos-kvm-cmd repo). > >Other distros do not have VND currently (the illumos community is >attempting to fix this, and Joyent is leading here, modulo their own day >jobs). The compilation of illumos-kvm-cmd's latest revisions (the QEMU >bits) fails without having VND around. We reset illumos-kvm-cmd to the >pre-VND revision, but did NOT reset illumos-kvm bits to pre-VND. Since >the world compiled and ran in this split state, I moved forward. The >PROBLEM was that the amount of internal buffering for promiscuous devices >is low, and while VND fixes the problem by reducing the use of >promiscuous mode, non-VND illumos (like OmniOS) still needs to increase >limits. The up-to-date kernel side eliminated the method for increasing >these buffering limits, causing MUCH higher packet drop rates. > >Quoting Joyent's Robert Mustacchi: > >> By default the stream high watermark for the promisc mode is quite low. >> And for some reason, that I don't recall, there was no great way to do >> that ourselves from user land (could be wrong entirely). As a result, if >> you don't set it, we're basically going to start dropping mblk_t's >> queued on the stream. >> >> Basically without vnd, you need both of those. With vnd, then you can >> get rid of it in both QEMU and KVM. > >Tobi Oetiker (who deserves a ton of credit for calling this problem out, >AND determining it was packet drops) helped me test two solutions to the >problem: > >1.) Revert illumos-kvm to the pre-VND level as well. > >2.) 
Keep up to date with illumos-kvm and illumos-kvm-cmd, but explicitly >revert the VND changes in BOTH. > >I'm strongly leaning toward committing solution #2. Regardless of which, >I will be issuing an update for r151014 later this week that will push >KVM performance back to its pre-VND-bump levels. > >GOING FORWARD, once VND is upstreamed into illumos-gate, I can eliminate >the VND backouts (or just catch up the built repos if I use option #1 >above). > >Thank you all for your patience, and again, sorry for not addressing this >sooner. > >Dan McDonald -- OmniOS Engineering From matej at zunaj.si Tue May 12 05:13:35 2015 From: matej at zunaj.si (Matej Zerovnik) Date: Tue, 12 May 2015 07:13:35 +0200 Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes In-Reply-To: <40C78E86-F32D-4588-AF98-EB9820019960@richardelling.com> References: <55487539.6030408@zunaj.si> <201505051648.t45GmpA4025308@lists-il.int.omniti.net> <40C78E86-F32D-4588-AF98-EB9820019960@richardelling.com> Message-ID: <55518BFF.6080608@zunaj.si> I know building a single 50-drive RaidZ2 is a bad idea. As I said, it's a legacy setup that I can't easily change. I already have a backup pool with 7x10-drive RaidZ2 to which I hope I will be able to switch this week. I hope to get some better results and less crashing... What is interesting is that when the 'event' happens, the server works normally and ZFS is accessible and writable (at least, there are no errors in the log files); only iSCSI reports errors and drops the connection. Another interesting thing is that after the 'event', all writes stop and only reads continue for another 30 min. After those 30 min, all traffic stops for half an hour. After that, everything starts coming back up... Weird?! Matej On 09. 05.
2015 02:49, Richard Elling wrote: > >> On May 5, 2015, at 9:48 AM, Matej Zerovnik > > wrote: >> >> I will replace the hardware in about 4 months with all SAS drives, >> but I would love to have a working setup for the time being as well;) >> >> I looked at smart stats and there don't seem to be any errors. Also, >> no hard/soft/transfer errors reported by any drive. Will take a look >> at service time tomorrow, maybe put the drives to graphite and look >> at them over a longer period. >> >> I looked at iostat -x status today and stats for the pool itself reported >> 100% busy most of the time, 98-100% wait, 500-1300 transactions in >> queue, around 500 active,... The first line, which is the average since boot, >> says avg service time is around 1600ms, which seems >> like a lot. Can it be due to a really big queue? >> >> Would it help to create five 10-drive raidz pools instead of one with 50 >> drives? > > It is a bad idea to build a single raidz set with 50 drives. Very bad. > Hence the zpool > man page says, "The recommended number is between 3 and 9 to help > increase performance." > But this recommendation applies to reliability, too. > -- richard > From paladinemishakal at gmail.com Tue May 12 10:23:34 2015 From: paladinemishakal at gmail.com (Lawrence Giam) Date: Tue, 12 May 2015 18:23:34 +0800 Subject: [OmniOS-discuss] Debugging crash dump Message-ID: Hi All, The server has panicked and auto-rebooted with a crash dump a few times. I am looking at this page http://wiki.illumos.org/display/illumos/How+To+Report+Problems but it seems the info there is out of date.
When I run this: echo '::panicinfo\n::cpuinfo -v\n::threadlist -v 10\n::msgbuf\n*panic_thread::findstack -v\n::stacks' | mdb 5 > ~/crash.5 root at sgbk02:/var/crash/unknown# echo '::panicinfo\n::cpuinfo -v\n::threadlist -v 10\n::msgbuf\n*panic_thread::findstack -v\n::stacks' | mdb 5 > crash.5 mdb: warning: dump is from SunOS 5.11 omnios-8c08411; dcmds and macros may not match kernel implementation mdb: failed to read .symtab header for 'unix', id=0: no mapping for address mdb: failed to read .symtab header for 'genunix', id=1: no mapping for address mdb: failed to read modctl at ffffff113ba4cf08: no mapping for address mdb: invalid command '::panicinfo': unknown dcmd name mdb: invalid command '::cpuinfo': unknown dcmd name mdb: invalid command '::threadlist': unknown dcmd name mdb: invalid command '::msgbuf': unknown dcmd name mdb: invalid command '::findstack': unknown dcmd name mdb: invalid command '::stacks': unknown dcmd name Can someone update the wiki on how to get the kernel messages and stack information? Thanks & Regards, Lawrence. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From paladinemishakal at gmail.com Tue May 12 10:33:42 2015 From: paladinemishakal at gmail.com (Lawrence Giam) Date: Tue, 12 May 2015 18:33:42 +0800 Subject: [OmniOS-discuss] Help with debugging crash dump Message-ID: Hi All, I have tried to analyse the crash dump and the following is what I get: root at sgbk02:/var/crash/unknown# mdb -k unix.3 vmcore.3 mdb: warning: dump is from SunOS 5.11 omnios-8c08411; dcmds and macros may not match kernel implementation mdb: failed to read .symtab header for 'unix', id=0: no mapping for address mdb: failed to read .symtab header for 'genunix', id=1: no mapping for address mdb: failed to read modctl at ffffff113ba4cf08: no mapping for address > ::stack > ::showrev Hostname: sgsan3 Release: 5.11 Kernel architecture: i86pc Application architecture: amd64 Kernel version: SunOS 5.11 i86pc omnios-8c08411 Platform: i86pc > ::status debugging crash dump vmcore.3 (64-bit) from sgsan3 operating system: 5.11 omnios-8c08411 (i86pc) image uuid: 299f9dfc-c835-4319-b0cb-d5c0b0c5841e panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff007d2cd450 addr=ffffff14a0c700c8 dump content: kernel pages only > $r %rax = 0x0000000000000000 %r9 = 0xffffff14a0c6fea8 %rbx = 0xffffff14a0c6f990 %r10 = 0x6636314141416574 %rcx = 0xffffff116bc9da00 %r11 = 0xffffff007d2cd530 %rdx = 0xffffff14a0c6fdb8 %r12 = 0x0000000000000016 %rsi = 0x0000000000000000 %r13 = 0xffffff11b84409b8 %rdi = 0xffffff113b62de00 %r14 = 0x0000000000000002 %r8 = 0xffffff11b84409b8 %r15 = 0xffffff14a0c6fea8 %rip = 0xfffffffff82a22b8 smb_fsop_lookup+0x118 %rbp = 0xffffff007d2cd6b0 %rsp = 0xffffff007d2cd540 %rflags = 0x00010286 id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0 status= %cs = 0x0030 %ds = 0x004b %es = 0x004b %trapno = 0xe %fs = 0x0000 %gs = 0x01c3 %err = 0x0 > 0xfffffffff82a22b8::dis mdb: failed to read instruction at 0xfffffffff82a22b8: no mapping for address > 0xfffffffff82a22b8::dump 0 1 2 3 4 5 6 7 \/ 9 a b c d e f 01234567v9abcdef mdb: failed to read data at 
0xfffffffff82a22b8: no mapping for address > ::quit Looking at this, it appears to be an SMB issue. Can someone show me how to get more info from the crash dump? Thanks & Regards. From danmcd at omniti.com Tue May 12 13:02:28 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 12 May 2015 09:02:28 -0400 Subject: [OmniOS-discuss] Debugging crash dump In-Reply-To: References: Message-ID: > On May 12, 2015, at 6:23 AM, Lawrence Giam wrote: > > Hi All, > > The server has panicked and auto-rebooted with a crash dump a few times. I am looking at this page http://wiki.illumos.org/display/illumos/How+To+Report+Problems but it seems the info there is out of date. > > When I run this: > echo '::panicinfo\n::cpuinfo -v\n::threadlist -v 10\n::msgbuf\n*panic_thread::findstack -v\n::stacks' | mdb 5 > ~/crash.5 > > root at sgbk02:/var/crash/unknown# echo '::panicinfo\n::cpuinfo -v\n::threadlist -v 10\n::msgbuf\n*panic_thread::findstack -v\n::stacks' | mdb 5 > crash.5 > mdb: warning: dump is from SunOS 5.11 omnios-8c08411; dcmds and macros may not match kernel implementation 8c08411 is OmniOS r151010 -- that's where your dump is from. Are you running the analysis on a later-release machine? It looks like that's the case, and there have been mdb changes between 1-2 stable releases that would make reading 010 dumps difficult. "$c" or "$C" show you the kernel stack, and "::msgbuf" shows you the in-kernel-memory dmesg(1M) output. Dan From johan.kragsterman at capvert.se Tue May 12 17:08:19 2015 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Tue, 12 May 2015 19:08:19 +0200 Subject: [OmniOS-discuss] opendj in a zone Message-ID: Hi! Right now I'm trying to do some things that are not really within my knowledge, so I need to bother you guys a little bit for advice... I'm setting up OpenDJ from Forgerock in a zone (151014). OpenDJ is a directory server with Sun heritage, and it is Java-based.
I seem to have managed to get the server up and running, but I can't reach the management console, I get some java errors. So if anyone of you got any input on this, pls let me know....(I'm a complete nooob with java...): root at z1:/etc/opendj/bin# ./control-panel Could not launch Control Panel. Check that you have access to the display. Check file /var/tmp/opendj-control-panel-1827401860598694601.log for details. root at z1:/etc/opendj/bin# cat /var/tmp/opendj-control-panel-1827401860598694601.log May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.util.ControlPanelLog initLogFileHandler INFO: Application launched May 12, 2015 4:52:28 PM UTC May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.ControlPanelLauncher$1 run WARNING: Error setting look and feel: java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit at java.awt.Toolkit$2.run(Toolkit.java:876) at java.security.AccessController.doPrivileged(Native Method) at java.awt.Toolkit.getDefaultToolkit(Toolkit.java:861) at java.awt.Toolkit.getEventQueue(Toolkit.java:1752) at java.awt.EventQueue.isDispatchThread(EventQueue.java:1018) at javax.swing.SwingUtilities.isEventDispatchThread(SwingUtilities.java:1360) at org.opends.quicksetup.ui.UIFactory.initializeLookAndFeel(UIFactory.java:722) at org.opends.guitools.controlpanel.ControlPanelLauncher.initLookAndFeel(ControlPanelLauncher.java:240) at org.opends.guitools.controlpanel.ControlPanelLauncher.access$000(ControlPanelLauncher.java:61) at org.opends.guitools.controlpanel.ControlPanelLauncher$1.run(ControlPanelLauncher.java:178) at java.lang.Thread.run(Thread.java:745) May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.ControlPanelLauncher$1 run WARNING: Error launching GUI: java.awt.HeadlessException May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.ControlPanelLauncher$1 run WARNING: java.awt.GraphicsEnvironment.checkHeadless(GraphicsEnvironment.java:207) 
java.awt.Window.<init>(Window.java:535) java.awt.Frame.<init>(Frame.java:420) java.awt.Frame.<init>(Frame.java:385) org.opends.quicksetup.SplashScreen.<init>(SplashScreen.java:104) org.opends.guitools.controlpanel.ControlPanelSplashScreen.<init>(ControlPanelLauncher.java:300) org.opends.guitools.controlpanel.ControlPanelSplashScreen.main(ControlPanelLauncher.java:316) org.opends.guitools.controlpanel.ControlPanelLauncher$1.run(ControlPanelLauncher.java:185) java.lang.Thread.run(Thread.java:745) root at z1:/etc/opendj/bin# There are definitely no port access problems; the ports in use are 389 and 4444: root at z1:/etc/opendj/bin# netstat -an UDP: IPv4 Local Address Remote Address State -------------------- -------------------- ---------- *.111 Idle *.* Unbound *.37411 Idle *.111 Idle *.* Unbound *.50848 Idle UDP: IPv6 Local Address Remote Address State If --------------------------------- --------------------------------- ---------- ----- *.111 Idle *.* Unbound *.37411 Idle TCP: IPv4 Local Address Remote Address Swind Send-Q Rwind Recv-Q State -------------------- -------------------- ----- ------ ----- ------ ----------- *.111 *.* 0 0 128000 0 LISTEN *.* *.* 0 0 128000 0 IDLE *.111 *.* 0 0 128000 0 LISTEN *.* *.* 0 0 128000 0 IDLE *.22 *.* 0 0 128000 0 LISTEN *.54631 *.* 0 0 128000 0 LISTEN *.58404 *.* 0 0 128000 0 LISTEN *.4444 *.* 0 0 128000 0 LISTEN *.389 *.* 0 0 128000 0 LISTEN TCP: IPv6 Local Address Remote Address Swind Send-Q Rwind Recv-Q State If --------------------------------- --------------------------------- ----- ------ ----- ------ ----------- ----- *.111 *.* 0 0 128000 0 LISTEN *.* *.* 0 0 128000 0 IDLE *.22 *.* 0 0 128000 0 LISTEN *.58404 *.* 0 0 128000 0 LISTEN *.4444 *.* 0 0 128000 0 LISTEN *.389 *.* 0 0 128000 0 LISTEN Active UNIX domain sockets Address Type Vnode Conn Local Addr Remote Addr ffffff03f2f99048 stream-ord 0000000 0000000 ffffff03f2f99b58 stream-ord 0000000 0000000 ffffff03f282fb48 stream-ord ffffff03f27b4580 0000000 /var/run/.inetd.uds root at
z1:/etc/opendj/bin# Best regards from/Med vänliga hälsningar från Johan Kragsterman Capvert From danmcd at omniti.com Tue May 12 17:38:04 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 12 May 2015 13:38:04 -0400 Subject: [OmniOS-discuss] opendj in a zone In-Reply-To: References: Message-ID: > On May 12, 2015, at 1:08 PM, Johan Kragsterman wrote: > > So if anyone of you got any input on this, pls let me know....(I'm a complete nooob with java...): I'm no Java wizard, so take this with a grain of salt, but... > root at z1:/etc/opendj/bin# ./control-panel > Could not launch Control Panel. Check that you have access to the display. > Check file /var/tmp/opendj-control-panel-1827401860598694601.log for details. > root at z1:/etc/opendj/bin# cat /var/tmp/opendj-control-panel-1827401860598694601.log > May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.util.ControlPanelLog initLogFileHandler > INFO: Application launched May 12, 2015 4:52:28 PM UTC > May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.ControlPanelLauncher$1 run > WARNING: Error setting look and feel: java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit > java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit From that last quoted line, it looks like you'll need X11 libraries, possibly X11 *JAVA* libraries as well. We don't supply X11 at all in the "omnios" publisher. I'd suggest using pkgsrc and installing X11 libraries to help you out. I'm sure there are others on the list with experience installing X11 libraries on OmniOS, possibly even to help out Java apps. Dan From johan.kragsterman at capvert.se Tue May 12 18:55:13 2015 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Tue, 12 May 2015 20:55:13 +0200 Subject: [OmniOS-discuss] Ang: Re: opendj in a zone In-Reply-To: References: , Message-ID: Hi!
-----Benjamin Sherman wrote: ----- To: Dan McDonald From: Benjamin Sherman Date: 2015-05-12 20:33 Cc: Johan Kragsterman , omnios-discuss at lists.omniti.com Subject: Re: [OmniOS-discuss] opendj in a zone Johan, I use OpenDJ and I've run it both on Linux and OmniOS. The simplest solution is to not attempt to run the control panel app from the OmniOS server. 1) Download/copy the OpenDJ package, or even a tarball of your installation from the OmniOS machine to a "desktop" machine, anything where you have Java and X11 (or Mac or Windows). 2) From that machine, run the control-panel. 3) When it starts, you'll need to provide IP, port, and credentials to make it talk to the OpenDJ daemon process on the OmniOS machine, but it should work just fine otherwise. As Dan suggested, you can also install the X11 libs on OmniOS, but you'd still need a local X11 server for the remote control-panel process to use for user interaction. Ah, I see! Good, thanks! OK, I'm not trying to run it from the omnios zone, but from a linux LTSP fat client, so I would need to rebuild the chroot with opendj inside. But that's not a problem, I can do that. Or I can put the software on the LTSP server and run X over ssh... Question is, do I really need the control panel? Can I live without it? I guess I can administer the server with other tools? May I ask what your use case is? Do you also use other Forgerock software, like OpenAM and OpenIDM? Do you use any of the REST2LDAP tools? Regards Johan -Benjamin > On May 12, 2015, at 10:38 AM, Dan McDonald wrote: > > >> On May 12, 2015, at 1:08 PM, Johan Kragsterman wrote: >> >> So if anyone of you got any input on this, pls let me know....(I'm a complete nooob with java...): > > I'm no Java wizard, so take this with a grain of salt, but... > >> root at z1:/etc/opendj/bin# ./control-panel >> Could not launch Control Panel. Check that you have access to the display. >> Check file /var/tmp/opendj-control-panel-1827401860598694601.log for details.
>> root at z1:/etc/opendj/bin# cat /var/tmp/opendj-control-panel-1827401860598694601.log >> May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.util.ControlPanelLog initLogFileHandler >> INFO: Application launched May 12, 2015 4:52:28 PM UTC >> May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.ControlPanelLauncher$1 run >> WARNING: Error setting look and feel: java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit >> java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit > > From that last quoted line, it looks like you'll need X11 libraries, possibly X11 *JAVA* libraries as well. > > We don't supply X11 at all in the "omnios" publisher. ?I'd suggest using pkgsrc and installing X11 libraries to help you out. > > I'm sure there are others on the list with experience installing X11 libraries on OmniOS, possibly even to help out Java apps. > > Dan > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss From danmcd at omniti.com Tue May 12 18:59:02 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 12 May 2015 14:59:02 -0400 Subject: [OmniOS-discuss] KVM Performance Update In-Reply-To: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com> References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com> Message-ID: <90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com> > On May 11, 2015, at 11:48 AM, Dan McDonald wrote: > > > 1.) Revert illumos-kvm to the pre-VND level as well. > > 2.) Keep up to date with illumos-kvm and illumos-kvm-cmd, but explicitly revert the VND changes in BOTH. > > I'm strongly leaning toward committing solution #2. Regardless of which, I will be issuing an update for r151014 later this week that will push KVM performance back to its pre-VND-bump levels. 
I chose option #2: https://github.com/omniti-labs/omnios-build/commit/0268a2ff04b1cbed2324054cb97a0f36c58989b0 There's now an update for r151014 that has the updated system/kvm (qemu/userland) and driver/virtualization/kvm (kernel KVM driver) on the repo server. A "pkg update" will update your packages AND boot archive without. I do recommend, however, you power down your KVM instances and "pkill qemu" prior to running the update. Along with this update is a small fix to onu(1) for illumos developers who are working with r151014 as their base system for ONU-ing. Thank you all again for your patience, Dan From benjamin at holyarmy.org Tue May 12 18:33:36 2015 From: benjamin at holyarmy.org (Benjamin Sherman) Date: Tue, 12 May 2015 11:33:36 -0700 Subject: [OmniOS-discuss] opendj in a zone In-Reply-To: References: Message-ID: Johan, I use OpenDJ and I've run it both on Linux and OmniOS. The simplest solution is do not attempt to run the control panel app from the OmniOS server. 1) Download/copy the OpenDJ package, or even a tarball of your installation from the OmniOS machine to a "desktop" machine, anything where you have Java and X11 (or Mac or Windows). 2) From that machine, run the control-panel. 3) When it starts, you'll need to provide IP, port, and credentials to make it talk to the OpenDJ daemon process on OmniOS machine, but it should work just fine otherwise. As Dan suggested, you can also install the X11 libs on OmniOS, but you'd still need a local X11 server for the remote control-panel process to use for user interaction. -Benjamin > On May 12, 2015, at 10:38 AM, Dan McDonald wrote: > > >> On May 12, 2015, at 1:08 PM, Johan Kragsterman wrote: >> >> So if anyone of you got any input on this, pls let me know....(I'm a complete nooob with java...): > > I'm no Java wizard, so take this with a grain of salt, but... > >> root at z1:/etc/opendj/bin# ./control-panel >> Could not launch Control Panel. Check that you have access to the display. 
>> Check file /var/tmp/opendj-control-panel-1827401860598694601.log for details. >> root at z1:/etc/opendj/bin# cat /var/tmp/opendj-control-panel-1827401860598694601.log >> May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.util.ControlPanelLog initLogFileHandler >> INFO: Application launched May 12, 2015 4:52:28 PM UTC >> May 12, 2015 4:52:28 PM org.opends.guitools.controlpanel.ControlPanelLauncher$1 run >> WARNING: Error setting look and feel: java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit >> java.awt.AWTError: Toolkit not found: sun.awt.X11.XToolkit > > From that last quoted line, it looks like you'll need X11 libraries, possibly X11 *JAVA* libraries as well. > > We don't supply X11 at all in the "omnios" publisher. I'd suggest using pkgsrc and installing X11 libraries to help you out. > > I'm sure there are others on the list with experience installing X11 libraries on OmniOS, possibly even to help out Java apps. > > Dan > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss From omnios at citrus-it.net Tue May 12 20:32:46 2015 From: omnios at citrus-it.net (Andy Fiddaman) Date: Tue, 12 May 2015 20:32:46 +0000 (UTC) Subject: [OmniOS-discuss] KVM Performance Update In-Reply-To: <90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com> References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com> <90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com> Message-ID: Thanks Dan, and just before I was about to migrate my kvms across to r151014 too! Andy On Tue, 12 May 2015, Dan McDonald wrote: ; ; > On May 11, 2015, at 11:48 AM, Dan McDonald wrote: ; > ; > ; > 1.) Revert illumos-kvm to the pre-VND level as well. ; > ; > 2.) Keep up to date with illumos-kvm and illumos-kvm-cmd, but explicitly revert the VND changes in BOTH. ; > ; > I'm strongly leaning toward committing solution #2. 
Regardless of which, I will be issuing an update for r151014 later this week that will push KVM performance back to its pre-VND-bump levels. ; ; I chose option #2: ; ; https://github.com/omniti-labs/omnios-build/commit/0268a2ff04b1cbed2324054cb97a0f36c58989b0 ; ; There's now an update for r151014 that has the updated system/kvm (qemu/userland) and driver/virtualization/kvm (kernel KVM driver) on the repo server. A "pkg update" will update your packages AND boot archive without. I do recommend, however, you power down your KVM instances and "pkill qemu" prior to running the update. ; ; Along with this update is a small fix to onu(1) for illumos developers who are working with r151014 as their base system for ONU-ing. ; ; Thank you all again for your patience, ; Dan ; ; _______________________________________________ ; OmniOS-discuss mailing list ; OmniOS-discuss at lists.omniti.com ; http://lists.omniti.com/mailman/listinfo/omnios-discuss ; -- Citrus IT Limited | +44 (0)870 199 8000 | enquiries at citrus-it.co.uk Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ Registered in England and Wales | Company number 4899123 From hasslerd at gmx.li Wed May 13 09:02:57 2015 From: hasslerd at gmx.li (Dominik Hassler) Date: Wed, 13 May 2015 11:02:57 +0200 Subject: [OmniOS-discuss] ping rtt for KVM in zone Message-ID: Hi, I am running my KVMs in individual zones and seeing an increased ping rtt by a factor of approx. 7 compared to ping rtt when running the same KVM inside the GZ (cf. attached smokeping chart). This does *only* affect virtio nics but not e1000 nics. For e1000 nics the ping rtt remains the same, no matter if the KVM runs in the GZ or a NGZ. Dan's 'KVM Performance Update' did resolve the throughput issue, but not the strange ping behaviour I am seeing. Any ideas why it only affects virtio nics and when the KVM is in a zone? Any ideas how to improve it? -------------- next part -------------- A non-text attachment was scrubbed... 
Name: kvm_virtio_zone.png
Type: image/png
Size: 34510 bytes
Desc: not available
URL:

From matthew.lagoe at subrigo.net Wed May 13 09:11:02 2015
From: matthew.lagoe at subrigo.net (Matthew Lagoe)
Date: Wed, 13 May 2015 02:11:02 -0700
Subject: [OmniOS-discuss] ping rtt for KVM in zone
In-Reply-To:
References:
Message-ID: <003d01d08d5c$bc814340$3583c9c0$@subrigo.net>

Some NICs don't handle the virtio stuff very well (Myricom, I'm looking at you), so that could be part of the problem.

Intel typically is pretty good about it, however, so the e1000s working doesn't surprise me.

What NICs specifically are you having issues with that have the extra delay?

-----Original Message-----
From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Dominik Hassler
Sent: Wednesday, May 13, 2015 02:03 AM
To: omnios-discuss at lists.omniti.com
Subject: [OmniOS-discuss] ping rtt for KVM in zone

Hi,

I am running my KVMs in individual zones and seeing an increased ping rtt by a factor of approx. 7 compared to ping rtt when running the same KVM inside the GZ (cf. attached smokeping chart).

This does *only* affect virtio nics but not e1000 nics. For e1000 nics the ping rtt remains the same, no matter if the KVM runs in the GZ or a NGZ.

Dan's 'KVM Performance Update' did resolve the throughput issue, but not the strange ping behaviour I am seeing.

Any ideas why it only affects virtio nics and when the KVM is in a zone? Any ideas how to improve it?

From hasslerd at gmx.li Wed May 13 09:33:03 2015
From: hasslerd at gmx.li (Dominik Hassler)
Date: Wed, 13 May 2015 11:33:03 +0200
Subject: [OmniOS-discuss] ping rtt for KVM in zone
In-Reply-To: <003d01d08d5c$bc814340$3583c9c0$@subrigo.net>
References: , <003d01d08d5c$bc814340$3583c9c0$@subrigo.net>
Message-ID:

Matthew,

I have 'Intel I350' NICs. It is not about virtio performance in general, but about the difference between running the *same* KVM in the GZ and in an NGZ.

> Sent: Wednesday, 13 May 2015 at 11:11
> From: "Matthew Lagoe"
> To: "'Dominik Hassler'" , omnios-discuss at lists.omniti.com
> Subject: RE: [OmniOS-discuss] ping rtt for KVM in zone
>
> Some NICs don't handle the virtio stuff very well (Myricom, I'm looking at you), so that could be part of the problem.
>
> Intel typically is pretty good about it, however, so the e1000s working doesn't surprise me.
>
> What NICs specifically are you having issues with that have the extra delay?
>
> -----Original Message-----
> From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Dominik Hassler
> Sent: Wednesday, May 13, 2015 02:03 AM
> To: omnios-discuss at lists.omniti.com
> Subject: [OmniOS-discuss] ping rtt for KVM in zone
>
> Hi,
>
> I am running my KVMs in individual zones and seeing an increased ping rtt by a factor of approx. 7 compared to ping rtt when running the same KVM inside the GZ (cf. attached smokeping chart).
>
> This does *only* affect virtio nics but not e1000 nics. For e1000 nics the ping rtt remains the same, no matter if the KVM runs in the GZ or a NGZ.
>
> Dan's 'KVM Performance Update' did resolve the throughput issue, but not the strange ping behaviour I am seeing.
>
> Any ideas why it only affects virtio nics and when the KVM is in a zone? Any ideas how to improve it?

From mcgee at sci-world.net Wed May 13 11:40:05 2015
From: mcgee at sci-world.net (Matthew McGee)
Date: Wed, 13 May 2015 07:40:05 -0400
Subject: [OmniOS-discuss] CIFS Issues
Message-ID:

I am attempting to migrate my CIFS shares from FreeNAS to OmniOS. I have attempted a number of different installs and for now I am working in a VM for speed of reboots and testing.

I have Windows 2012 AD, and a number of Mac OSX & Windows 7 clients.

Server name = DATA
Domain HOME.example.net

I install the system, configure the IP of 10.0.1.230/8, set and test route, create a base boot environment and a CIFS boot environment. Reboot into the CIFS boot environment.
I have attempted going straight to Napp-it and I have tried manual initialization as follows:

verify /etc/hosts and /etc/nodename entries
Verify AD DNS
verify system is using AD DNS server only
nslookup to verify forward & reverse entries are functional and resolve on the host
pkg install kerberos-5
# Tried with and without this setting
sharectl set -p ddns_enable=true
kclient -T ms_ad
kinit Administrator
klist & verify output
svcadm enable -r smb/server

smbadm join -u Administrator
Successful join
smbadm list shows my domain.
Verified kerberos delegation is allowed on the AD side.
vi /etc/nsswitch.conf and add "ad" to passwd & group lines
Have also tried adding smb line to pam

Both of the following produce valid output:
touch foo && chown myuser at HOME.example.net foo && ls -l foo
id myuser at HOME # Although this doesn't show all my groups

create a zfs filesystem and corresponding share called documents

root at data:/root# smbutil view //myuser at DATA
Password:
Share        Type       Comment
-------------------------------
c$           disk       Default Share
documents    disk
IPC$         IPC        Remote IPC
vss$         disk       VSS

4 shares listed from 4 available

When I attempt to access from a Windows 7 host, I see the following:

\\DATA is not accessible. You might not have permission to use this network resource.
Contact the administrator of this server to find out if you have access permissions.
The account is not authorized to log in from this station.

\\10.0.1.230 - Works, I can set permissions, read & write files

Neither the NetBIOS name nor the FQDN works, but access by IP does.

Samba on FreeNAS or Fedora works without issues, but I need working FC and COMSTAR will do that for me.
I cannot seem to get the CIFS piece working and it is the one thing preventing me from moving forward.
Any assistance would be appreciated. I hate asking for help but I've been working on this every night for a month and I know there must be one little thing I am missing, maybe a GPO?
From hasslerd at gmx.li Wed May 13 12:10:46 2015
From: hasslerd at gmx.li (Dominik Hassler)
Date: Wed, 13 May 2015 14:10:46 +0200
Subject: [OmniOS-discuss] CIFS Issues
In-Reply-To:
References:
Message-ID:

Did you try to end your FQDN with a trailing dot?

like: 'DATA.HOME.example.net.' in your example?

Sent: Wednesday, 13 May 2015 at 13:40
From: "Matthew McGee"
To: omnios-discuss at lists.omniti.com
Subject: [OmniOS-discuss] CIFS Issues

I am attempting to migrate my CIFS shares from FreeNAS to OmniOS. I have attempted a number of different installs and for now I am working in a VM for speed of reboots and testing.

I have Windows 2012 AD, and a number of Mac OSX & Windows 7 clients.

Server name = DATA
Domain HOME.example.net

I install the system, configure the IP of 10.0.1.230/8, set and test route, create a base boot environment and a CIFS boot environment. Reboot into the CIFS boot environment.

I have attempted going straight to Napp-it and I have tried manual initialization as follows:

verify /etc/hosts and /etc/nodename entries
Verify AD DNS
verify system is using AD DNS server only
nslookup to verify forward & reverse entries are functional and resolve on the host
pkg install kerberos-5
# Tried with and without this setting
sharectl set -p ddns_enable=true
klcient -T ms_ad
kinit Administrator
klist & verify output
svcadm enable -r smb/server

smbadm join -u Administrator
Successful join
smbadm list shows my domain.
Verified kerberos delegation is allowed on the AD side.
vi /etc/nsswitch.conf and add "ad" to passwd & group lines
Have also tried adding smb line to pam

Both of the following produce valid output
touch foo && chown myuser at HOME.example.net && ls -l foo
id myuser at HOME # Although this doesn't show all my groups

create a zfs filesystem and corresponding share called documents

root at data:/root# smbutil view //myuser at DATA
Password:
Share        Type       Comment
-------------------------------
c$           disk       Default Share
documents    disk
IPC$         IPC        Remote IPC
vss$         disk       VSS

4 shares listed from 4 available

When I attempt to access from a Windows 7 host, I see the following:

\\DATA is not accessible. You might not have permission to use this network resource.
Contact the administrator of this server to find out if you have access permissions.
The account is not authorized to log in from this station.

\\10.0.1.230 - Works, I can set permissions, read & write files

Neither the netbios nor FQDN function, but it functions by IP.

Samba on FreeNAS or Fedora works without issues, but I need working FC and comstar will do that for me.
I cannot seem to get the CIFS piece working and it is the one thing preventing me from moving forward.
Any assistance would be appreciated. I hate asking for help but I've been working on this every night for a month and I know there must be one little thing I am missing, maybe a GPO?

From danmcd at omniti.com Wed May 13 13:34:36 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 13 May 2015 09:34:36 -0400
Subject: [OmniOS-discuss] ping rtt for KVM in zone
In-Reply-To:
References:
Message-ID:

> On May 13, 2015, at 5:02 AM, Dominik Hassler wrote:
>
> Any ideas why it only affects virtio nics and when the KVM is in a zone? Any ideas how to improve it?
I'm not 100% sure, but I suspect it has to do with the fact that KVM needs to put the vnic/nic into promiscuous mode. In a zone, this gets harder, because of permissions the process in a zone needs beyond what it would need in the global zone. I tried to get Joyent's VND upstreamed in time for r151014. I suspect VND will still hold an improvement on many fronts, including this one. Dan From asc1111 at gmail.com Wed May 13 16:08:46 2015 From: asc1111 at gmail.com (Aaron Curry) Date: Wed, 13 May 2015 10:08:46 -0600 Subject: [OmniOS-discuss] CIFS Issues In-Reply-To: References: Message-ID: I ran into the same issue when setting up my home server. Access to CIFS works by IP but not name. I ended up setting up a second IP address and created a DNS entry with a different name for that IP. I have no idea why it works but it does. Aaron On Wed, May 13, 2015 at 6:10 AM, Dominik Hassler wrote: > Did you try to end your FQDN with a trailing dot? > > like: 'DATA.HOME.example.net.' in your example? > > > Gesendet: Mittwoch, 13. Mai 2015 um 13:40 Uhr > Von: "Matthew McGee" > An: omnios-discuss at lists.omniti.com > Betreff: [OmniOS-discuss] CIFS Issues > > I am attempting to migrate my CIFS shares from FreeNAS to OmniOS. > I have attempted a number of different installs and for now I am working > in a VM > for speed of reboots and testing. > > I have Windows 2012 AD, and a number of Mac OSX & Windows 7 clients. > > Server name = DATA > Domain HOME.example.net[http://HOME.example.net] > I install the system, configure the IP of > 10.0.1.230/8[http://10.0.1.230/8], set and test route, create a base boot > environmentand a CIFS boot environment. Reboot into the CIFS boot > environment. 
> I have attempted going straight to Napp-it and I have tried manual > initialization as follows: > verify /etc/hosts and /etc/nodename entries > Verify AD DNS > verify system is using AD DNS server only > nslookup to verify forward & reverse entries are functional and resolve on > the host > pkg install kerberos-5# Tried with and without this setting > sharectl set -p ddns_enable=true > klcient -T ms_ad > kinit Administrator > klist & verify output > svcadm enable -r smb/server > > smbadm join -u Administrator > Successful join > smbadm list shows my domain. > Verified kerberos delegation is allowed on the AD side. > vi /etc/nsswitch.conf and add "ad" to passwd & group lines > Have also tried adding smb line to pam > > > Both of the following produce valid output > touch foo && chown myuser at HOME.example.net[myuser at HOME.example.net] && ls > -l foo > id myuser at HOME # Although this doesn't show all my groups > create a zfs filesystem and corresponding share called documents > > root at data:/root# smbutil view //myuser at DATA > Password: > Share Type Comment > ------------------------------- > c$ disk Default Share > documents disk > IPC$ IPC Remote IPC > vss$ disk VSS > > 4 shares listed from 4 available > > When I attempt to access from a Windows 7 host, I see the following: > > \\DATA is not accessible. You might not have permission to use this > network resource. > Contact the administrator of this server to find out if you have access > permissions. > The account is not authorized to log in from this station. > > > \\10.0.1.230 - Works, I can set permissions, read & write files > > Neither the netbios nor FQDN function, but it functions by IP. > > Samba on FreeNAS or Fedora works without issues, but I need working FC and > comstar will do that for me. > I cannot seem to get the CIFS piece working and it is the one thing > preventing me from moving forward. > Any assistance would be appreciated. 
I hate asking for help but I've been > working on this every night for a month > and I know there must be one little thing I am missing, maybe a > GPO?_______________________________________________ OmniOS-discuss mailing > list OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss[http://lists.omniti.com/mailman/listinfo/omnios-discuss] > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danmcd at omniti.com Wed May 13 17:55:41 2015 From: danmcd at omniti.com (Dan McDonald) Date: Wed, 13 May 2015 13:55:41 -0400 Subject: [OmniOS-discuss] VENOM (CVE-2015-3456) update Message-ID: Some of you probably have been tracking VENOM (aka. CVE-2015-3456). I have patched the qemu that OmniOS's KVM uses with a VENOM fix and pushed updates on to the repo servers. Source people can consult: https://github.com/joyent/illumos-kvm-cmd/commit/407546e5132f54065f3f78ac293ad7a8d16bf57c for the fix itself. r151006 --> new system/kvm package, with just VENOM patched. r151014 --> new system/kvm package, with just VENOM patched. r151012 --> new system/kvm AND driver/virtualization/kvm. VENOM is patched, and due to 012's closeness to 014, the 014 performance changes came along for the ride. I'd recommend: 1.) Shutting down all KVM instances, and make sure "pgrep qemu" in the global zone shows no processes. If you still see qemu processes, kill them after insuring your KVMs are shut down. 2.) pkg update 3.) Restarting your KVM instances, all of which will use the new, patched QEMU. Thank you folks! 
Dan From mir at miras.org Wed May 13 18:14:35 2015 From: mir at miras.org (Michael Rasmussen) Date: Wed, 13 May 2015 20:14:35 +0200 Subject: [OmniOS-discuss] KVM Performance Update In-Reply-To: <90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com> References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com> <90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com> Message-ID: <20150513201435.4d3a3d7c@sleipner.datanom.net> On Tue, 12 May 2015 14:59:02 -0400 Dan McDonald wrote: > > I chose option #2: > > https://github.com/omniti-labs/omnios-build/commit/0268a2ff04b1cbed2324054cb97a0f36c58989b0 > > There's now an update for r151014 that has the updated system/kvm (qemu/userland) and driver/virtualization/kvm (kernel KVM driver) on the repo server. A "pkg update" will update your packages AND boot archive without. I do recommend, however, you power down your KVM instances and "pkill qemu" prior to running the update. > Has someone made performance test with the patched kvm package? -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: Give your very best today. Heaven knows it's little enough. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 181 bytes Desc: OpenPGP digital signature URL: From danmcd at omniti.com Wed May 13 18:28:22 2015 From: danmcd at omniti.com (Dan McDonald) Date: Wed, 13 May 2015 14:28:22 -0400 Subject: [OmniOS-discuss] KVM Performance Update In-Reply-To: <20150513201435.4d3a3d7c@sleipner.datanom.net> References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com> <90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com> <20150513201435.4d3a3d7c@sleipner.datanom.net> Message-ID: > On May 13, 2015, at 2:14 PM, Michael Rasmussen wrote: > > Has someone made performance test with the patched kvm package? Tobi's sheet has a preliminary version. Not sure if he's tested with the one that actually is in the repo servers now. ALSO, 012 got the perf fix because it was easier to bring that along for the ride instead of addressing VENOM by itself for 012. Dan From hasslerd at gmx.li Wed May 13 18:39:21 2015 From: hasslerd at gmx.li (Dominik Hassler) Date: Wed, 13 May 2015 20:39:21 +0200 Subject: [OmniOS-discuss] KVM Performance Update Message-ID: I've applied yesterday's kvm performance patch, did performance tests and posted the results in tobi's sheet. Sent from my Samsung device -------- Original message -------- From: Dan McDonald Date: 13/05/2015 20:28 (GMT+01:00) To: Michael Rasmussen Cc: omnios-discuss at lists.omniti.com Subject: Re: [OmniOS-discuss] KVM Performance Update > On May 13, 2015, at 2:14 PM, Michael Rasmussen wrote: > > Has someone made performance test with the patched kvm package? Tobi's sheet has a preliminary version.? Not sure if he's tested with the one that actually is in the repo servers now. ALSO, 012 got the perf fix because it was easier to bring that along for the ride instead of addressing VENOM by itself for 012. Dan _______________________________________________ OmniOS-discuss mailing list OmniOS-discuss at lists.omniti.com http://lists.omniti.com/mailman/listinfo/omnios-discussi ? 
From nsmith at careyweb.com Wed May 13 18:50:54 2015
From: nsmith at careyweb.com (Nate Smith)
Date: Wed, 13 May 2015 14:50:54 -0400
Subject: [OmniOS-discuss] High density 2.5" chassis
In-Reply-To:
References:
Message-ID: <40849b67-966a-4f47-97e3-5e3a39124afe@careyweb.com>

I've been running an all-SSD setup on a Dell R720, with dual 9207-8i cards connected to dual 8x2.5" disk backplanes. (The 9207-8i is one of the only cards that doesn't interfere with the BIOS, as Dell implemented it for tape drive support.) Boot disks are hooked up internally to the onboard SATA (I could use USB).

I've been using Samsung 843TN drives, which could be purchased fairly cheaply for a while. They are underprovisioned at 480GB, and feature a supercap to ensure writes in the event of a power loss. Plus they have a long write endurance cycle. It has worked well so far, outside of some queue depth problems with my Fibre Channel. I was originally going to use the R720XD, but I found that the backplane uses expanders instead of going 1:1.

I run a 15 disk RAIDZ6 with a hotspare.

-Nate

From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Schweiss, Chip
Sent: Saturday, May 09, 2015 3:29 PM
To: Chris Nagele
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] High density 2.5" chassis

I have an SSD server in one of those chassis. Here's a write-up about it on my blog, there are 3 postings about it.

http://www.bigdatajunkie.com/index.php/9-solaris/zfs/10-short-stroking-consumer-ssds

Not necessarily a build for everyone, but it has been absolutely awesome for our use. After a few bumps at the beginning and giving up on HA on this server, it has been rock solid.

Many will swear against the interposers, but combined with Samsung SSDs they have worked very well.

-Chip

On Sat, May 9, 2015 at 1:06 PM, Chris Nagele wrote:

Hi all.
Continuing on my all SSD discussion, I am looking for some recommendations on a new Supermicro chassis for our file servers. So far I have been looking at this thing: http://www.supermicro.com/products/chassis/4U/417/SC417E16-R1400LP.cfm Does anyone have experience with this? If so, what would you recommend for a motherboard and HBA to support all of the disks? We've traditionally used the X9DRD-7LN4F-JBOD or the X9DRi-F with a LSI 9211-8i HBA. Thanks, Chris _______________________________________________ OmniOS-discuss mailing list OmniOS-discuss at lists.omniti.com http://lists.omniti.com/mailman/listinfo/omnios-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From mir at miras.org Wed May 13 20:13:12 2015 From: mir at miras.org (Michael Rasmussen) Date: Wed, 13 May 2015 22:13:12 +0200 Subject: [OmniOS-discuss] KVM Performance Update In-Reply-To: References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com> <90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com> <20150513201435.4d3a3d7c@sleipner.datanom.net> Message-ID: <20150513221312.5e69fb09@sleipner.datanom.net> On Wed, 13 May 2015 14:28:22 -0400 Dan McDonald wrote: > > Tobi's sheet has a preliminary version. Not sure if he's tested with the one that actually is in the repo servers now. > > ALSO, 012 got the perf fix because it was easier to bring that along for the ride instead of addressing VENOM by itself for 012. > If I read the numbers correct I still find the performance disappointing with the patch. Doing the same kind of test using Linux or FreeBSD host to Linux or FreeBSD guest gives much higher performance. 
--
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael rasmussen cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir datanom net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir miras org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
/usr/games/fortune -es says:
That wouldn't be good enough.
-- Larry Wall in <199710131621.JAA14907 at wall.org>

From hasslerd at gmx.li Wed May 13 20:26:23 2015
From: hasslerd at gmx.li (Dominik Hassler)
Date: Wed, 13 May 2015 22:26:23 +0200
Subject: [OmniOS-discuss] KVM Performance Update
In-Reply-To: <20150513221312.5e69fb09@sleipner.datanom.net>
References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com> <90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com> <20150513201435.4d3a3d7c@sleipner.datanom.net> <20150513221312.5e69fb09@sleipner.datanom.net>
Message-ID: <5553B36F.6040709@gmx.li>

Well, don't forget, my latest tests were with KVMs running inside zones. As Dan pointed out today in another thread, the lack of VND upstream might have a bigger impact on KVMs running inside zones.

On 05/13/2015 10:13 PM, Michael Rasmussen wrote:
> On Wed, 13 May 2015 14:28:22 -0400
> Dan McDonald wrote:
>
>>
>> Tobi's sheet has a preliminary version. Not sure if he's tested with the one that actually is in the repo servers now.
>>
>> ALSO, 012 got the perf fix because it was easier to bring that along for the ride instead of addressing VENOM by itself for 012.
>>
> If I read the numbers correct I still find the performance
> disappointing with the patch. Doing the same kind of test using Linux
> or FreeBSD host to Linux or FreeBSD guest gives much higher performance.
> > > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > From mcgee at sci-world.net Wed May 13 22:45:36 2015 From: mcgee at sci-world.net (Matthew McGee) Date: Wed, 13 May 2015 18:45:36 -0400 Subject: [OmniOS-discuss] CIFS Issues In-Reply-To: References: Message-ID: Interesting. Using the trailing "." for an absolute FQDN works. Any hints on how to make it work without the full FQDN? I assume it's probably a kerberos related issue? On Wed, May 13, 2015 at 8:10 AM, Dominik Hassler wrote: Did you try to end your FQDN with a trailing dot? > > like: 'DATA.HOME.example.net.' in your example? > > > Gesendet: Mittwoch, 13. Mai 2015 um 13:40 Uhr > Von: "Matthew McGee" > An: omnios-discuss at lists.omniti.com > Betreff: [OmniOS-discuss] CIFS Issues > > I am attempting to migrate my CIFS shares from FreeNAS to OmniOS. > I have attempted a number of different installs and for now I am working > in a VM > for speed of reboots and testing. > > I have Windows 2012 AD, and a number of Mac OSX & Windows 7 clients. > > Server name = DATA > Domain HOME.example.net[http://HOME.example.net] > I install the system, configure the IP of > 10.0.1.230/8[http://10.0.1.230/8], set and test route, create a base boot > environmentand a CIFS boot environment. Reboot into the CIFS boot > environment. > I have attempted going straight to Napp-it and I have tried manual > initialization as follows: > verify /etc/hosts and /etc/nodename entries > Verify AD DNS > verify system is using AD DNS server only > nslookup to verify forward & reverse entries are functional and resolve on > the host > pkg install kerberos-5# Tried with and without this setting > sharectl set -p ddns_enable=true > klcient -T ms_ad > kinit Administrator > klist & verify output > svcadm enable -r smb/server > > smbadm join -u Administrator > Successful join > smbadm list shows my domain. 
> Verified kerberos delegation is allowed on the AD side. > vi /etc/nsswitch.conf and add "ad" to passwd & group lines > Have also tried adding smb line to pam > > > Both of the following produce valid output > touch foo && chown myuser at HOME.example.net[myuser at HOME.example.net] && ls > -l foo > id myuser at HOME # Although this doesn't show all my groups > create a zfs filesystem and corresponding share called documents > > root at data:/root# smbutil view //myuser at DATA > Password: > Share Type Comment > ------------------------------- > c$ disk Default Share > documents disk > IPC$ IPC Remote IPC > vss$ disk VSS > > 4 shares listed from 4 available > > When I attempt to access from a Windows 7 host, I see the following: > > \\DATA is not accessible. You might not have permission to use this > network resource. > Contact the administrator of this server to find out if you have access > permissions. > The account is not authorized to log in from this station. > > > \\10.0.1.230 - Works, I can set permissions, read & write files > > Neither the netbios nor FQDN function, but it functions by IP. > > Samba on FreeNAS or Fedora works without issues, but I need working FC and > comstar will do that for me. > I cannot seem to get the CIFS piece working and it is the one thing > preventing me from moving forward. > Any assistance would be appreciated. I hate asking for help but I've been > working on this every night for a month > and I know there must be one little thing I am missing, maybe a > GPO?_______________________________________________ OmniOS-discuss mailing > list OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss[http://lists.omniti.com/mailman/listinfo/omnios-discuss] > -------------- next part -------------- An HTML attachment was scrubbed... 
From danmcd at omniti.com Thu May 14 05:15:00 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 14 May 2015 01:15:00 -0400
Subject: [OmniOS-discuss] CIFS Issues
In-Reply-To:
References:
Message-ID: <136B9632-3196-41B7-961E-B9BD113321BC@omniti.com>

> On May 13, 2015, at 6:45 PM, Matthew McGee wrote:
>
> Interesting. Using the trailing "." for an absolute FQDN works.
> Any hints on how to make it work without the full FQDN?
> I assume it's probably a kerberos related issue?

I'd suggest asking the illumos mailing list (discussion or developer). The SMB experts in illumos all work at Nexenta.

Dan

From alka at hfg-gmuend.de Thu May 14 11:15:56 2015
From: alka at hfg-gmuend.de (Günther Alka)
Date: Thu, 14 May 2015 13:15:56 +0200
Subject: [OmniOS-discuss] CIFS Issues
In-Reply-To: <136B9632-3196-41B7-961E-B9BD113321BC@omniti.com>
References: <136B9632-3196-41B7-961E-B9BD113321BC@omniti.com>
Message-ID: <84BD5B5F-1490-40AB-B176-4991062BE510@hfg-gmuend.de>

Matthew,

Since you use napp-it, and since I run many OmniOS SMB filers in an AD environment without such problems, can you compare what happens when you use napp-it to join the domain instead of doing it manually (menu Services >> SMB >> Active Directory)?

Gea

>
>
>> On May 13, 2015, at 6:45 PM, Matthew McGee wrote:
>>
>> Interesting. Using the trailing "." for an absolute FQDN works.
>> Any hints on how to make it work without the full FQDN?
>> I assume it's probably a kerberos related issue?
> From ottmarklaas at countermail.com Wed May 13 18:21:51 2015 From: ottmarklaas at countermail.com (Ottmar Klaas) Date: Wed, 13 May 2015 14:21:51 -0400 Subject: [OmniOS-discuss] KVM Performance Update In-Reply-To: <20150513201435.4d3a3d7c@sleipner.datanom.net> References: <794603CD-A44A-4266-9AA7-B92F6670F67F@omniti.com> <90AB3F0A-FA77-4F74-B71C-C05E826143B6@omniti.com> <20150513201435.4d3a3d7c@sleipner.datanom.net> Message-ID: <8DE571E8-C8D1-4406-8D2E-283C255446E7@countermail.com> On 13 May 2015, at 14:14, Michael Rasmussen wrote: > On Tue, 12 May 2015 14:59:02 -0400 > Dan McDonald wrote: > >> >> I chose option #2: >> >> https://github.com/omniti-labs/omnios-build/commit/0268a2ff04b1cbed2324054cb97a0f36c58989b0 >> >> There's now an update for r151014 that has the updated system/kvm >> (qemu/userland) and driver/virtualization/kvm (kernel KVM driver) on >> the repo server. A "pkg update" will update your packages AND boot >> archive without. I do recommend, however, you power down your KVM >> instances and "pkill qemu" prior to running the update. >> > Has someone made performance test with the patched kvm package? For me network performance measured via iperf almost quadrupled, hitting around 380MBit/s. Both from global zone to ubuntu guest and separate computer on network to ubuntu guest. My previous results are listed here (amongst other): https://docs.google.com/spreadsheets/d/1uhCR4A9VawJsNG01AuC5CBVTQYlkoY1m9qdSuAwZp-s/edit#gid=0 > > -- > Hilsen/Regards > Michael Rasmussen > > Get my public GnuPG keys: > michael rasmussen cc > http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E > mir datanom net > http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C > mir miras org > http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 > -------------------------------------------------------------- > /usr/games/fortune -es says: > Give your very best today. Heaven knows it's little enough. 
> _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss From cks at cs.toronto.edu Fri May 15 15:51:11 2015 From: cks at cs.toronto.edu (Chris Siebenmann) Date: Fri, 15 May 2015 11:51:11 -0400 Subject: [OmniOS-discuss] Clues for tracking down a drastic ZFS fs space difference? In-Reply-To: cks's message of Wed, 29 Apr 2015 15:21:03 -0400. <20150429192103.D6B397A0605@apps0.cs.toronto.edu> Message-ID: <20150515155111.314317A0614@apps0.cs.toronto.edu> Several weeks ago I reported: > We have a filesystem/dataset with no snapshots, no subordinate > filesystems, nothing complicated (and no compression), that has a > drastic difference in space used between what df/zfs list/etc report > at the ZFS level and what du reports at the filesystem level. [...] (At the time ZFS reported 70.5 GB used and du reported 17 GB.) With the assistance of George Wilson of Delphix, we've now identified what the cause of this was: nlockmgr was apparently holding references to now-deleted files in the kernel, preventing them from being reclaimed by ZFS. Because these references were held in the kernel in some way, they weren't visible to tools like fuser. Restarting nlockmgr immediately reclaimed the space and dropped usage to what it should be. Delphix made a fix to their version of the nlm code to avoid this but has not yet pushed it upstream. The summary of the problem (from a comment in the commit): A busy client will prevent the idle timeout from ever being reached but may have stale holds associated with it. If these stale holds are for vnodes which have been removed they will prevent the file system from being able to reclaim the file's space. George Wilson's initial reply to me on the illumos-zfs mailing list is: http://permalink.gmane.org/gmane.os.illumos.zfs/4836 (and it includes a link to the Delphix commit.) 
Obviously this is only a concern for people doing NFS service on OmniOS machines, but if this is your environment you may want to watch for this issue and consider periodic precautionary nlockmgr restarts or the like until the fix is pushed upstream and is incorporated into an OmniOS update.

	- cks

From martin at waldenvik.se  Fri May 15 19:56:09 2015
From: martin at waldenvik.se (martin at waldenvik.se)
Date: Fri, 15 May 2015 19:56:09 +0000
Subject: [OmniOS-discuss] nfs client in a zone won't start
Message-ID:

Hi

I created a zone per the OmniOS wiki for a mysql-server (omnios r151014), but I can't seem to start the nfs/client service. It just says offline*. There are no clues in any of the logs. If I do a svcadm enable -r nfs/client it says:

svcadm: svc:/milestone/network depends on svc:/network/physical, which has multiple instances.

Any help would be appreciated. I wish to mount an NFS share for backing up MySQL databases.

Regards
Martin

Sent with Airmail

From danmcd at omniti.com  Fri May 15 20:09:19 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Fri, 15 May 2015 16:09:19 -0400
Subject: [OmniOS-discuss] nfs client in a zone won't start
In-Reply-To:
References:
Message-ID: <8D390D89-85FD-429B-92C7-CDCFBBE0E92A@omniti.com>

> On May 15, 2015, at 3:56 PM, martin at waldenvik.se wrote:
>
> Hi
>
> I created a zone per omnios wiki for a mysql-server (omnios r151014). But i can't seem to start the nfs/client service. It just says offline*.There are no clue in any of the logs. If i do a svcadm enable -r nfs/client it says svcadm: svc:/milestone/network depends on svc:/network/physical, which has multiple instances.
>
> Any help would be appreciated. I wish to mount a nfs share for backing up mysql-databases

Please share the output of

	svcs -xv network/physical

Also, try enabling nfs/client without -r, and use "svcs -xv" to see what else you need to activate. *CLIENT* should work in a zone.

Dan

From jimklimov at cos.ru  Fri May 15 21:25:42 2015
From: jimklimov at cos.ru (Jim Klimov)
Date: Fri, 15 May 2015 23:25:42 +0200
Subject: [OmniOS-discuss] nfs client in a zone won't start
In-Reply-To:
References:
Message-ID:

On 15 May 2015, 21:56:09 CEST, "martin at waldenvik.se" wrote:
>Hi
>
>I created a zone per omnios wiki for a mysql-server (omnios r151014).
>But i can't seem to start the nfs/client service. It just says
>offline*.There are no clue in any of the logs. If i do a svcadm enable
>-r nfs/client it says svcadm: svc:/milestone/network depends on
>svc:/network/physical, which has multiple instances.
>
>Any help would be appreciated. I wish to mount a nfs share for backing
>up mysql-databases
>
>Regards
>Martin
>Sent with Airmail

The state offline* (with an asterisk) means the service is in transition from offline, i.e. it is in the process of coming online. You might want to look into /var/svc/log/*nfs-client*log for more details, and/or manually rerun (or instrument with 'sh -x' and the like) the scripts and other bits of the service to trace the problem.

While the message about network/physical is common and harmless, do verify that you indeed have one of the networking engines enabled (the legacy default, or the new magicky nwam).

Also, 'svcs -d nfs/client' can show dependencies, and 'svcs -xv' will detail any failures.

Recently there have been many online discussions about nlm (the NFS lock manager) and recent or near-future changes applied to it, so see if enabling or kicking it helps you any.

Finally, did you test if the client works from the global zone?
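
The checks above can be strung together roughly as follows. This is a minimal sketch, assuming a POSIX shell on an SMF system; `in_transition` is a hypothetical helper name, and the svcs calls are skipped entirely when svcs is not installed.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if an SMF state string carries the trailing
# asterisk that marks a service in transition (for example "offline*").
in_transition() {
    case "$1" in
        *\*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Only touch SMF if svcs exists, so the sketch is harmless elsewhere.
if command -v svcs >/dev/null 2>&1; then
    state=$(svcs -H -o state nfs/client)
    if in_transition "$state"; then
        echo "nfs/client is still in transition: $state"
        svcs -d nfs/client             # services it depends on
        svcs -xv nfs/client            # explanation of why it is not running
        tail "$(svcs -L nfs/client)"   # end of the service log file
    fi
fi
```

Running it inside the zone and again in the global zone gives a quick side-by-side of where the two differ.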

Good luck,
Jim Klimov
--
Typos courtesy of K-9 Mail on my Samsung Android

From richard.elling at richardelling.com  Fri May 15 21:44:51 2015
From: richard.elling at richardelling.com (Richard Elling)
Date: Fri, 15 May 2015 14:44:51 -0700
Subject: [OmniOS-discuss] nfs client in a zone won't start
In-Reply-To:
References:
Message-ID:

> On May 15, 2015, at 2:25 PM, Jim Klimov wrote:
>
> On 15 May 2015, 21:56:09 CEST, "martin at waldenvik.se" wrote:
>> Hi
>>
>> I created a zone per omnios wiki for a mysql-server (omnios r151014).
>> But i can't seem to start the nfs/client service. It just says
>> offline*.There are no clue in any of the logs. If i do a svcadm enable
>> -r nfs/client it says svcadm: svc:/milestone/network depends on
>> svc:/network/physical, which has multiple instances.
>>
>> Any help would be appreciated. I wish to mount a nfs share for backing
>> up mysql-databases
>>
>> Regards
>> Martin
>> Sent with Airmail
>
> The state offline* (with asterisk) means transition from offline (is in process of onlining). You might want to look into /var/svc/log/*nfs-client*log for possible more details, and/or to manually rerun (or instrument with 'sh -x' and the likes) the scripts and bits of the service to trace into the problem.

pro tip:
	cat $(svcs -L nfs/client)

 -- richard

From illumos at cucumber.demon.co.uk  Fri May 15 22:18:29 2015
From: illumos at cucumber.demon.co.uk (Andrew Gabriel)
Date: Fri, 15 May 2015 23:18:29 +0100
Subject: [OmniOS-discuss] nfs client in a zone won't start
In-Reply-To:
References:
Message-ID: <555670B5.8000509@cucumber.demon.co.uk>

Richard Elling wrote:
>> On May 15, 2015, at 2:25 PM, Jim Klimov wrote:
>>
>>
>> The state offline* (with asterisk) means transition from offline (is in process of onlining). You might want to look into /var/svc/log/*nfs-client*log for possible more details, and/or to manually rerun (or instrument with 'sh -x' and the likes) the scripts and bits of the service to trace into the problem.
>>
>
> pro tip:
> cat $(svcs -L nfs/client)
>

or a "tail -f" running whilst you try starting it from another terminal window.

svcs -p nfs/client can also be useful when it's stuck in a startup script, to see what processes it currently has running.

--
Andrew

From mcgee at sci-world.net  Fri May 15 22:34:29 2015
From: mcgee at sci-world.net (Matthew McGee)
Date: Fri, 15 May 2015 18:34:29 -0400
Subject: [OmniOS-discuss] CIFS Issues
Message-ID:

I didn't see this message until it came through on the digest. I have a working system now, albeit a kludge. The person who suggested using a DNS alias gets a beer. I took this idea, did further troubleshooting, and found that if the hostname is in AD, I get the error message. If I remove it from AD and reboot the client, it works.

There is no discernible difference between using napp-it and not. I get the same result either way.

I also find it curious that all my shares are now forcibly in lower case. My Documents share comes in as documents. No big deal, but strange.

Thank you for the suggestions, and I am all ears if you have anything further.

Message: 4
Date: Thu, 14 May 2015 13:15:56 +0200
From: Günther Alka
To: omnios-discuss
Subject: Re: [OmniOS-discuss] CIFS Issues
Message-ID: <84BD5B5F-1490-40AB-B176-4991062BE510 at hfg-gmuend.de>
Content-Type: text/plain; charset=us-ascii

Matthew

As you use napp-it, and as I have many OmniOS SMB filers in an AD environment without such problems, can you compare what happens when you use napp-it to join the domain instead of doing it manually (menu Services >> SMB >> Active Directory)?

Gea

>
>
>> On May 13, 2015, at 6:45 PM, Matthew McGee wrote:
>>
>> Interesting. Using the trailing "." for an absolute FQDN works.
>> Any hints on how to make it work without the full FQDN?
>> I assume it's probably a kerberos related issue?

From martin at waldenvik.se  Fri May 15 22:44:06 2015
From: martin at waldenvik.se (martin at waldenvik.se)
Date: Fri, 15 May 2015 22:44:06 +0000
Subject: [OmniOS-discuss] nfs client in a zone won't start
In-Reply-To: <555670B5.8000509@cucumber.demon.co.uk>
References: <555670B5.8000509@cucumber.demon.co.uk>
Message-ID:

Hi

Thanks for all your tips regarding the nfs client. I still do not know what caused it. Maybe some network configuration mishap. The nfs/client worked in the GZ without problem.

Wish you all a nice weekend

Martin

Sent with Airmail

On 16 May 2015 at 00:17:19, Andrew Gabriel (illumos at cucumber.demon.co.uk) wrote:

Richard Elling wrote:
>> On May 15, 2015, at 2:25 PM, Jim Klimov wrote:
>>
>>
>> The state offline* (with asterisk) means transition from offline (is in process of onlining). You might want to look into /var/svc/log/*nfs-client*log for possible more details, and/or to manually rerun (or instrument with 'sh -x' and the likes) the scripts and bits of the service to trace into the problem.
>>
>
> pro tip:
> cat $(svcs -L nfs/client)
>

or a "tail -f" running whilst you try starting it from another terminal window.

svcs -p nfs/client can also be useful when it's stuck in a startup script, to see what processes it currently has running.

--
Andrew

From jstockett at molalla.com  Mon May 18 18:25:34 2015
From: jstockett at molalla.com (Jeff Stockett)
Date: Mon, 18 May 2015 18:25:34 +0000
Subject: [OmniOS-discuss] disk failure causing reboot?
Message-ID: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>

A drive failed in one of our supermicro 5048R-E1CR36L servers running omnios r151012 last night, and somewhat unexpectedly, the whole system seems to have panicked.

May 18 04:43:08 zfs01 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,2f02@1/pci15d9,808@0 (mpt_sas0):
May 18 04:43:08 zfs01   Log info 0x31140000 received for target 29 w50000c0f01f1bf06.
May 18 04:43:08 zfs01   scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
May 18 04:44:36 zfs01 genunix: [ID 843051 kern.info] NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
May 18 04:44:36 zfs01 unix: [ID 836849 kern.notice]
May 18 04:44:36 zfs01 ^Mpanic[cpu0]/thread=ffffff00f3ecbc40:
May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' appears to be hung.
May 18 04:44:36 zfs01 unix: [ID 100000 kern.notice]
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecba20 zfs:vdev_deadman+10b ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecba70 zfs:vdev_deadman+4a ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbac0 zfs:vdev_deadman+4a ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbaf0 zfs:spa_deadman+ad ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbb90 genunix:cyclic_softint+fd ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbba0 unix:cbe_low_level+14 ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbbf0 unix:av_dispatch_softvect+78 ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbc20 apix:apix_dispatch_softint+35 ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05990 unix:switch_sp_and_call+13 ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e059e0 apix:apix_do_softint+6c ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05a40 apix:apix_do_interrupt+34a ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05a50 unix:cmnint+ba ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05bc0 unix:acpi_cpu_cstate+11b ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05bf0 unix:cpu_acpi_idle+8d ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05c00 unix:cpu_idle_adaptive+13 ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05c20 unix:idle+a7 ()
May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05c30 unix:thread_start+8 ()
May 18 04:44:36 zfs01 unix: [ID 100000 kern.notice]
May 18 04:44:36 zfs01 genunix: [ID 672855 kern.notice] syncing file systems...
May 18 04:44:38 zfs01 genunix: [ID 904073 kern.notice]  done
May 18 04:44:39 zfs01 genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
May 18 04:44:39 zfs01 ahci: [ID 405573 kern.info] NOTICE: ahci0: ahci_tran_reset_dport port 1 reset port
May 18 05:17:56 zfs01 genunix: [ID 100000 kern.notice]
May 18 05:17:56 zfs01 genunix: [ID 665016 kern.notice] ^M100% done: 8607621 pages dumped,
May 18 05:17:56 zfs01 genunix: [ID 851671 kern.notice] dump succeeded

The disks are all 4TB WD40001FYYG enterprise SAS drives. Googling seems to indicate it is a known problem with the way the various subsystems sometimes interact. Is there any way to fix/workaround this issue?

From danmcd at omniti.com  Mon May 18 18:33:17 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 18 May 2015 14:33:17 -0400
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
Message-ID:

> On May 18, 2015, at 2:25 PM, Jeff Stockett wrote:
>
> A drive failed in one of our supermicro 5048R-E1CR36L servers running omnios r151012 last night, and somewhat unexpectedly, the whole system seems to have panicked.

The panic was done for protection of your pool:

> May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' appears to be hung.

>
> The disks are all 4TB WD40001FYYG enterprise SAS drives. Googling seems to indicate it is a known problem with the way the various subsystems sometimes interact. Is there any way to fix/workaround this issue?

Pull the drive. I'm assuming you have a raidz or mirrored setup where you can do that, right?

Or is it a question of finding *which* drive failed?
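
If it is a question of finding which drive failed, the target WWN in the mpt_sas "Log info" message is one starting point. The following is a minimal sketch, assuming a POSIX shell with sed; `target_wwn` is a hypothetical helper name, and how the WWN maps onto a device name in `zpool status` or `iostat -En` output depends on the setup, so that step is left as an assumption.

```shell
#!/bin/sh
# Hypothetical helper: read syslog text on stdin and print the 16-digit
# target WWN from each mpt_sas "Log info ... received for target" line.
target_wwn() {
    sed -n 's/.*received for target [0-9][0-9]* w\([0-9a-fA-F]\{16\}\).*/\1/p'
}

# Example using the line quoted earlier in this thread.
echo "May 18 04:43:08 zfs01 Log info 0x31140000 received for target 29 w50000c0f01f1bf06." | target_wwn

# On a live system, counting messages per target could look like this
# (kept as a comment because it needs the real log file):
#   target_wwn < /var/adm/messages | sort | uniq -c
```

A target that shows up far more often than the rest is a reasonable first suspect, though the count alone does not prove the drive is the faulty component.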

Dan

From illumos at cucumber.demon.co.uk  Mon May 18 18:59:16 2015
From: illumos at cucumber.demon.co.uk (Andrew Gabriel)
Date: Mon, 18 May 2015 19:59:16 +0100
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To:
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
Message-ID: <555A3684.2020409@cucumber.demon.co.uk>

Dan McDonald wrote:
>> On May 18, 2015, at 2:25 PM, Jeff Stockett wrote:
>>
>> A drive failed in one of our supermicro 5048R-E1CR36L servers running omnios r151012 last night, and somewhat unexpectedly, the whole system seems to have panicked.
>>
>
> The panic was done for protection of your pool:
>
>
>> May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' appears to be hung.
>>
>
>> The disks are all 4TB WD40001FYYG enterprise SAS drives. Googling seems to indicate it is a known problem with the way the various subsystems sometimes interact. Is there any way to fix/workaround this issue?
>>
>
> Pull the drive. I'm assuming you have a raidz or mirrored setup where you can do that, right?
>
> Or is it a question of finding *which* drive failed?
>

Must admit I haven't played with this since the protection against no TX commits completing for a while went in, but I would have expected FMA would have faulted out the disk to prevent hanging the pool, unless there was no redundancy for the top level vdev it's in?

Would be interesting to know what the pool layout and state was.

--
Andrew

From jstockett at molalla.com  Mon May 18 19:01:46 2015
From: jstockett at molalla.com (Jeff Stockett)
Date: Mon, 18 May 2015 19:01:46 +0000
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To:
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
Message-ID: <136C13E89D22BB468B2A7025993639732F527113@EXMCCMB.molalla.com>

Hi Dan,

The pool is made up of 36 disks - 6 x 6 raidz2 vdevs with some SSDs for l2arc and slog.
I already replaced the drive and the rebuild is nearly done, but I was mostly curious why a disk failure would cause a reboot. I get that it was apparently hanging the pool, and that, according to some posts I read, the developers seem to think it is better to panic/dump/reboot than to leave it hung until someone notices, but wouldn't it really be better just to drop the failed drive out of the array? Is it because the system in question is using a SAS expander, or is this only expected behavior sometimes, depending on how the drive fails? I guess I might expect this with consumer grade SATA drives, but wasn't expecting it with $$$ enterprise SAS drives.

Thanks, Jeff

-----Original Message-----
From: Dan McDonald [mailto:danmcd at omniti.com]
Sent: Monday, May 18, 2015 11:33 AM
To: Jeff Stockett
Cc: omnios-discuss
Subject: Re: [OmniOS-discuss] disk failure causing reboot?

> On May 18, 2015, at 2:25 PM, Jeff Stockett wrote:
>
> A drive failed in one of our supermicro 5048R-E1CR36L servers running omnios r151012 last night, and somewhat unexpectedly, the whole system seems to have panicked.

The panic was done for protection of your pool:

> May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' appears to be hung.

>
> The disks are all 4TB WD40001FYYG enterprise SAS drives. Googling seems to indicate it is a known problem with the way the various subsystems sometimes interact. Is there any way to fix/workaround this issue?

Pull the drive. I'm assuming you have a raidz or mirrored setup where you can do that, right?

Or is it a question of finding *which* drive failed?

Dan

From danmcd at omniti.com  Mon May 18 19:09:17 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 18 May 2015 15:09:17 -0400
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <136C13E89D22BB468B2A7025993639732F527113@EXMCCMB.molalla.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
 <136C13E89D22BB468B2A7025993639732F527113@EXMCCMB.molalla.com>
Message-ID: <9964F883-77F7-4159-B704-5DB7CC57A1E6@omniti.com>

> On May 18, 2015, at 3:01 PM, Jeff Stockett wrote:
>
> Hi Dan,
>
> The pool is made up of 36 disks - 6 x 6 raidz2 vdevs with some SSDs for l2arc and slog. I already replaced the drive and the rebuild is nearly done, but I was mostly curious why a disk failure would cause a reboot? I get that it was apparently hanging the pool up, and that according to some posts I read the developers seem to think it is better the panic/dump/reboot than leave it hung until someone notices, but wouldn't it really be better just to drop the failed drive out of the array? Is it because the system in question is using a SAS expander or is this only expected behavior sometimes depending on how the drive fails? I guess I might expect this with consumer grade SATA drives, but wasn't expecting it with $$$ enterprise SAS drives.

$$$ SAS drives *should* tickle FMA as Andrew G. was saying. I've heard expanders can complicate things, but I'm not enough of a storage guru to address that directly (I will say that SATA drives + expanders == disaster but you know that already).

There are more storage-informed people on this list, and they may have more insight than I.

Thanks,
Dan

From illumos at cucumber.demon.co.uk  Mon May 18 19:37:22 2015
From: illumos at cucumber.demon.co.uk (Andrew Gabriel)
Date: Mon, 18 May 2015 20:37:22 +0100
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <9964F883-77F7-4159-B704-5DB7CC57A1E6@omniti.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
 <136C13E89D22BB468B2A7025993639732F527113@EXMCCMB.molalla.com>
 <9964F883-77F7-4159-B704-5DB7CC57A1E6@omniti.com>
Message-ID: <555A3F72.2080302@cucumber.demon.co.uk>

Dan McDonald wrote:
>> On May 18, 2015, at 3:01 PM, Jeff Stockett wrote:
>>
>> Hi Dan,
>>
>> The pool is made up of 36 disks - 6 x 6 raidz2 vdevs with some SSDs for l2arc and slog. I already replaced the drive and the rebuild is nearly done, but I was mostly curious why a disk failure would cause a reboot? I get that it was apparently hanging the pool up, and that according to some posts I read the developers seem to think it is better the panic/dump/reboot than leave it hung until someone notices, but wouldn't it really be better just to drop the failed drive out of the array? Is it because the system in question is using a SAS expander or is this only expected behavior sometimes depending on how the drive fails? I guess I might expect this with consumer grade SATA drives, but wasn't expecting it with $$$ enterprise SAS drives.
>>
>
> $$$ SAS drives *should* tickle FMA as Andrew G. was saying. I've heard expanders can complicate things, but I'm not enough of a storage guru to address that directly (I will say that SATA drives + expanders == disaster but you know that already).
>
> There are more storage-informed people on this list, and they may have more insight than I.
>

Might be worth looking at fmdump output, to see what FMA made of the disk error at 04:43:08.

--
Andrew

From henson at acm.org  Mon May 18 20:08:48 2015
From: henson at acm.org (Paul B. Henson)
Date: Mon, 18 May 2015 13:08:48 -0700
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
Message-ID: <20150518200848.GH3720@bender.unx.cpp.edu>

On Mon, May 18, 2015 at 06:25:34PM +0000, Jeff Stockett wrote:
> A drive failed in one of our supermicro 5048R-E1CR36L servers running
> omnios r151012 last night, and somewhat unexpectedly, the whole system
> seems to have panicked.

You don't happen to have failmode set to panic on the pool?

From the zpool manpage:

     failmode=wait | continue | panic
         Controls the system behavior in the event of catastrophic pool
         failure. This condition is typically a result of a loss of
         connectivity to the underlying storage device(s) or a failure of
         all devices within the pool. The behavior of such an event is
         determined as follows:

         wait
             Blocks all I/O access until the device connectivity is
             recovered and the errors are cleared. This is the
             default behavior.

         continue
             Returns EIO to any new write I/O requests but allows
             reads to any of the remaining healthy devices. Any
             write requests that have yet to be committed to disk
             would be blocked.

         panic
             Prints out a message to the console and generates a
             system crash dump.

From chip at innovates.com  Mon May 18 20:30:34 2015
From: chip at innovates.com (Schweiss, Chip)
Date: Mon, 18 May 2015 15:30:34 -0500
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <20150518200848.GH3720@bender.unx.cpp.edu>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
 <20150518200848.GH3720@bender.unx.cpp.edu>
Message-ID:

I had the exact same failure mode last week. With over 1000 spindles I see this about once a month.

I can publish my dump also if anyone actually wants to try to fix this problem, but I think there are several of the same thing already linked to tickets in illumos-gate.

Pools for the most part should be set to failmode=panic or wait, but a failed disk should not cause a panic.
On the system where this happened to me, failmode was set to wait. It is also on r151012, waiting on a window to upgrade to r151014. My pool is raidz3, so there is no reason not to kick out a bad disk. All my disks are SAS in DataON JBODs, dual connected across two LSI HBAs.

BTW, pull a SAS cable and you get a panic too, not degraded multipath.

Illumos seems to panic on just about any SAS event these days, regardless of redundancy.

-Chip

On Mon, May 18, 2015 at 3:08 PM, Paul B. Henson wrote:

> On Mon, May 18, 2015 at 06:25:34PM +0000, Jeff Stockett wrote:
> > A drive failed in one of our supermicro 5048R-E1CR36L servers running
> > omnios r151012 last night, and somewhat unexpectedly, the whole system
> > seems to have panicked.
>
> You don't happen to have failmode set to panic on the pool?
>
> From the zpool manpage:
>
>      failmode=wait | continue | panic
>          Controls the system behavior in the event of catastrophic pool
>          failure. This condition is typically a result of a loss of
>          connectivity to the underlying storage device(s) or a failure of
>          all devices within the pool. The behavior of such an event is
>          determined as follows:
>
>          wait
>              Blocks all I/O access until the device connectivity
> is
>              recovered and the errors are cleared. This is the
>              default behavior.
>
>          continue
>              Returns EIO to any new write I/O requests but allows
>              reads to any of the remaining healthy devices. Any
>              write requests that have yet to be committed to disk
>              would be blocked.
>
>          panic
>              Prints out a message to the console and generates a
>              system crash dump.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jstockett at molalla.com  Mon May 18 20:33:33 2015
From: jstockett at molalla.com (Jeff Stockett)
Date: Mon, 18 May 2015 20:33:33 +0000
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To: <20150518200848.GH3720@bender.unx.cpp.edu>
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com>
 <20150518200848.GH3720@bender.unx.cpp.edu>
Message-ID: <136C13E89D22BB468B2A7025993639732F52738E@EXMCCMB.molalla.com>

The pool is set to failmode=wait. Looking at the fmdump -e and fmdump -eV output, it looks like the drive started having media/disk/transport errors around 3:40am, which eventually culminated in the reboot around 6:18am. The funny thing is that driver-assessment = fatal was returned 42 times on the same device in that period, so I'm not quite sure why it didn't just drop the drive, because the documentation says:

Note: An ereport with the value driver-assessment = fatal results in the fault being propagated.

It appears it didn't drop the drive until after it rebooted. I can upload the crash dump and/or fmdump output if anyone is interested.

Thanks, Jeff

-----Original Message-----
From: Paul Henson [mailto:paul.b.henson at gmail.com] On Behalf Of Paul B. Henson
Sent: Monday, May 18, 2015 1:09 PM
To: Jeff Stockett
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] disk failure causing reboot?

On Mon, May 18, 2015 at 06:25:34PM +0000, Jeff Stockett wrote:
> A drive failed in one of our supermicro 5048R-E1CR36L servers running
> omnios r151012 last night, and somewhat unexpectedly, the whole system
> seems to have panicked.

You don't happen to have failmode set to panic on the pool?

From the zpool manpage:

     failmode=wait | continue | panic
         Controls the system behavior in the event of catastrophic pool
         failure. This condition is typically a result of a loss of
         connectivity to the underlying storage device(s) or a failure of
         all devices within the pool.
         The behavior of such an event is determined as follows:

         wait
             Blocks all I/O access until the device connectivity is
             recovered and the errors are cleared. This is the
             default behavior.

         continue
             Returns EIO to any new write I/O requests but allows
             reads to any of the remaining healthy devices. Any
             write requests that have yet to be committed to disk
             would be blocked.

         panic
             Prints out a message to the console and generates a
             system crash dump.

From danmcd at omniti.com  Mon May 18 20:38:56 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 18 May 2015 16:38:56 -0400
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
Message-ID: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>

Now this isn't a gcc update for illumos/illumos-omnios... that way is full of pain, and I'll wait for now.

OTOH, we've transitioned gcc before going into r151008 with 4.8.1.

My question to you all is this: To which gcc version should we jump? I see two viable candidates:

- gcc 4.9.2 (last updated October 2014)

or

- gcc 5.1 (last updated April 2015)

The current gcc "development" is happening on 6.0, and we're not ready for that.

I appreciate feedback. I'll be making a decision soon, as I hope to land a compiler upgrade as the major push for this bloody cycle and r151016.

Thanks,
Dan

From vab at bb-c.de  Mon May 18 20:54:43 2015
From: vab at bb-c.de (Volker A. Brandt)
Date: Mon, 18 May 2015 22:54:43 +0200
Subject: [OmniOS-discuss] Can't update bloody
In-Reply-To: <5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com>
References: <21832.58196.941714.304987@glaurung.bb-c.de>
 <5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com>
Message-ID: <21850.20883.162713.422803@glaurung.bb-c.de>

Hi Dan!

> I'll be updating the whole wad of bloody later this week. Can y'all
> wait a couple of days? I want to include some illumos updates that
> I'm about to push this afternoon.

So after that push, everything was fine with the bloody repo.
However, it seems that the original problem Chavdar and I were seeing has reappeared today:

# /usr/bin/pkgrecv -s http://pkg.omniti.com/omnios/bloody/ -d /pkg/omnios-151015 '*'
Processing packages for publisher omnios ...
Retrieving and evaluating 3057 package(s)...
Download Manifests (1634/3057) |pkgrecv: http protocol error: code: 404 reason: Not Found
URL: 'http://pkg.omniti.com/omnios/bloody/omnios/manifest/0/package%2Fpkg@0.5.11%2C5.11-0.151015%3A20150422T144502Z' (happened 4 times)

Is it just me?

Regards -- Volker
--
------------------------------------------------------------------------
Volker A. Brandt           Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From vab at bb-c.de  Mon May 18 20:56:44 2015
From: vab at bb-c.de (Volker A. Brandt)
Date: Mon, 18 May 2015 22:56:44 +0200
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
In-Reply-To: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
Message-ID: <21850.21004.799624.152739@glaurung.bb-c.de>

> My question to you all is this: To which gcc version should we jump?
> I see two viable candidates:
>
> - gcc 4.9.2 (last updated October 2014)
>
> or
>
> - gcc 5.1 (last updated April 2015)

Naïvely, shouldn't the newer be better? Less work during the next version jump...

Regards -- Volker
--
------------------------------------------------------------------------
Volker A. Brandt           Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A.
Brandt

"When logic and proportion have fallen sloppy dead"

From eric.sproul at circonus.com  Mon May 18 21:00:05 2015
From: eric.sproul at circonus.com (Eric Sproul)
Date: Mon, 18 May 2015 17:00:05 -0400
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
In-Reply-To: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com>
Message-ID:

On Mon, May 18, 2015 at 4:38 PM, Dan McDonald wrote:
> My question to you all is this: To which gcc version should we jump? I see two viable candidates:
>
> - gcc 4.9.2 (last updated October 2014)
>
> or
>
> - gcc 5.1 (last updated April 2015)
>
> The current gcc "development" is happening on 6.0, and we're not ready for that.

It should be noted that due to a version scheme change (https://gcc.gnu.org/develop.html#timeline) 5.1 is what would have been 4.10. It's the first stable release in a new major series (5), with development moving to 6, as Dan pointed out.

Eric

From danmcd at omniti.com  Mon May 18 21:03:55 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 18 May 2015 17:03:55 -0400
Subject: [OmniOS-discuss] Can't update bloody
In-Reply-To: <21850.20883.162713.422803@glaurung.bb-c.de>
References: <21832.58196.941714.304987@glaurung.bb-c.de>
 <5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com>
 <21850.20883.162713.422803@glaurung.bb-c.de>
Message-ID: <389521B5-A1BD-4FFB-A457-CD31F421625E@omniti.com>

I am indeed seeing this problem. I'm not sure how to fix it (or how it got that way in the first place).

The only other thing I can recommend is the "-m latest" option to pkgrecv, new in '014 and later, which only gets you the LATEST version(s) of the packages:

nowhere(~/junk)[0]% /usr/bin/pkgrecv -s http://pkg.omniti.com/omnios/bloody/ -d 015.repo '*'
Processing packages for publisher omnios ...
Retrieving and evaluating 3057 package(s)...
Download Manifests (1634/3057) /pkgrecv: http protocol error: code: 404 reason: Not Found URL: 'http://pkg.omniti.com/omnios/bloody/omnios/manifest/0/package%2Fpkg at 0.5.11%2C5.11-0.151015%3A20150422T144502Z' (happened 4 times) nowhere(~/junk)[1]% ls 015.repo/ nowhere(~/junk)[0]% /bin/rm -rf 015.repo/ nowhere(~/junk)[0]% pkgrepo create 015.repo nowhere(~/junk)[0]% /usr/bin/pkgrecv -m latest -s http://pkg.omniti.com/omnios/bloody/ -d 015.repo '*' Processing packages for publisher omnios ... Retrieving and evaluating 1018 package(s)... PROCESS ITEMS GET (MB) SEND (MB) SUNWcs 66/1018 10/1010 15/2945.....(IN PROGRESS) Hope this helps, Dan From eric.sproul at circonus.com Mon May 18 21:04:22 2015 From: eric.sproul at circonus.com (Eric Sproul) Date: Mon, 18 May 2015 17:04:22 -0400 Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX? In-Reply-To: <21850.21004.799624.152739@glaurung.bb-c.de> References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com> <21850.21004.799624.152739@glaurung.bb-c.de> Message-ID: On Mon, May 18, 2015 at 4:56 PM, Volker A. Brandt wrote: > Na?vely, shouldn't the newer be better? Less work during the the next > version jump... The very first thing on the 5-series changes list (https://gcc.gnu.org/gcc-5/changes.html): * The default mode for C is now -std=gnu11 instead of -std=gnu89. I have no problem with that, but it *may* cause us some heartburn. Generally speaking though, I would vote for 5.1, mainly because of support for new/upcoming CPU instructions and optimization improvements. Eric From hasslerd at gmx.li Mon May 18 21:09:01 2015 From: hasslerd at gmx.li (Dominik Hassler) Date: Mon, 18 May 2015 23:09:01 +0200 Subject: [OmniOS-discuss] disk failure causing reboot? 
In-Reply-To: <136C13E89D22BB468B2A7025993639732F527113@EXMCCMB.molalla.com> References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com> <136C13E89D22BB468B2A7025993639732F527113@EXMCCMB.molalla.com> Message-ID: <555A54ED.9080609@gmx.li> Jeff, I have them WD40001FYYG drives in my home server but just as a simple mirror. AFAIK those drives are equivalent to the SATA WD Re 4GB drives but just w/ a SAS controller instead a SATA controller on top and just a little more expensive than their SATA equivalents... I have no real facts but I assume that these SAS drives (they call them "nearline SAS") are not 100% like "real" SAS drives... E.g. they don't run automated background scans, that's what I observed. In what extent they differ from "real" SAS drives, I don't know. On 05/18/2015 09:01 PM, Jeff Stockett wrote: > Hi Dan, > > The pool is made up of 36 disks - 6 x 6 raidz2 vdevs with some SSDs for l2arc and slog. I already replaced the drive and the rebuild is nearly done, but I was mostly curious why a disk failure would cause a reboot? I get that it was apparently hanging the pool up, and that according to some posts I read the developers seem to think it is better the panic/dump/reboot than leave it hung until someone notices, but wouldn't it really be better just to drop the failed drive out of the array? Is it because the system in question is using a SAS expander or is this only expected behavior sometimes depending on how the drive fails? I guess I might expect this with consumer grade SATA drives, but wasn't expecting it with $$$ enterprise SAS drives. > > Thanks, Jeff > > -----Original Message----- > From: Dan McDonald [mailto:danmcd at omniti.com] > Sent: Monday, May 18, 2015 11:33 AM > To: Jeff Stockett > Cc: omnios-discuss > Subject: Re: [OmniOS-discuss] disk failure causing reboot? 
> > >> On May 18, 2015, at 2:25 PM, Jeff Stockett wrote: >> >> A drive failed in one of our supermicro 5048R-E1CR36L servers running omnios r151012 last night, and somewhat unexpectedly, the whole system seems to have panicked. > > The panic was done for protection of your pool: > >> May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' appears to be hung. > > > >> >> The disks are all 4TB WD40001FYYG enterprise SAS drives. Googling seems to indicate it is a known problem with the way the various subsystems sometimes interact. Is there any way to fix/workaround this issue? > > Pull the drive. I'm assuming you have a raidz or mirrored setup where you can do that, right? Or is it a question of finding *which* drive failed? > > Dan > > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > From dain.bentley at gmail.com Mon May 18 21:18:15 2015 From: dain.bentley at gmail.com (Dain Bentley) Date: Mon, 18 May 2015 17:18:15 -0400 Subject: [OmniOS-discuss] Poor performance on writes Zraid Message-ID: Hello all, I have a RaidZ setup with 5 disks and rad performance is good. I have no ZIL pool and 8 GB or ECC Ram. Writes are like 2 MB a second with a 1GB network. I'm pulling faster writes on a similar drive in a windows VM over CIFS on VMware. My OmniOS box is bare metal. Any tips on speeding this up? -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim at multitalents.net Mon May 18 22:31:53 2015 From: tim at multitalents.net (Tim Rice) Date: Mon, 18 May 2015 15:31:53 -0700 (PDT) Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX? In-Reply-To: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com> References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com> Message-ID: On Mon, 18 May 2015, Dan McDonald wrote: > Now this isn't a gcc update for illumos/illumos-omnios... 
that way is full of pain, and I'll wait for now. > > OTOH, we've transitioned gcc before going into r151008 with 4.8.1. > > My question to you all is this: To which gcc version should we jump? I see two viable candidates: Any reason not to consider CLANG instead of GCC? > > - gcc 4.9.2 (last updated October 2014) > or > - gcc 5.1 (last updated April 2015) > > The current gcc "development" is happening on 6.0, and we're not ready for that. > > I appreciate feedback. I'll be making a decision soon, as I hope to land a compiler upgrade as the major push for this bloody cycle and r151016. > > Thanks, > Dan > -- Tim Rice Multitalents tim at multitalents.net From danmcd at omniti.com Mon May 18 22:46:24 2015 From: danmcd at omniti.com (Dan McDonald) Date: Mon, 18 May 2015 18:46:24 -0400 Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX? In-Reply-To: References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com> Message-ID: <902CDD21-2BED-4ADA-AFF6-84660DCF942D@omniti.com> > On May 18, 2015, at 6:31 PM, Tim Rice wrote: >> >> My question to you all is this: To which gcc version should we jump? I see two viable candidates: > > Any reason not to consider CLANG instead of GCC? Completely new beast and potential for least-surprise. I can imagine CLANG/LLVM showing up *alongside* gcc in some future release, but not as an outright replacement. Not yet. And remember -- these are just for building non-illumos stuff. illumos is still being built with the specially-modified gcc4.4.4. (Though I wouldn't mind if someone spent time bringing up LLVM/CLANG to build illumos... it would just be really REALLY hard.) Dan From jdg117 at elvis.arl.psu.edu Tue May 19 01:23:04 2015 From: jdg117 at elvis.arl.psu.edu (John D Groenveld) Date: Mon, 18 May 2015 21:23:04 -0400 Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX? In-Reply-To: Your message of "Mon, 18 May 2015 15:31:53 PDT." 
References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com> Message-ID: <201505190123.t4J1N4qc029517@elvis.arl.psu.edu> In message , Tim Rice writes: >Any reason not to consider CLANG instead of GCC? Does anyone have a build recipe for LLVM/clang on OmniOS? I'm about to try that path to build Google's V8. John groenveld at acm.org From danmcd at omniti.com Tue May 19 01:37:51 2015 From: danmcd at omniti.com (Dan McDonald) Date: Mon, 18 May 2015 21:37:51 -0400 Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX? In-Reply-To: <201505190123.t4J1N4qc029517@elvis.arl.psu.edu> References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com> <201505190123.t4J1N4qc029517@elvis.arl.psu.edu> Message-ID: <614F6199-6E65-492B-B5E7-3CDBB6208512@omniti.com> ISTR there being an old pull request in omnios-build for it. I can't just take something like that in, but it may serve your needs. Dan Sent from my iPhone (typos, autocorrect, and all) > On May 18, 2015, at 9:23 PM, John D Groenveld wrote: > > In message , > Tim Rice writes: >> Any reason not to consider CLANG instead of GCC? > > Does anyone have a build recipe for LLVM/clang on OmniOS? > I'm about to try that path to build Google's V8. > > John > groenveld at acm.org > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss From jimklimov at cos.ru Tue May 19 04:45:43 2015 From: jimklimov at cos.ru (Jim Klimov) Date: Tue, 19 May 2015 06:45:43 +0200 Subject: [OmniOS-discuss] Poor performance on writes Zraid In-Reply-To: References: Message-ID: <6A0DDFC7-B659-4A5F-B464-607AC6104006@cos.ru> 18 ??? 2015??. 23:18:15 CEST, Dain Bentley ?????: >Hello all, I have a RaidZ setup with 5 disks and rad performance is >good. >I have no ZIL pool and 8 GB or ECC Ram. Writes are like 2 MB a second >with >a 1GB network. I'm pulling faster writes on a similar drive in a >windows >VM over CIFS on VMware. 
My OmniOS box is bare metal. Any tips on
>speeding
>this up?

Do you have dedup enabled? (Dedup is pretty slow: every write needs lots of metadata reads, and with little RAM and no L2ARC that hurts badly.)

Also, very full pools (a vague definition that depends on the history of the writes - generally 80% as a rule of thumb, though pathologies can appear after 50% for some and 95%+ for others) - these can have very fragmented and small 'holes' in free space, which impacts write speeds (writes become more random, and it takes more time to find an available location for a block).

You can also look at 'iostat -Xnz 1' output to see the i/o values per active device. You are interested in reads/sec+writes/sec (hdds can serve about 200 ops/sec total, unless they happen to be small requests to sequentially placed sector numbers - in theory you might be lucky to see even 20000 iops in such a favorable case, in practice about 500 is not uncommon since related block locations in zfs are often coalesced). In iostat you'd also watch %b (busy) and %w (wait) to see if some disks have very different performance than others (e.g. one has internal problems and sector relocations to spare areas, or flaky cabling and many protocol re-requests involved in successful ops). svc_t (service times) and queue lengths can also be useful.

You can get similar info with 'zpool iostat -v 1' as well, though interactions between pool io's and component vdev io's may be tricky to compare between raidz and mirror for example. You might be more interested in averaged differences (maybe across larger time ranges) between these two iostats - e.g. if you have some other io's than those from the pool (say, a raw swap partition).
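A minimal sketch of the per-disk comparison described above, added for illustration and not part of the original message: the Python below parses text in the 'iostat -xn' column layout and flags devices whose active service time (asvc_t) is far above the median of the group. The sample numbers, the device names, the function names and the 3x threshold are all made up; substitute output captured on your own system.

```python
import statistics

# Made-up sample in the column layout of illumos `iostat -xn`; the numbers
# are illustrative only, not measurements from the system in this thread.
SAMPLE = """\
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   12.0   85.0  310.0 1900.0  0.0  0.9    0.0    9.8   0  41 c1t0d0
   11.0   84.0  305.0 1880.0  0.0  0.9    0.0   10.1   0  43 c1t1d0
   13.0   86.0  320.0 1910.0  0.0  1.0    0.0   10.4   0  44 c1t2d0
   12.0   83.0  300.0 1870.0  0.0  0.9    0.0    9.9   0  42 c1t3d0
    9.0   40.0  150.0  700.0  0.0  6.5    0.0  132.7   0  99 c1t4d0
"""

COLUMNS = ["r/s", "w/s", "kr/s", "kw/s", "wait", "actv",
           "wsvc_t", "asvc_t", "%w", "%b"]


def parse_iostat(text):
    """Return one dict per device line; banner and header lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) + 1:
            continue  # banner or blank line
        try:
            values = [float(f) for f in fields[:-1]]
        except ValueError:
            continue  # the column-header line
        row = dict(zip(COLUMNS, values))
        row["device"] = fields[-1]
        rows.append(row)
    return rows


def slow_disks(rows, factor=3.0):
    """Devices whose active service time exceeds factor x the group median."""
    median = statistics.median(r["asvc_t"] for r in rows)
    return [r["device"] for r in rows if r["asvc_t"] > factor * median]


if __name__ == "__main__":
    rows = parse_iostat(SAMPLE)
    for r in rows:
        print(r["device"], "ops/s=%.0f" % (r["r/s"] + r["w/s"]),
              "asvc_t=%.1f" % r["asvc_t"], "%%b=%.0f" % r["%b"])
    print("suspect:", slow_disks(rows))
```

Comparing against the median rather than a fixed number keeps the check usable on pools with different disk types, but it assumes most disks in the group are healthy; a single sample is also noisy, so averaging several intervals first is probably wiser.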
Finally, consider dtrace-toolkit's and Richard Elling's scripts to sniff what logical (file/vdev) operations you have - and see how these numbers compare to those in pool i/o's, at least to an order of magnitude. The difference can be metadata ops, or something else.

Hope this helps get you started,
Jim Klimov
--
Typos courtesy of K-9 Mail on my Samsung Android

From vab at bb-c.de Tue May 19 06:16:39 2015
From: vab at bb-c.de (Volker A. Brandt)
Date: Tue, 19 May 2015 08:16:39 +0200
Subject: [OmniOS-discuss] Can't update bloody
In-Reply-To: <389521B5-A1BD-4FFB-A457-CD31F421625E@omniti.com>
References: <21832.58196.941714.304987@glaurung.bb-c.de> <5ECE2D45-CBD8-49AB-8F7A-DB138B6E9C3B@omniti.com> <21850.20883.162713.422803@glaurung.bb-c.de> <389521B5-A1BD-4FFB-A457-CD31F421625E@omniti.com>
Message-ID: <21850.54599.267449.146761@glaurung.bb-c.de>

> The only other thing I can recommend is the new for '014 and later
> "-m latest" option to pkgrecv, which only gets you the LATEST
> version(s) of the packages:

Good point. Works. Thanks!

Regards -- Volker
--
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From rt at steait.net Tue May 19 09:19:17 2015
From: rt at steait.net (Rune Tipsmark)
Date: Tue, 19 May 2015 09:19:17 +0000
Subject: [OmniOS-discuss] disk failure causing reboot?
In-Reply-To:
References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com> <20150518200848.GH3720@bender.unx.cpp.edu>
Message-ID:

Same issue here around two months ago when an L2ARC device failed;
failmode was default and the device was actually an mSata SSD mounted in a PCI-E mSata card: http://www.addonics.com/products/ad4mspx2.php and the disk was one of four of these http://www.samsung.com/us/computer/memory-storage/MZ-MTE1T0BW Can these reboots be avoided in any way? Br, Rune From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Schweiss, Chip Sent: Monday, May 18, 2015 10:31 PM To: Paul B. Henson Cc: omnios-discuss Subject: Re: [OmniOS-discuss] disk failure causing reboot? I had the exact same failure mode last week. With over 1000 spindles I see this about once a month. I can publish my dump also if anyone actually want's to try to fix this problem, but I think there are several of the same thing already linked to tickets in Illumos-gate. Pools for the most part should be set to failmode=panic or wait, but a failed disk should not cause a panic. The system this happened to me on failmode was set to wait. It is also on r151012, waiting on a window to upgrade to r151014. My pool is raidz3, so no reason not to kick a bad disk. All my disks are SAS in DataON JBODs, dual connected across two LSI HBAs. BTW, pull a SAS cable and you get a panic too, not degraded multipath. Illumos seems to panic on just about any SAS event these days regardless of redundancy. -Chip On Mon, May 18, 2015 at 3:08 PM, Paul B. Henson > wrote: On Mon, May 18, 2015 at 06:25:34PM +0000, Jeff Stockett wrote: > A drive failed in one of our supermicro 5048R-E1CR36L servers running > omnios r151012 last night, and somewhat unexpectedly, the whole system > seems to have panicked. You don't happen to have failmode set to panic on the pool? From the zpool manpage: failmode=wait | continue | panic Controls the system behavior in the event of catastrophic pool failure. This condition is typically a result of a loss of connectivity to the underlying storage device(s) or a failure of all devices within the pool. 
The behavior of such an event is determined as follows:

wait      Blocks all I/O access until the device connectivity is recovered and the errors are cleared. This is the default behavior.

continue  Returns EIO to any new write I/O requests but allows reads to any of the remaining healthy devices. Any write requests that have yet to be committed to disk would be blocked.

panic     Prints out a message to the console and generates a system crash dump.

From dain.bentley at gmail.com Tue May 19 11:09:50 2015
From: dain.bentley at gmail.com (Dain Bentley)
Date: Tue, 19 May 2015 07:09:50 -0400
Subject: [OmniOS-discuss] Poor performance on writes Zraid
In-Reply-To: <6A0DDFC7-B659-4A5F-B464-607AC6104006@cos.ru>
References: <6A0DDFC7-B659-4A5F-B464-607AC6104006@cos.ru>
Message-ID:

Thanks for the help guys. Integrated CIFS. Reads are fast. The pool is about 60% full only. Thanks for the tips! I'll try iostat to sniff this out.

From dain.bentley at gmail.com Tue May 19 11:10:27 2015
From: dain.bentley at gmail.com (Dain Bentley)
Date: Tue, 19 May 2015 07:10:27 -0400
Subject: [OmniOS-discuss] Poor performance on writes Zraid
In-Reply-To: <6A0DDFC7-B659-4A5F-B464-607AC6104006@cos.ru>
References: <6A0DDFC7-B659-4A5F-B464-607AC6104006@cos.ru>
Message-ID:

And no dedup.

From jdg117 at elvis.arl.psu.edu Tue May 19 16:28:18 2015
From: jdg117 at elvis.arl.psu.edu (John D Groenveld)
Date: Tue, 19 May 2015 12:28:18 -0400
Subject: [OmniOS-discuss] Query - Update gcc48 to gccXX, which XX?
In-Reply-To: Your message of "Mon, 18 May 2015 21:23:04 EDT."
<201505190123.t4J1N4qc029517@elvis.arl.psu.edu> References: <7A8D6F1D-38A0-4DC4-AD24-19BA876FB13D@omniti.com> <201505190123.t4J1N4qc029517@elvis.arl.psu.edu> Message-ID: <201505191628.t4JGSISi015918@elvis.arl.psu.edu> In message <201505190123.t4J1N4qc029517 at elvis.arl.psu.edu>, John D Groenveld writes: >Does anyone have a build recipe for LLVM/clang on OmniOS? LLVM depends on CMake and Python-2.7.9. Both build easily with stock gcc-4.8.1. John groenveld at acm.org From mtalbott at lji.org Tue May 19 17:36:19 2015 From: mtalbott at lji.org (Michael Talbott) Date: Tue, 19 May 2015 10:36:19 -0700 Subject: [OmniOS-discuss] Samba Performance Message-ID: <46496B91-013E-4940-BECB-B167D979509E@lji.org> Hi all. I've been transitioning a file server to OmniOS for many reasons (abandoning zfsonlinux). But I seem to have one last issue I'd like to resolve. It seems to be running into a performance issue with samba. I'm not using the built in zfs smb sharing because I need more flexibility in our environment than it offers such as sharing subfolders, shadow copies, etc. I'm running the latest LTS version of Omni on a well equipped server with 2x Xeon E5-2630 v2 @ 2.60GHz, 128G ECC ram with dual intel 10g NICs, and two dual LSI 6G SAS cards. The zpool has 8 x raidz2s that have 10 4TB drives in each totaling 290TB. Disk performance is not an issue. I've compiled samba 4.2.1 and netatalk from source, got winbind working nicely with our AD environment. Samba is even happily kerberized. Everything authenticates and functions correctly. But, while netatalk gives me line speed performance (120-150MB/s on a gigabit workstation), samba won't budge above 40-60MB/s (same speeds using MacOS and Windows 7,8,2012 clients). Using the same hardware on CentOS 7 with zfsonlinux, samba gives me just about the same throughput as netatalk. In linux, I could tune it with socket options giving it a bigger buffer and it made a big difference. 
But using the same options on Omni doesn't seem to have any significant affect (actually seems to slow it down a bit). In smb.conf on both OSs: socket options = TCP_NODELAY IPTOS_LOWDELAY SO_SNDBUF=2097152 SO_RCVBUF=2097152 In CentOS7 in sysctl.conf: net.ipv4.tcp_rmem = 10000000 10000000 10000000 net.ipv4.tcp_wmem = 10000000 10000000 10000000 net.ipv4.tcp_mem = 10000000 10000000 10000000 net.ipv4.tcp_sack = 0 net.core.rmem_max = 524287 net.core.wmem_max = 524287 net.core.rmem_default = 524287 net.core.wmem_default = 524287 net.core.optmem_max = 524287 net.core.netdev_max_backlog = 300000 And then in Omni, I've set these ip properties root at store3:# ipadm show-prop PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE tcp max_buf rw 16777216 16777216 1048576 8192-1073741824 tcp recv_buf rw 16777216 16777216 128000 2048-16777216 tcp send_buf rw 16777216 16777216 49152 4096-16777216 I just can't get samba on Omni to go any faster than 60MB/s. I've tried adjusting those buffers, removing the socket options in smb.conf altogether, but, to no avail. Anyone else out there running samba on Omni and getting faster throughput? Anyone have any ideas of how I could get more throughput with samba? Thanks, Michael -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From danmcd at omniti.com Tue May 19 17:49:58 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 19 May 2015 13:49:58 -0400 Subject: [OmniOS-discuss] Samba Performance In-Reply-To: <46496B91-013E-4940-BECB-B167D979509E@lji.org> References: <46496B91-013E-4940-BECB-B167D979509E@lji.org> Message-ID: > On May 19, 2015, at 1:36 PM, Michael Talbott wrote: > > And then in Omni, I've set these ip properties > root at store3:# ipadm show-prop > PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE > tcp max_buf rw 16777216 16777216 1048576 8192-1073741824 > tcp recv_buf rw 16777216 16777216 128000 2048-16777216 > tcp send_buf rw 16777216 16777216 49152 4096-16777216 > > I just can't get samba on Omni to go any faster than 60MB/s. I've tried adjusting those buffers, removing the socket options in smb.conf altogether, but, to no avail. I'd lower recv/send from 16MB down to 1MB unless you have a VERY HIGH DELAY network. You just aren't buying much beyond 1MB. I've heard Samba itself is the source of most of these problems. As for the built-in smb sharing... there are improvements already starting to be upstreamed in illumos-gate (and are in the OmniOS bloody release), but it may not solve all of your problems that Samba will solve. I'd suggest asking the illumos list your SMB questions as well -- maybe one of the Nexentians will be able to point the way toward what's coming. Dan From doug at will.to Tue May 19 18:41:23 2015 From: doug at will.to (Doug Hughes) Date: Tue, 19 May 2015 14:41:23 -0400 Subject: [OmniOS-discuss] Samba Performance In-Reply-To: References: <46496B91-013E-4940-BECB-B167D979509E@lji.org> Message-ID: The equivalent TCP raw tunings for Solaris based OS's ndd -set /dev/tcp tcp_xmit_hiwat 1048576 ndd -set /dev/tcp tcp_recv_hiwat 1048576 ndd -set /dev/tcp tcp_max_buf 4194304 Those are the raw tunables and if you run get on those you'll see that they are different than what's in ipadm. 
One is just buffers, but these are the sliding window parameters.

How big is your latency? Agreed that 1MB seems like plenty even for an east/west coast high-bandwidth WAN. You certainly wouldn't need to go above 4MB for that.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From danmcd at omniti.com Tue May 19 18:58:15 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 19 May 2015 14:58:15 -0400 Subject: [OmniOS-discuss] Samba Performance In-Reply-To: References: <46496B91-013E-4940-BECB-B167D979509E@lji.org> Message-ID: <863E2F65-B57C-46B2-8DC1-FFCBA9F2A723@omniti.com> > On May 19, 2015, at 2:41 PM, Doug Hughes wrote: > > > ndd -set /dev/tcp tcp_xmit_hiwat 1048576 > ndd -set /dev/tcp tcp_recv_hiwat 1048576 > ndd -set /dev/tcp tcp_max_buf 4194304 Umm... not as much now. The ipadm(1M) Michael showed is the moral equivalent, and better supported. Dan From doug at will.to Tue May 19 19:11:09 2015 From: doug at will.to (Doug Hughes) Date: Tue, 19 May 2015 15:11:09 -0400 Subject: [OmniOS-discuss] Samba Performance In-Reply-To: <863E2F65-B57C-46B2-8DC1-FFCBA9F2A723@omniti.com> References: <46496B91-013E-4940-BECB-B167D979509E@lji.org> <863E2F65-B57C-46B2-8DC1-FFCBA9F2A723@omniti.com> Message-ID: Oops, when I was comparing numbers I didn't take the left column into account. I didn't realize it was enumerated by protocol, and being in 'screen', I didn't see it correctly in scrollback. mea culpa. On Tue, May 19, 2015 at 2:58 PM, Dan McDonald wrote: > > > On May 19, 2015, at 2:41 PM, Doug Hughes wrote: > > > > > > ndd -set /dev/tcp tcp_xmit_hiwat 1048576 > > ndd -set /dev/tcp tcp_recv_hiwat 1048576 > > ndd -set /dev/tcp tcp_max_buf 4194304 > > Umm... not as much now. The ipadm(1M) Michael showed is the moral > equivalent, and better supported. > > Dan > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danmcd at omniti.com Tue May 19 21:08:50 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 19 May 2015 17:08:50 -0400 Subject: [OmniOS-discuss] New OmniOS bloody update Message-ID: <9CA9EC60-D441-47F3-80C6-D24DDD09A563@omniti.com> Based on omnios-build commit 155193f and illumos-omnios commit c4ba593. This is a partial update, but includes the entirety of illumos-omnios, so expect a reboot. 
Remember, if you're doing full-repo transfers, use the new "-m latest" argument in pkgrecv to prevent pulling old packages over.

Since last time:

- KVM has been updated per the r151014 update --> it's now up to date modulo the removal of VND stuff. Once VND upstreams, bloody will be the first to see a KVM fully synced with upstream.

- The rest of the changes are in illumos, pulled down from upstream.

- Various bugfixes upstreamed from Delphix in mdb & zfs, and Joyent in other areas.

- Flow control is now in the NFS server, which prevents starvation when the network outperforms the disks. (This will be backported to r151014.)

- zpool import speedup. This should improve boot times on systems with many ZFS filesystems.

- More SMB bugfixes upstreamed from Nexenta.

Happy updating!
Dan

From danmcd at omniti.com Tue May 19 21:10:14 2015
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 19 May 2015 17:10:14 -0400
Subject: [OmniOS-discuss] New OmniOS bloody update
In-Reply-To:
References:
Message-ID: <4FC32FBB-A59C-4A25-B5BC-F8B6E74AECC4@omniti.com>

> On May 6, 2015, at 10:13 AM, Dan McDonald wrote:
>
> Based on omnios-build commit 69a5016 and illumos-omnios commit 385735e.

Shoot. The packages aren't out yet and I hit Send early. Please wait about 30-60 minutes before upgrading. Otherwise you'll only see the small changes outside illumos-omnios.

Sorry,
Dan

From tim at multitalents.net Tue May 19 23:02:00 2015
From: tim at multitalents.net (Tim Rice)
Date: Tue, 19 May 2015 16:02:00 -0700 (PDT)
Subject: [OmniOS-discuss] Updating r151006 to r151014
Message-ID:

Last weekend I updated my r151006 VMs to r151014, one of them with a zone. The notes at http://omnios.omniti.com/wiki.php/Upgrade_to_r151014 were quite good. One piece not mentioned (although obvious when you think about it) was that for those of us who froze at r151006, it is necessary to unfreeze to upgrade.
In zone and global zone,

# pkg unfreeze entire@11-0.151006 \
    consolidation/osnet/osnet-incorporation@0.5.11-0.151006 \
    incorporation/jeos/illumos-gate@11-0.151006 \
    incorporation/jeos/omnios-userland@11-0.151006

Since one of the VMs is the storage server on my all-in-one box it had smartmontools loaded. I had to remove smartmontools for the update to work and there is no smartmontools for r151014. :-( I hope these notes save someone some time. -- Tim Rice Multitalents tim at multitalents.net From danmcd at omniti.com Wed May 20 02:40:20 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 19 May 2015 22:40:20 -0400 Subject: [OmniOS-discuss] Updating r151006 to r151014 In-Reply-To: References: Message-ID: <38596CDA-5C4C-435D-8AC6-B70883B06F76@omniti.com> > On May 19, 2015, at 7:02 PM, Tim Rice wrote: > > > Since one of the VMs is the storage server on my all-in-one box it > had smartmontools loaded. I had to remove smartmontools for the > update to work and there is no smartmontools for r151014. :-( Which publisher provides smartmontools? If its ms.omniti.com, it'll be up to our internal staff to update that package. Remember that ms.omniti.com is not supported, it is a convenience offering of the tools our ops people use internally. Dan From tim at multitalents.net Wed May 20 04:54:02 2015 From: tim at multitalents.net (Tim Rice) Date: Tue, 19 May 2015 21:54:02 -0700 (PDT) Subject: [OmniOS-discuss] Updating r151006 to r151014 In-Reply-To: <38596CDA-5C4C-435D-8AC6-B70883B06F76@omniti.com> References: <38596CDA-5C4C-435D-8AC6-B70883B06F76@omniti.com> Message-ID: On Tue, 19 May 2015, Dan McDonald wrote: | | > On May 19, 2015, at 7:02 PM, Tim Rice wrote: | > | > | > Since one of the VMs is the storage server on my all-in-one box it | > had smartmontools loaded. I had to remove smartmontools for the | > update to work and there is no smartmontools for r151014. :-( | | Which publisher provides smartmontools?
If its ms.omniti.com, it'll be | up to our internal staff to update that package. Remember that | ms.omniti.com is not supported, it is a convenience offering of the | tools our ops people use internally. Yes it was ms.omniti.com. I fully understand that they are not supported. Thanks for providing it for 006. While I would have liked to have had it available for 014 and save me some time, I can roll my own. Thank You to all the people at omniti for all that they do provide. | | Dan | -- Tim Rice Multitalents tim at multitalents.net From jimklimov at cos.ru Wed May 20 05:32:12 2015 From: jimklimov at cos.ru (Jim Klimov) Date: Wed, 20 May 2015 07:32:12 +0200 Subject: [OmniOS-discuss] Updating r151006 to r151014 In-Reply-To: References: <38596CDA-5C4C-435D-8AC6-B70883B06F76@omniti.com> Message-ID: 20 ??? 2015??. 6:54:02 CEST, Tim Rice ?????: >On Tue, 19 May 2015, Dan McDonald wrote: > >| >| > On May 19, 2015, at 7:02 PM, Tim Rice wrote: >| > >| > >| > Since one of the VMs is the storage server on my all-in-one box it >| > had smartmontools loaded. I had to remove smartmontools for the >| > update to work and there is no smartmontools for r151014. :-( >| >| Which publisher provides smartmontools? If its ms.omniti.com, it'll >be >| up to our internal staff to update that package. Remember that >| ms.omniti.com is not supported, it is a convenience offering of the >| tools our ops people use internally. > >Yes it was ms.omniti.com. I fully understand that they are not >supported. >Thanks for providing it for 006. While I would have liked to have had >it available for 014 and save me some time, I can roll my own. > >Thank You to all the people at omniti for all that they do provide. > >| >| Dan >| You can also give a shot to pkgsrc, it is fairly easy to bootstrap and install, though they are usually updated only quarterly and an upgrade seems to require changing the repo used by your system. This is tricky to script for hands-off management. 
And for the past 2 upgrades (though disruptive with new packaging features) I was better off removing it all, bootstrapping and reinstalling what I remembered as needed (e.g. top or vnc+twm to occasionally head my virtualboxes) to get out of incompatible package version conflicts that I managed to get otherwise. HTH, Jim -- Typos courtesy of K-9 Mail on my Samsung Android From vab at bb-c.de Wed May 20 07:53:32 2015 From: vab at bb-c.de (Volker A. Brandt) Date: Wed, 20 May 2015 09:53:32 +0200 Subject: [OmniOS-discuss] Updating r151006 to r151014 In-Reply-To: References: <38596CDA-5C4C-435D-8AC6-B70883B06F76@omniti.com> Message-ID: <21852.15740.375385.805897@glaurung.bb-c.de> > | Which publisher provides smartmontools? If its ms.omniti.com, > it'll be | up to our internal staff to update that package. > Remember that | ms.omniti.com is not supported, it is a convenience > offering of the | tools our ops people use internally. > > Yes it was ms.omniti.com. I fully understand that they are not > supported. Thanks for providing it for 006. While I would have > liked to have had it available for 014 and save me some time, I can > roll my own. I am using smartmontools from ms.o.c under 151014 with no problems. The package version is: pkg://ms.omniti.com/omniti/system/storage/smartmontools@6.0-0.151004:20130113T222019Z However, I installed it when the box was at 151010. I since upgraded to 151012 and then 151014. Maybe you can reinstall smartmontools using some IPS trickery? Have you tried? Having said that, I do know that some of the pkgs on ms.o.c have dependency problems. Like you said I am thankful that there is a lot of usable stuff there so I am not complaining. :-) Regards -- Volker -- ------------------------------------------------------------------------ Volker A.
Brandt Consulting and Support for Oracle Solaris Brandt & Brandt Computer GmbH WWW: http://www.bb-c.de/ Am Wiesenpfad 6, 53340 Meckenheim, GERMANY Email: vab at bb-c.de Handelsregister: Amtsgericht Bonn, HRB 10513 Schuhgröße: 46 Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt "When logic and proportion have fallen sloppy dead" From danmcd at omniti.com Wed May 20 12:44:57 2015 From: danmcd at omniti.com (Dan McDonald) Date: Wed, 20 May 2015 08:44:57 -0400 Subject: [OmniOS-discuss] Logjam & IKE Message-ID: Security researchers published this recently: https://weakdh.org/ This note (which should be forwarded to other illumos interest lists) briefly discusses how logjam affects the closed-source in.iked. IKE can use one of many Diffie-Hellman groups both for establishing IKE's own security, and ALSO optionally for generating IPsec keying material. The former is specified by the "oakley_group", and the latter by the "p2_pfs" keyword. Now the ike.config(4) man page was recently updated to reflect the full range of available choices. I did discover (and sorry Eric for not catching this in code review) that p2_pfs accepts the same choices as the now-updated oakley_group parameter does. They follow, with markings around which ones I'd deprecate, and which ones I have naive questions about, were in.iked & libike.so open-source: oakley_group number The Oakley Diffie-Hellman group used for IKE SA key derivation. The group numbers are defined in RFC 2409, Appendix A, RFC 3526, and RFC 5114, section 3.2.
Acceptable values are currently:

  1 (MODP 768-bit) ****** DO NOT USE ******
  2 (MODP 1024-bit) ****** DO NOT USE ******
  3 (EC2N 155-bit) ****** NOT SURE ******
  4 (EC2N 185-bit) ****** NOT SURE ******
  5 (MODP 1536-bit)
  14 (MODP 2048-bit)
  15 (MODP 3072-bit)
  16 (MODP 4096-bit)
  17 (MODP 6144-bit)
  18 (MODP 8192-bit)
  19 (ECP 256-bit)
  20 (ECP 384-bit)
  21 (ECP 521-bit)
  22 (MODP 1024-bit, with 160-bit Prime Order Subgroup) ***** NOT SURE, but more sure than 1-4 *****
  23 (MODP 2048-bit, with 224-bit Prime Order Subgroup)
  24 (MODP 2048-bit, with 256-bit Prime Order Subgroup)
  25 (ECP 192-bit)
  26 (ECP 224-bit)

I don't think anyone in the audience who uses IPsec & IKE uses groups 1-4 anymore anyway (people who remember punchin from Sun should know I never/rarely accepted anything less than group 5). IF you happen to be using Oakley groups 1-4, STOP. Had I access to the source, I'd compile these right out and set a flag day. BTW, if you are using or providing SSL services, I'd highly recommend configuring them to avoid the weak DH groups mentioned in the URL above as well. Thanks, Dan McDonald -- OmniOS Engineering p.s. I'm travelling today, so I won't be replying to mail until tonight at the earliest. From danmcd at omniti.com Wed May 20 17:03:45 2015 From: danmcd at omniti.com (Dan McDonald) Date: Wed, 20 May 2015 13:03:45 -0400 Subject: [OmniOS-discuss] [discuss] Logjam & IKE In-Reply-To: References: Message-ID: > On May 20, 2015, at 10:06 AM, Jonathan Adams wrote: > > Thanks for the heads up ... we have quite a few IKE/ipsec connections, although static ip addresses are used. They've been in use since forever ... > > fortunately we use 5 for all the connections. You should really move up to 2048-bit MODP or use one of the 256-or-higher ECC groups. Do you have legacy reasons not to?
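
As a rough illustration of the advice above, here is a minimal Python sketch that sorts IKE Diffie-Hellman groups by the thresholds mentioned in this thread (stop using groups 1-4; prefer 2048-bit or larger MODP, or 256-bit or larger ECP). The table and the function name `classify_oakley_group` are assumptions made for this example; the group sizes are copied from the list above, and nothing here talks to in.iked or reads an ike.config file.

```python
# Group number -> (family, size in bits), copied from the ike.config(4)
# list quoted in this thread. The table itself is illustrative only.
OAKLEY_GROUPS = {
    1: ("MODP", 768), 2: ("MODP", 1024), 3: ("EC2N", 155), 4: ("EC2N", 185),
    5: ("MODP", 1536), 14: ("MODP", 2048), 15: ("MODP", 3072),
    16: ("MODP", 4096), 17: ("MODP", 6144), 18: ("MODP", 8192),
    19: ("ECP", 256), 20: ("ECP", 384), 21: ("ECP", 521),
    22: ("MODP", 1024), 23: ("MODP", 2048), 24: ("MODP", 2048),
    25: ("ECP", 192), 26: ("ECP", 224),
}


def classify_oakley_group(group):
    """Return 'avoid', 'upgrade', or 'ok' for an oakley_group / p2_pfs value.

    Hypothetical helper: the cut-offs follow the wording of the thread,
    not any official policy.
    """
    family, bits = OAKLEY_GROUPS[group]
    if group in (1, 2, 3, 4):
        return "avoid"    # the thread says to stop using groups 1-4
    if family == "MODP" and bits >= 2048:
        return "ok"
    if family == "ECP" and bits >= 256:
        return "ok"
    return "upgrade"      # e.g. group 5 (1536-bit MODP) or the smaller ECP groups


if __name__ == "__main__":
    for number in sorted(OAKLEY_GROUPS):
        print(number, classify_oakley_group(number))
```

Under these assumptions group 5 comes out as "upgrade" rather than "avoid", which matches the tone of the reply above: not broken outright, but worth moving off when there is no legacy reason to stay.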
Dan From mtalbott at lji.org Wed May 20 23:51:16 2015 From: mtalbott at lji.org (Michael Talbott) Date: Wed, 20 May 2015 16:51:16 -0700 Subject: [OmniOS-discuss] Backing up HUGE zfs volumes Message-ID: I'm trying to find ways of efficiently archiving up some huge (120TB and growing) zfs volumes with millions maybe billions of files of all sizes. I use zfs send/recv for replication to another box for tier 1/2 recovery. But, I'm trying to find a good open source solution that runs on Omni for archival purposes that doesn't have to crawl the filesystem or rely on any proprietary formats. I was thinking I could use zfs diff to get a list of changed data, parse that into a usable format, create a tar and par of the data, and an accompanying plain text index file. From there, upload that set of data to a cloud provider. While I could probably script it all out myself to accomplish this, I'm hoping someone knows of an existing solution that can produce somewhat similar results. Ideas anyone? Thanks, Michael From chip at innovates.com Thu May 21 12:24:56 2015 From: chip at innovates.com (Schweiss, Chip) Date: Thu, 21 May 2015 07:24:56 -0500 Subject: [OmniOS-discuss] Backing up HUGE zfs volumes In-Reply-To: References: Message-ID: I would caution against anything using 'zfs diff'. It has been perpetually broken, either not working at all, or returning incomplete information. Avoiding crawling the directory is pretty much impossible unless you use 'zfs send'. However, as long as there is enough cache on the system, directory crawls can be very efficient. I have daily rsync jobs that crawl over 200 million files. The impact of the crawl is not noticeable to other users. I have also used ZFS send to AWS Glacier. This worked well until the data change rate got high enough that I needed to start over too often to keep the storage size reasonable on Glacier. I also use CrashPlan on my home OmniOS server to back up about 5TB. It works very nicely.
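
For reference, the "parse zfs diff into a usable format" step from the original post could be sketched roughly as below. This is only a sketch, assuming the tab-separated output of `zfs diff -H` (change type, path, and a second path for renames); the function name `parse_zfs_diff` and the sample paths are made up for illustration. Given the caution above about `zfs diff` returning incomplete information, any list produced this way would need to be checked before being trusted for archival.

```python
def parse_zfs_diff(text):
    """Split `zfs diff -H` style output into (to_archive, removed) path lists.

    Assumed input: one change per line, fields separated by tabs:
        M <path>             modified
        + <path>             created
        - <path>             removed
        R <old path> <new>   renamed
    Escaped characters in paths are not decoded here.
    """
    to_archive, removed = [], []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        change, path = fields[0], fields[1]
        if change == "-":
            removed.append(path)
        elif change == "R":
            # A rename retires the old name and creates the new one.
            removed.append(path)
            to_archive.append(fields[2])
        elif change in ("+", "M"):
            to_archive.append(path)
    return to_archive, removed


if __name__ == "__main__":
    # Hypothetical sample, not real output from any pool in this thread.
    sample = (
        "M\t/tank/data/a.txt\n"
        "+\t/tank/data/b.txt\n"
        "-\t/tank/data/c.txt\n"
        "R\t/tank/data/d.txt\t/tank/data/e.txt\n"
    )
    changed, gone = parse_zfs_diff(sample)
    print("archive:", changed)
    print("removed:", gone)
```

The `to_archive` list is what would be handed to tar, and the two lists together could serve as the plain text index mentioned in the original post.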
-Chip On Wed, May 20, 2015 at 6:51 PM, Michael Talbott wrote: > I'm trying to find ways of efficiently archiving up some huge (120TB and > growing) zfs volumes with millions maybe billions of files of all sizes. I > use zfs send/recv for replication to another box for tier 1/2 recovery. > But, I'm trying to find a good open source solution that runs on Omni for > archival purposes that doesn't have to crawl the filesystem or rely on any > proprietary formats. > > I was thinking I could use zfs diff to get a list of changed data, parse > that into a usable format, create a tar and par of the data, and an > accompanying plain text index file. From there, upload that set of data to > a cloud provider. While I could probably script it all out myself to > accomplish this, I'm hoping someone knows of an existing solution that can > produce somewhat similar results. > > Ideas anyone? > > Thanks, > > Michael > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.elling at richardelling.com Thu May 21 23:58:28 2015 From: richard.elling at richardelling.com (Richard Elling) Date: Thu, 21 May 2015 16:58:28 -0700 Subject: [OmniOS-discuss] disk failure causing reboot? In-Reply-To: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com> References: <136C13E89D22BB468B2A7025993639732F52704D@EXMCCMB.molalla.com> Message-ID: > On May 18, 2015, at 11:25 AM, Jeff Stockett wrote: > > A drive failed in one of our supermicro 5048R-E1CR36L servers running omnios r151012 last night, and somewhat unexpectedly, the whole system seems to have panicked. > > May 18 04:43:08 zfs01 scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,2f02 at 1/pci15d9,808 at 0 (mpt_sas0): > May 18 04:43:08 zfs01 Log info 0x31140000 received for target 29 w50000c0f01f1bf06. 
> May 18 04:43:08 zfs01 scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc [forward reference] > May 18 04:44:36 zfs01 genunix: [ID 843051 kern.info] NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major > May 18 04:44:36 zfs01 unix: [ID 836849 kern.notice] > May 18 04:44:36 zfs01 ^Mpanic[cpu0]/thread=ffffff00f3ecbc40: > May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' appears to be hung. > May 18 04:44:36 zfs01 unix: [ID 100000 kern.notice] > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecba20 zfs:vdev_deadman+10b () Bugs notwithstanding, the ZFS deadman timer occurs when a ZFS I/O does not complete in 10,000 seconds (by default). The problem likely lies below ZFS. For this reason, the deadman timer was invented -- don't blame ZFS for a problem below ZFS. > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecba70 zfs:vdev_deadman+4a () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbac0 zfs:vdev_deadman+4a () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbaf0 zfs:spa_deadman+ad () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbb90 genunix:cyclic_softint+fd () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbba0 unix:cbe_low_level+14 () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbbf0 unix:av_dispatch_softvect+78 () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3ecbc20 apix:apix_dispatch_softint+35 () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05990 unix:switch_sp_and_call+13 () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e059e0 apix:apix_do_softint+6c () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05a40 apix:apix_do_interrupt+34a () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05a50 unix:cmnint+ba () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] 
ffffff00f3e05bc0 unix:acpi_cpu_cstate+11b () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05bf0 unix:cpu_acpi_idle+8d () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05c00 unix:cpu_idle_adaptive+13 () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05c20 unix:idle+a7 () > May 18 04:44:36 zfs01 genunix: [ID 655072 kern.notice] ffffff00f3e05c30 unix:thread_start+8 () > May 18 04:44:36 zfs01 unix: [ID 100000 kern.notice] > May 18 04:44:36 zfs01 genunix: [ID 672855 kern.notice] syncing file systems... > May 18 04:44:38 zfs01 genunix: [ID 904073 kern.notice] done > May 18 04:44:39 zfs01 genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel > May 18 04:44:39 zfs01 ahci: [ID 405573 kern.info] NOTICE: ahci0: ahci_tran_reset_dport port 1 reset port > May 18 05:17:56 zfs01 genunix: [ID 100000 kern.notice] > May 18 05:17:56 zfs01 genunix: [ID 665016 kern.notice] ^M100% done: 8607621 pages dumped, > May 18 05:17:56 zfs01 genunix: [ID 851671 kern.notice] dump succeeded > > The disks are all 4TB WD40001FYYG enterprise SAS drives. > I've had such bad luck with that model, IMNSHO I recommend replacing with anything else :-( That said, I don't think it is a root cause for this panic. To get the trail of tears, you'll need to look at the FMA ereports for the 10,000 seconds prior to the panic. fmdump has a -t option you'll find useful. The [forward reference] is the result of a SCSI reset of the target, LUN, or HBA. These occur when the sd driver has not had a reply and issues one of those types of resets *or* the device or something in the data path resets. HTH, -- richard > Googling seems to indicate it is a known problem with the way the various subsystems sometimes interact. Is there any way to fix/workaround this issue?
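
To make the fmdump suggestion above a little more concrete, here is a small Python sketch that works out the start of a 10,000-second window before the panic and prints an `fmdump` command line for it. The panic timestamp is taken from the log above (the year 2015 comes from the message date); the helper name `fmdump_window` and the choice of the `mm/dd/yy hh:mm:ss` time format are assumptions for this example, so check fmdump(1M) on your release before relying on it.

```python
from datetime import datetime, timedelta


def fmdump_window(panic_time, seconds_before=10000):
    """Build an fmdump command line covering the period before a panic.

    Hypothetical helper: it only formats a string and does not run fmdump.
    """
    start = panic_time - timedelta(seconds=seconds_before)
    fmt = "%m/%d/%y %H:%M:%S"
    # -e selects the error log (ereports); -t and -T bound the time range.
    return 'fmdump -e -t "{}" -T "{}"'.format(
        start.strftime(fmt), panic_time.strftime(fmt)
    )


if __name__ == "__main__":
    # Panic time as logged: May 18 04:44:36 (2015).
    print(fmdump_window(datetime(2015, 5, 18, 4, 44, 36)))
```

Adding -V to the resulting command should give the verbose form of each ereport, which is usually what is needed to see which device or path was misbehaving.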
> _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From guorong.koh at gmail.com Fri May 22 01:12:03 2015 From: guorong.koh at gmail.com (Guo-Rong Koh) Date: Fri, 22 May 2015 10:42:03 +0930 Subject: [OmniOS-discuss] Crashplan alternatives? Message-ID: <1432257123.13727.16.camel@gmail.com> Hello everyone, I know others here are running Crashplan for Solaris on OmniOS. However, given the (not so recent) retirement announcement: https://helpdesk.code42.com/entries/53070937-Solaris-Platform -Retirement-Announcement I'm seeking some discussion and advice on possibilities. My original strategy for supporting Linux and Windows clients in a home server environment is starting to disintegrate. Due to this issue: http://support.code42.com/CrashPlan/Latest/Troubleshooting/Computer-To -Computer_Backups_Between_CrashPlan_App_4.2.0_And_Earlier_Versions_Cont inuously_Synchronize my Linux client is now no longer backing up the way it used to (OmniOS server is on Crashplan 3.7, Linux client automatically upgraded to 4.2.0). Than kfully, the Windows clients seem to be OK for now. Eventually however, I expect the whole solution to fail when Code42 EOL all Solaris support. My current options are: 1. Migrate Crashplan to a Linux KVM instance - this seems like the least effort for now 2. Find an alternate, multiplatform solution - thus far I have found nothing suitable Do others here have a migration plan? regards, Guo-Rong -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matej at zunaj.si Fri May 22 09:50:58 2015 From: matej at zunaj.si (Matej Zerovnik) Date: Fri, 22 May 2015 11:50:58 +0200 Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes In-Reply-To: <55518BFF.6080608@zunaj.si> References: <55487539.6030408@zunaj.si> <201505051648.t45GmpA4025308@lists-il.int.omniti.net> <40C78E86-F32D-4588-AF98-EB9820019960@richardelling.com> <55518BFF.6080608@zunaj.si> Message-ID: <555EFC02.5070802@zunaj.si> After having troubles almost every week and missing the time frame to catch the bastard, today I finally had the opportunity to catch it in action:) As it turns out, it looks like a ZFS(not likely) or HW(probably) problem. When in "hangup" state, iscsi and network worked flawlessly and I was able to connect to iSCSI(but mounting the FS and issuing commands(show lvm volume,..) worked really slow). I was also able to work on the server, so it wasn't locked up. Then I decided to check the ZFS FS. I tried to create a file in ZFS mount directory by issuing 'touch test-file' and command froze. I tried to kill it with CTRL+C to no success. I tried to kill the process with kill -9, but that did not help either. Looking at iostat output, there was some reading happening, but absolutely no writes (0, nada). I used 'lsiutils' to connect to my LSI HBA and issued port reset, following a hard SAS link reset in a hope it will come back, but it was still frozen. I also checked 'phy counters' in lsiutils, and there were some devices with errors, but that could be due to port / link reset. Long story short, after 30min, everything returned to normal, without an errors message in logs or anywhere else. Bad thing is, iSCSI target froze a few minutes later and only way to resolve the trouble was to restart the server:( Matej On 12. 05. 2015 07:13, Matej Zerovnik wrote: > I know building a single 50 drives RaidZ2 is a bad idea. As I said, > it's a legacy that I can't easily change. 
I already have a backup pool > with 7x10 drives RaidZ2 to which I hope I will be able to switch this > week. I hope to get some better results and less crashing... > > What is interesting is that when the 'event' happens, server works > normaly, ZFS is accessable and writable(at least, there is no errors > in log files), only iscsi reports errors and drops the connection. > Another interesting thing is that after the 'event', all write stops, > only read continues for another 30min. After 30min all traffic stops > for half an hour. After that, everything starts to coming back up... > Weird?! > > Matej > > On 09. 05. 2015 02:49, Richard Elling wrote: >> >>> On May 5, 2015, at 9:48 AM, Matej Zerovnik >> > wrote: >>> >>> I will replace the hardwarw in about 4 months with all SAS drives, >>> but I would love to have a working setup for the time being as well;) >>> >>> I looked at smart stats and there doesnt seem to be any errors. >>> Also, no hard/soft/transfer error reported by any drive. Will take a >>> look at service time tomorrow, maybe put the drives to graphite and >>> look at them over a longer period. >>> >>> I looked at iostat -x status today and stats for pool itself >>> reported 100% busy most of the time, 98-100% wait, 500-1300 >>> transactions in queue, around 500 active,... First line, that is >>> average from boot, says avg service time.is around >>> 1600ms which seems like aaaalot. Can it be due to really big queue? >>> >>> Would it help to create 5 10drives raidz pools instead of one with >>> 50 drives? >> >> It is a bad idea to build a single raidz set with 50 drives. Very >> bad. Hence the zpool >> man page says, "The recommended number is between 3 and 9 to help >> increase performance." >> But this recommendation applies to reliability, too. 
>> -- richard >> > > > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard at netbsd.org Fri May 22 15:47:45 2015 From: richard at netbsd.org (Richard PALO) Date: Fri, 22 May 2015 17:47:45 +0200 Subject: [OmniOS-discuss] [developer] FLAG DAY - 4719 affects nightly, package, and poold In-Reply-To: <1BAFD21A-EF91-4C6E-8A2A-4D2AB691574E@omniti.com> References: <1BAFD21A-EF91-4C6E-8A2A-4D2AB691574E@omniti.com> Message-ID: <555F4FA1.1000205@netbsd.org> Le 05/05/15 19:32, Dan McDonald a ?crit : > Illumos #4719 introduces a flag day for people who build illumos-gate. > Starting now, you will need a Java Developers Kit (JDK) 7 or later. > OpenIndiana 151a9 does NOT have this by default. Builders must either set > JAVA_ROOT to a source of JDK7, or must have /usr/java populated with JDK7. > > Users still on JDK6 will see build errors in the packaging portions like > such: Kind reminder about the build-time/run-time issue for poold (https://www.illumos.org/issues/5851) with its latest incantation: https://www.illumos.org/rb/r/34/ -- Richard PALO From Josh.Barton at usurf.usu.edu Fri May 22 16:03:40 2015 From: Josh.Barton at usurf.usu.edu (Josh Barton) Date: Fri, 22 May 2015 16:03:40 +0000 Subject: [OmniOS-discuss] HP Proliant Gen9 Message-ID: <9595d4e2ca4b4569ac0d51ffc4c061b9@Perses.usurf.usu.edu> I have been trying to install OmniOS on a HP Proliant Gen9 server (r151014) but it will only boot in Legacy boot mode. R151012 will boot but no disks are found when I try to install. Has anyone experienced these issues? R151012 worked with our Proliant Gen8, is this a driver issue or something else? Thanks! Josh -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From johan.kragsterman at capvert.se Fri May 22 16:48:08 2015 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Fri, 22 May 2015 18:48:08 +0200 Subject: [OmniOS-discuss] Ang: HP Proliant Gen9 In-Reply-To: <9595d4e2ca4b4569ac0d51ffc4c061b9@Perses.usurf.usu.edu> References: <9595d4e2ca4b4569ac0d51ffc4c061b9@Perses.usurf.usu.edu> Message-ID: Hi! -----"OmniOS-discuss" wrote: ----- To: "omnios-discuss at lists.omniti.com" From: Josh Barton Sent by: "OmniOS-discuss" Date: 2015-05-22 18:04 Subject: [OmniOS-discuss] HP Proliant Gen9 I have been trying to install OmniOS on a HP Proliant Gen9 server (r151014) but it will only boot in Legacy boot mode. R151012 will boot but no disks are found when I try to install. Has anyone experienced these issues? R151012 worked with our Proliant Gen8, is this a driver issue or something else? What type of controller do you use? Do you use the HP provided raid (what is it, P430i...?), or do you use something else? Rgrds Johan Thanks! Josh _______________________________________________ OmniOS-discuss mailing list OmniOS-discuss at lists.omniti.com http://lists.omniti.com/mailman/listinfo/omnios-discuss From danmcd at omniti.com Fri May 22 17:39:34 2015 From: danmcd at omniti.com (Dan McDonald) Date: Fri, 22 May 2015 13:39:34 -0400 Subject: [OmniOS-discuss] HP Proliant Gen9 In-Reply-To: <9595d4e2ca4b4569ac0d51ffc4c061b9@Perses.usurf.usu.edu> References: <9595d4e2ca4b4569ac0d51ffc4c061b9@Perses.usurf.usu.edu> Message-ID: <3C4DCF0D-769D-498F-9FC9-55D0D187A844@omniti.com> > On May 22, 2015, at 12:03 PM, Josh Barton wrote: > > I have been trying to install OmniOS on a HP Proliant Gen9 server (r151014) but it will only boot in Legacy boot mode. R151012 will boot but no disks are found when I try to install. Has anyone experienced these issues? R151012 worked with our Proliant Gen8, is this a driver issue or something else? 014 should also work on your Gen8.
I don't know enough about the HW characteristics of Gen9 to tell you what exactly is wrong. Is "Legacy boot mode" using BIOS as opposed to EFI? OmniOS doesn't support EFI boot, just BIOS. Dan From danmcd at omniti.com Fri May 22 17:57:14 2015 From: danmcd at omniti.com (Dan McDonald) Date: Fri, 22 May 2015 13:57:14 -0400 Subject: [OmniOS-discuss] HP Proliant Gen9 In-Reply-To: References: <9595d4e2ca4b4569ac0d51ffc4c061b9@Perses.usurf.usu.edu> <3C4DCF0D-769D-498F-9FC9-55D0D187A844@omniti.com> Message-ID: <4F470FC2-2D9C-4C64-AF3A-8027C4113E7F@omniti.com> Keeping this on the list so people know. > On May 22, 2015, at 1:52 PM, Josh Barton wrote: > > Legacy Boot is just BIOS. So make sure you use that. > I am using: HP Smart Array P440ar Controller ? vendor: 103c ("Hewlett-Packard Company"), device: 3239 ("Smart Array Gen9 Controllers"), subvendor: 103c, subdevice: 21c0 ("P440ar") That entry isn't in /etc/driver_aliases for OmniOS. It is *possible* that the cpqary3 driver will work on this, but we would need to test it. Can you get to the shell from the r151014 install media? I can't remember if "lspci" is there, but if it isn't, "prtconf -d" output might be useful to share. Dan From danmcd at omniti.com Fri May 22 21:05:07 2015 From: danmcd at omniti.com (Dan McDonald) Date: Fri, 22 May 2015 17:05:07 -0400 Subject: [OmniOS-discuss] Kayak post illumos 5896-5897 Message-ID: A recent illumos bugfix broke the Kayak build (only on bloody for now). Kayak assumes that svccfg in an alternate root only requires a background svc.configd. It ALSO requires a background svc.startd. I discussed this with the author of 5896-7, and he recommended that Kayak use the "-native" versions of svccfg in illumos-{gate,omnios}, because that's how the ON build populates SMF repositories as well. The following webrev: http://kebe.com/~danmcd/webrevs/kayak-svccfg/ illustrates the changes. 
I don't know how many people use kayak to BUILD images, but if you do, you should please take a look at this before the next stable release comes out. Thanks, Dan From Josh.Barton at usurf.usu.edu Fri May 22 23:09:30 2015 From: Josh.Barton at usurf.usu.edu (Josh Barton) Date: Fri, 22 May 2015 23:09:30 +0000 Subject: [OmniOS-discuss] Proliant gen9 Message-ID: An update to my previous message: I am using only Legacy BIOS boot mode now and skipping UEFI entirely. Using the changes to the grub menu found in the link below I was able to get to the install screen using the r151014 image however I still get a no disk found error. The controller is : HP Smart Array P440ar Controller I have been trying to install OmniOS on a HP Proliant Gen9 server (r151014) but it will only boot in Legacy boot mode. R151012 will boot but no disks are found when I try to install. Has anyone experienced these issues? R151012 worked with our Proliant Gen8, is this a driver issue or something else? See: http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c04633840 Thanks, Josh Barton Utah Stage University Research Foundation -------------- next part -------------- An HTML attachment was scrubbed... URL: From nsmith at careyweb.com Fri May 22 23:37:57 2015 From: nsmith at careyweb.com (Nate Smith) Date: Fri, 22 May 2015 19:37:57 -0400 Subject: [OmniOS-discuss] =?utf-8?q?Proliant_gen9?= Message-ID: <1104717796-2240@mail.careyweb.com> What is the on board storage controller? That's probably the unsupported hardware. -Nate -----Original Message----- From: Josh Barton [Josh.Barton at usurf.usu.edu] Received: Friday, 22 May 2015, 7:10PM To: omnios-discuss at lists.omniti.com [omnios-discuss at lists.omniti.com] Subject: [OmniOS-discuss] Proliant gen9 An update to my previous message: I am using only Legacy BIOS boot mode now and skipping UEFI entirely. 
Using the changes to the grub menu found in the link below I was able to get to the install screen using the r151014 image however I still get a no disk found error. The controller is : HP Smart Array P440ar Controller I have been trying to install OmniOS on a HP Proliant Gen9 server (r151014) but it will only boot in Legacy boot mode. R151012 will boot but no disks are found when I try to install. Has anyone experienced these issues? R151012 worked with our Proliant Gen8, is this a driver issue or something else? See: http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c04633840 Thanks, Josh Barton Utah Stage University Research Foundation -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmabis at vmware.com Sat May 23 08:08:31 2015 From: mmabis at vmware.com (Matthew Mabis) Date: Sat, 23 May 2015 08:08:31 +0000 Subject: [OmniOS-discuss] SMB version with NTLM authentication version. In-Reply-To: <1BA5BE7A-5BEF-4882-8A85-0EAC713E2C80@omniti.com> References: <57413703-4516@mail.careyweb.com>, <1BA5BE7A-5BEF-4882-8A85-0EAC713E2C80@omniti.com> Message-ID: <8d6f305890664c13bcf43f30fb752067@EX13-MBX-017.vmware.com> Hey all, Wondering if you could verify with me the following information about the SMB protocol within OmniOS. Is it currently using SMBv1 as a standard? I saw a blog that discussed that; doing further research, it looks like SMBv3 or v2 is not built into Omni at this time (you can tell me if I am wrong). I had an issue today with my OSX 10.10.3 build: I have been connecting to my OmniOS NAS with no issues until today, when it complained to me that I was using NTLMv1 and I had to use a workaround plist to get it working... Just wondering if this has been seen or is a known issue? Just seems to have popped up within the last month or so... Any other workarounds you might have on this would be greatly appreciated. Just wondering if this is known and if the SMB version has something to do with it?
Thanks for your time Matt Mabis From danmcd at omniti.com Tue May 26 15:49:30 2015 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 26 May 2015 11:49:30 -0400 Subject: [OmniOS-discuss] FLAG DAY for people who build Kayak images Message-ID: You may ignore this note if you do not use Kayak to create your own custom images. If you merely download the images from us, you're in good shape. This upstream illumos change: commit 2ba6d2b94a398caab9e751c277f0acbd1cc22c77 Author: Robert Mustacchi Date: Thu Apr 30 15:25:12 2015 -0700 5896 svccfg import returns before service can be used by svcadm 5897 improve comments for svc.startd Reviewed by: Jerry Jelinek Approved by: Dan McDonald introduces a flag day for users of OmniOS bloody who wish to build custom images using Kayak. The kayak build system uses svccfg(1M) and an alternate root to pre-populate disk images prior to their snapshotting and compression. Using the build machine's svccfg(1M) like this will cause a freeze in kayak. If you use native Kayak make, you merely need this update in your kayak repo: commit d8b5bbd76a85b6d54a471fc3021df27dd7b2e51e Author: Dan McDonald Date: Fri May 22 14:24:21 2015 -0400 Use PREBUILT_ILLUMOS's svccfg-native to stop lockups post-5896/5897 and a pre-built illumos-omnios whose path needs to be in the environment, or as an argument to gmake: gmake PREBUILT_ILLUMOS= I have a fix in omnios-build to export PREBUILT_ILLUMOS. If that does not work, then I will further alter omnios-builds kayak/build.sh to push PREBUILT_ILLUMOS right into gmake. Thanks, Dan From nsmith at careyweb.com Tue May 26 19:05:09 2015 From: nsmith at careyweb.com (Nate Smith) Date: Tue, 26 May 2015 15:05:09 -0400 Subject: [OmniOS-discuss] Proliant gen9 In-Reply-To: <3415d08010da42d2872b035f1c298dcf@Perses.usurf.usu.edu> References: <1104717796-2240@mail.careyweb.com> <3415d08010da42d2872b035f1c298dcf@Perses.usurf.usu.edu> Message-ID: Sorry. I missed the P440ar the first time I read through the thread. 
https://www.illumos.org/issues/5390 details this issue. It doesn't look like the patch to get this driver supported has been tested or upstreamed (someone else will have to answer). Have you tried switching to HBA mode as detailed below?

http://h20564.www2.hp.com/hpsc/doc/public/display?docId=c03909334

I know that HP lists it as supporting Solaris 11, but that doesn't mean it's OmniOS compatible:

http://h20564.www2.hp.com/hpsc/swd/public/readIndex?sp4ts.oid=7274897&swLangOid=8&swEnvOid=4167

This thread may be of some help. It looks like, at least for now, you have some bleeding-edge hardware with unknown OmniOS support.

https://forums.freenas.org/index.php?threads/hp-gen9-server-w-p840-hba-mode-no-drives-visible.28620/

You might have to get another compatible storage controller, or pay some support bucks to get it tested/integrated. The supported storage controllers for the HP Gen 9s are here:

http://www.hp.com/hpinfo/newsroom/press_kits/2014/ComputeEra/HP_SmartStorage_ProLiantGen9_DataSheet.pdf

The P441 and P840 might have better luck, but I'm not sure. Ideally, you can run with a storage controller listed in the HCL:

http://illumos.org/hcl

Hope this helps.
-Nate

From: Josh Barton [mailto:Josh.Barton at usurf.usu.edu]
Sent: Tuesday, May 26, 2015 1:11 PM
To: Nate Smith
Subject: RE: [OmniOS-discuss] Proliant gen9

All I can find for the storage controller is: HP Smart Array P440ar Controller

Thanks for taking the time to look at this.
Josh

From: Nate Smith [mailto:nsmith at careyweb.com]
Sent: Friday, May 22, 2015 5:38 PM
To: omnios-discuss at lists.omniti.com; Josh Barton
Subject: RE: [OmniOS-discuss] Proliant gen9

What is the on-board storage controller? That's probably the unsupported hardware.
-Nate

-----Original Message-----
From: Josh Barton [Josh.Barton at usurf.usu.edu]
Received: Friday, 22 May 2015, 7:10PM
To: omnios-discuss at lists.omniti.com [omnios-discuss at lists.omniti.com]
Subject: [OmniOS-discuss] Proliant gen9

An update to my previous message: I am using only Legacy BIOS boot mode now and skipping UEFI entirely.

Using the changes to the grub menu found in the link below, I was able to get to the install screen using the r151014 image; however, I still get a "no disk found" error. The controller is: HP Smart Array P440ar Controller

I have been trying to install OmniOS (r151014) on an HP Proliant Gen9 server, but it will only boot in Legacy boot mode. r151012 will boot, but no disks are found when I try to install. Has anyone experienced these issues? r151012 worked with our Proliant Gen8; is this a driver issue or something else?

See: http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c04633840

Thanks,
Josh Barton
Utah State University Research Foundation

From anon at omniti.com Tue May 26 20:18:50 2015
From: anon at omniti.com (Anon)
Date: Tue, 26 May 2015 16:18:50 -0400
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes
Message-ID:

Hi Matej,

Do you have sar running on your system? I'd recommend running it at a short interval so that you can get historical disk statistics. You can use this info to rule out whether it's the disks or not. You can also use iotop -P to get a real-time view of %IO to see if it's the disks. You can also use zpool iostat -v 1.

Also, do you have a baseline performance benchmark, and do you know if you're meeting or exceeding it? The baseline should cover random and sequential I/O; you can use bonnie++ to get this information.

Are you able to share your ZFS configuration and iSCSI configuration?
For iSCSI, can you take a look at this: http://docs.oracle.com/cd/E23824_01/html/821-1459/fpjwy.html#fsume

Do you have detailed logs for the clients experiencing the issues? If not, are you able to enable verbose logging (such as debug-level logs)?

Regards,
Anon

From matej at zunaj.si Wed May 27 06:58:16 2015
From: matej at zunaj.si (Matej Zerovnik)
Date: Wed, 27 May 2015 08:58:16 +0200
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes
In-Reply-To:
References:
Message-ID:

Hello Josten,

> On 26 May 2015, at 22:18, Anon wrote:
>
> Hi Matej,
>
> Do you have sar running on your system? I'd recommend running it at a short interval so that you can get historical disk statistics. You can use this info to rule out whether it's the disks or not. You can also use iotop -P to get a real-time view of %IO to see if it's the disks. You can also use zpool iostat -v 1.

I didn't have sar or iotop running, but I had 'iostat -xn' and 'zpool iostat -v 1' running when things stopped working, and there is nothing unusual in there. Write ops suddenly fall to 0 and that's it. Reads are still happening and, according to network traffic, there is outgoing traffic while I'm unable to write to the ZFS filesystem (even locally on the server). I created a simple text file so that the next time the system hangs I will be able to check whether the system is readable (currently I only have iSCSI volumes, so I'm unable to check that locally on the server).

> Also, do you have a baseline performance benchmark, and do you know if you're meeting or exceeding it? The baseline should cover random and sequential I/O; you can use bonnie++ to get this information.

I can say with 99.99% certainty that I'm exceeding the performance of the pool itself. It's a single raidz2 vdev with 50 hard drives and 70 connected clients. Some are idling, but 10-20 clients are pushing data to the server.
I know the zpool configuration is very bad, but that's a legacy I can't change easily. I'm already syncing data to another server with 7 vdevs, but since this server is so busy, transfers are VERY SLOW (read: zfs sync doing 10MB/s).

> Are you able to share your ZFS configuration and iSCSI configuration?

Sure! Here are the zfs settings:

zfs get all data:
NAME  PROPERTY              VALUE                  SOURCE
data  type                  filesystem             -
data  creation              Fri Oct 25 20:26 2013  -
data  used                  104T                   -
data  available             61.6T                  -
data  referenced            1.09M                  -
data  compressratio         1.08x                  -
data  mounted               yes                    -
data  quota                 none                   default
data  reservation           none                   default
data  recordsize            128K                   default
data  mountpoint            /volumes/data          received
data  sharenfs              off                    default
data  checksum              on                     default
data  compression           off                    received
data  atime                 off                    local
data  devices               on                     default
data  exec                  on                     default
data  setuid                on                     default
data  readonly              off                    local
data  zoned                 off                    default
data  snapdir               hidden                 default
data  aclmode               discard                default
data  aclinherit            restricted             default
data  canmount              on                     default
data  xattr                 on                     default
data  copies                1                      default
data  version               5                      -
data  utf8only              off                    -
data  normalization         none                   -
data  casesensitivity       sensitive              -
data  vscan                 off                    default
data  nbmand                off                    default
data  sharesmb              off                    default
data  refquota              none                   default
data  refreservation        none                   default
data  primarycache          all                    default
data  secondarycache        all                    default
data  usedbysnapshots       0                      -
data  usedbydataset         1.09M                  -
data  usedbychildren        104T                   -
data  usedbyrefreservation  0                      -
data  logbias               latency                default
data  dedup                 off                    local
data  mlslabel              none                   default
data  sync                  standard               default
data  refcompressratio      1.00x                  -
data  written               1.09M                  -
data  logicalused           98.1T                  -
data  logicalreferenced     398K                   -
data  filesystem_limit      none                   default
data  snapshot_limit        none                   default
data  filesystem_count      none                   default
data  snapshot_count        none                   default
data  redundant_metadata    all                    default
data  nms:dedup-dirty       on                     received
data  nms:description       datauporabnikov        received

I'm not sure what iSCSI configuration you want/need?
But as far as I figured out in the last 'freeze', iSCSI is not the problem, since I'm unable to write to the ZFS volume even when I'm local on the server itself.

> For iSCSI, can you take a look at this: http://docs.oracle.com/cd/E23824_01/html/821-1459/fpjwy.html#fsume

Interesting. I tried running 'iscsiadm list target' but it doesn't return anything. There is also nothing in /var/adm/messages, as usual :) But the target service is online (according to svcs), and clients are connected and passing traffic.

> Do you have detailed logs for the clients experiencing the issues? If not, are you able to enable verbose logging (such as debug-level logs)?

I have client logs, but they mostly just report losing connections and reconnecting:

Example 1:
Apr 29 10:33:53 eee kernel: connection1:0: detected conn error (1021)
Apr 29 10:33:54 eee iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
Apr 29 10:33:56 eee iscsid: connection1:0 is operational after recovery (1 attempts)
Apr 29 10:36:37 eee kernel: connection1:0: detected conn error (1021)
Apr 29 10:36:37 eee iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
Apr 29 10:36:40 eee iscsid: connection1:0 is operational after recovery (1 attempts)
Apr 29 10:36:50 eee kernel: sd 3:0:0:0: Device offlined - not ready after error recovery
Apr 29 10:36:51 eee kernel: sd 3:0:0:0: Device offlined - not ready after error recovery
Apr 29 10:36:51 eee kernel: sd 3:0:0:0: Device offlined - not ready after error recovery

Example 2:
Apr 16 08:41:40 vf kernel: connection1:0: pdu (op 0x5e itt 0x1) rejected. Reason code 0x7
Apr 16 08:43:11 vf kernel: connection1:0: pdu (op 0x5e itt 0x1) rejected. Reason code 0x7
Apr 16 08:44:13 vf kernel: connection1:0: pdu (op 0x5e itt 0x1) rejected. Reason code 0x7
Apr 16 08:45:51 vf kernel: connection1:0: detected conn error (1021)
Apr 16 08:45:51 317 iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
Apr 16 08:45:53 vf iscsid: connection1:0 is operational after recovery (1 attempts)

I'm already in contact with OmniTI regarding our new build, but in the meantime I would love for our clients to be able to use the storage, so I'm trying to resolve the current issue somehow.

Matej

From matej at zunaj.si Fri May 29 11:09:56 2015
From: matej at zunaj.si (Matej Zerovnik)
Date: Fri, 29 May 2015 13:09:56 +0200
Subject: [OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes
In-Reply-To:
References:
Message-ID: <6C56A6D6-A5BB-46DA-A10B-510F11FEF7BE@zunaj.si>

Today the server crashed again. I'm not sure whether it's because I was running SMART short self-tests, but it looks like it started around that time. I'm still running SMART tests, and it looks like there are no errors on the drives, although some tests take up to 30 minutes to finish. iostat -E also reports no errors.

When it froze, I started iostat and tried to write a file to the ZFS pool. As usual, it froze, but I left iostat running, hoping it would give me some info. After 30 or so minutes, the system became responsive again, and this is what my iostat output looks like: http://pastebin.com/W4EWgnzq

The system became responsive at 'Fri May 29 11:38:45 CEST 2015'. It's weird, to say the least. It looks like there is something in the write buffer that hogs ZFS for quite some time and gets released or times out after a certain time. But I'm not sure what it is, or what has such a long timeout. It looks like the freeze lasted for 15 minutes.
Matej

> On 28 May 2015, at 18:30, Anon wrote:
>
> Have you verified that your disks are not having any issues with smartctl and iostat -E?
>
> I'd suggest running a short test on the disks: smartctl -d sat,12 -t short /path/to/disk (note: you may need to append s2 to the physical disk name).
>
> I built a test target and iSCSI initiator and wrote 1G from /dev/zero and ended up crashing the session; are your sessions under load?

From heinz at licenser.net Sat May 30 15:58:57 2015
From: heinz at licenser.net (Heinz Nikolaus Gies)
Date: Sat, 30 May 2015 17:58:57 +0200
Subject: [OmniOS-discuss] zpool list -p
Message-ID: <1E73E845-FD72-496D-9E7C-650580B7E305@licenser.net>

I was looking at the output of zpool list today, comparing it with what I'd get on SmartOS, and noticed that when using the -p flag for parsable output the deduplication factor is still presented as a string (or float-ish type) instead of an integer value. That seems a bit wrong for parsable output.

If there is a reason behind that decision it's fine and I'll gladly work around it, but it feels like an oversight.

Cheers,
Heinz

Here is a quick glance:

OmniOS:
/usr/sbin/zpool list -pH -oname,size,alloc,free,dedup,health
data 7971459301376 6405101887488 1566357413888 1.00x ONLINE
rpool 249108103168 121560741376 127547361792 1.00x ONLINE

SmartOS:
list -pH -oname,size,alloc,free,dedup,health
zones 319975063552 51935040512 268040023040 100 ONLINE

---
Cheers,
Heinz Nikolaus Gies
heinz at licenser.net
From heinz at licenser.net Sat May 30 17:42:55 2015
From: heinz at licenser.net (Heinz Nikolaus Gies)
Date: Sat, 30 May 2015 19:42:55 +0200
Subject: [OmniOS-discuss] zpool list -p
In-Reply-To:
References: <1E73E845-FD72-496D-9E7C-650580B7E305@licenser.net>
Message-ID: <22AE6795-CD87-4FAA-B9BB-3C173012C7B7@licenser.net>

zpool upgrade -v shows the same version on both systems. I would suspect that Joyent has modified the zpool utility, but it seems like a sensible change.

---
Cheers,
Heinz Nikolaus Gies
heinz at licenser.net

> On May 30, 2015, at 19:20, Krzysztof Grzempa wrote:
>
> Did you compare ZFS versions on both OSes? This might have changed in some newer version.
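For anyone who needs to consume both output styles shown in this thread, a minimal Python sketch of a tolerant parser follows. It assumes the column order from the quoted command (name, size, alloc, free, dedup, health), and it assumes that the SmartOS value "100" is the ratio multiplied by 100, which the thread does not confirm. The function names are hypothetical.

```python
def parse_dedup(field):
    """Return the dedup ratio as a float for either '1.00x' or '100' style fields."""
    text = field.strip()
    if text.endswith("x"):
        # OmniOS-style value as quoted in the thread, e.g. "1.00x"
        return float(text[:-1])
    # SmartOS-style value as quoted in the thread, e.g. "100".
    # ASSUMPTION: this is the ratio times 100; the thread does not state the unit.
    return int(text) / 100.0


def parse_zpool_list(output):
    """Parse `zpool list -pH -oname,size,alloc,free,dedup,health` output into dicts."""
    fields = ("name", "size", "alloc", "free", "dedup", "health")
    pools = []
    for line in output.splitlines():
        parts = line.split()  # -H output is whitespace (tab) separated
        if len(parts) != len(fields):
            continue  # skip blank or unexpected lines
        row = dict(zip(fields, parts))
        for key in ("size", "alloc", "free"):
            row[key] = int(row[key])
        row["dedup"] = parse_dedup(row["dedup"])
        pools.append(row)
    return pools
```

Feeding it the captured output of the command returns one dictionary per pool, with the byte counts as integers and the dedup ratio normalized to a float regardless of which platform produced it.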
From heinz at licenser.net Sat May 30 18:55:44 2015
From: heinz at licenser.net (Heinz Nikolaus Gies)
Date: Sat, 30 May 2015 20:55:44 +0200
Subject: [OmniOS-discuss] I think I broke part of the network stack
Message-ID: <098DC8FA-29CD-4DA0-A87F-58DF4629A092@licenser.net>

Hi,

I have the feeling that I broke part of the network stack on a server of mine. The list of commands I executed is included below; right now no dladm show-* command returns output (the return code is still 0). I suspect that this was caused by adding a vnic with the same name twice, but I am not sure and don't want to reboot at this point, so that no evidence is destroyed.

Cheers,
Heinz

397 dladm show-vnic
398 dladm show-vnic -v
399 dladm show-phys
401 man dladm
402 dladm create-vnic -l bge0 net0
403 dladm show-phys
404 dladm show-vnic
405 dladm destroy-vnic -l bge0 net0
406 dladm delete-vnic -l bge0 net0
407 dladm delete-vnic net0
408 dladm create-vnic -l bge0 net0 -p zone=2398fe7c-032f-11e5-abb0-b33f9f953915
412 dladm create-vnic -l bge0 net0
416 dladm show-vnic
417 dladm create-vnic -l bge0 net0
419 dladm show-vnic
490 dladm show-vnic
491 dladm show-phys
492 dladm
493 dladm show-phys
494 dladm show-phys -v
495 dladm
496 dladm show-link
497 dladm show-link
499 dladm show-link

---
Cheers,
Heinz Nikolaus Gies
heinz at licenser.net
From heinz at licenser.net Sat May 30 18:57:49 2015
From: heinz at licenser.net (Heinz Nikolaus Gies)
Date: Sat, 30 May 2015 20:57:49 +0200
Subject: [OmniOS-discuss] I think I broke part of the network stack
In-Reply-To: <098DC8FA-29CD-4DA0-A87F-58DF4629A092@licenser.net>
References: <098DC8FA-29CD-4DA0-A87F-58DF4629A092@licenser.net>
Message-ID: <9486DC7C-75A1-45EF-A57B-9660444495FD@licenser.net>

Adding to that, the only content of the /etc/dladm/*.conf files is

bge0 class=int,1;media=int,4;phyinst=int,1;phymaj=int,120;devname=string,bge0;
net0 class=int,8;media=int,4;linkover=string,bge0;maddrtype=int,1;vrid=int,0;vraf=int,0;macaddr=string,2:8:20:c0:a7:c0;

in /etc/dladm/datalink.conf (comments excluded).

---
Cheers,
Heinz Nikolaus Gies
heinz at licenser.net

From heinz at licenser.net Sat May 30 20:55:55 2015
From: heinz at licenser.net (Heinz Nikolaus Gies)
Date: Sat, 30 May 2015 22:55:55 +0200
Subject: [OmniOS-discuss] I think I broke part of the network stack
In-Reply-To: <9486DC7C-75A1-45EF-A57B-9660444495FD@licenser.net>
References: <098DC8FA-29CD-4DA0-A87F-58DF4629A092@licenser.net> <9486DC7C-75A1-45EF-A57B-9660444495FD@licenser.net>
Message-ID: <0B916418-7689-465E-AB6E-0BC8C2F4C83F@licenser.net>

Had to restart to get the system back into working condition. That solved the issue, but probably lost the state that caused it. Sorry.

---
Cheers,
Heinz Nikolaus Gies
heinz at licenser.net
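The datalink.conf entries quoted in this thread appear to follow a "name key=type,value;..." layout. As a rough aid for inspecting such a file the next time this happens, here is a minimal Python sketch that parses that layout and reports link names that occur more than once. The layout is inferred only from the two quoted lines, the "#" comment handling is an assumption, and the function names are hypothetical; this is not the parser dladm itself uses.

```python
def parse_datalink_line(line):
    """Split one 'name key=type,value;...' entry into (name, properties)."""
    parts = line.split(None, 1)
    if len(parts) != 2:
        return None  # a bare name with no properties; not seen in the thread
    name, raw_props = parts
    props = {}
    for item in raw_props.strip().split(";"):
        if not item:
            continue  # trailing ';' leaves an empty item
        key, _, typed = item.partition("=")
        kind, _, value = typed.partition(",")
        # "int" and "string" are the only types seen in the quoted lines
        props[key] = int(value) if kind == "int" else value
    return name, props


def parse_datalink_conf(text):
    """Return (links, duplicates) for the text of a datalink.conf-style file."""
    links = {}
    duplicates = []
    for line in text.splitlines():
        line = line.strip()
        # ASSUMPTION: comment lines start with '#'
        if not line or line.startswith("#"):
            continue
        parsed = parse_datalink_line(line)
        if parsed is None:
            continue
        name, props = parsed
        if name in links:
            duplicates.append(name)  # same link name defined twice
        links[name] = props
    return links, duplicates
```

Run against the two lines quoted above, this yields entries for bge0 and net0 and an empty duplicates list, which would be consistent with the file on disk containing only one net0 entry.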