From tim at multitalents.net  Sat Mar  1 00:22:36 2014
From: tim at multitalents.net (Tim Rice)
Date: Fri, 28 Feb 2014 16:22:36 -0800 (PST)
Subject: [OmniOS-discuss] How do you configure serial ports on OmniOS?
In-Reply-To: <20140228181645.624801A0BBB@apps0.cs.toronto.edu>
References: <20140228181645.624801A0BBB@apps0.cs.toronto.edu>
Message-ID: <Pine.UW2.4.64.1402281559230.4406@server01.int.multitalents.net>

On Fri, 28 Feb 2014, Chris Siebenmann wrote:

>  This question makes me feel silly but I'm lost in a confusing maze of
> documentation for sacadm, pmadm, and so on and I can't find anything
> with web searches. What I would like to do is configure what I believe is
> /dev/term/c ('ttyS3' in Linux) to run a getty or the OmniOS/Illumos/etc
> equivalent at 115200 baud.

Seems like there should be a command to change the baud rate.
I've always just edited the _pmtab file.

disable the port (assuming pmadm -l shows it enabled)
# pmadm -d -p zsmon -s ttyc

change the baud rate to what you want
# vi /etc/saf/zsmon/_pmtab

Enable the port
# pmadm -e -p zsmon -s ttyc

Sometimes you need to restart the port monitor. (will effect all ports on zsmon)

# sacadm -k -p zsmon

# sacadm -s -p zsmon


> 
>  Relatedly, I'd also like to change the getty-equivalent that 'pmadm -l'
> at least theoretically says is talking to /dev/term/a from 9600 baud to
> 115200 baud. How to do this is also, well, not obvious to me.
> 
> (Please note that I do not want to make this the serial console, a
> procedure for which there seems to be plenty of documentation. I just
> want to be able to log in over that serial port, or it and /dev/term/a.)
> 
>  Thanks in advance.
> 
> 	- cks

-- 
Tim Rice				Multitalents
tim at multitalents.net


From cks at cs.toronto.edu  Sat Mar  1 01:39:02 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Fri, 28 Feb 2014 20:39:02 -0500
Subject: [OmniOS-discuss] How do you configure serial ports on OmniOS?
In-Reply-To: tim's message of Fri, 28 Feb 2014 16:22:36 -0800.
	<Pine.UW2.4.64.1402281559230.4406@server01.int.multitalents.net>
Message-ID: <20140301013902.1CCCD1A02C8@apps0.cs.toronto.edu>

| On Fri, 28 Feb 2014, Chris Siebenmann wrote:
| >  This question makes me feel silly but I'm lost in a confusing maze of
| > documentation for sacadm, pmadm, and so on and I can't find anything
| > with web searches. What I would like to do is configure what I believe is
| > /dev/term/c ('ttyS3' in Linux) to run a getty or the OmniOS/Illumos/etc
| > equivalent at 115200 baud.
| 
| Seems like there should be a command to change the baud rate.
| I've always just edited the _pmtab file.
| 
| disable the port (assuming pmadm -l shows it enabled)
| # pmadm -d -p zsmon -s ttyc

 I should clarify something here: the port isn't listed in pmadm -l at
all (although it exists in /dev/term). I assume that this means that
I need to create it with some arguments; the exact arguments needed to
set things up right (or right enough that I can edit files from there)
are one of the things that I'm lost about.

(The examples I've found of using pmadm to configure things seem to
leave a lot out of magic, and they're often old enough that I'm not
sure if things have changed since then.)

	- cks

From tim at multitalents.net  Sat Mar  1 07:21:24 2014
From: tim at multitalents.net (Tim Rice)
Date: Fri, 28 Feb 2014 23:21:24 -0800 (PST)
Subject: [OmniOS-discuss] How do you configure serial ports on OmniOS?
In-Reply-To: <20140301013902.1CCCD1A02C8@apps0.cs.toronto.edu>
References: <20140301013902.1CCCD1A02C8@apps0.cs.toronto.edu>
Message-ID: <Pine.UW2.4.64.1402282306030.11511@server01.int.multitalents.net>

On Fri, 28 Feb 2014, Chris Siebenmann wrote:

> | On Fri, 28 Feb 2014, Chris Siebenmann wrote:
> | >  This question makes me feel silly but I'm lost in a confusing maze of
> | > documentation for sacadm, pmadm, and so on and I can't find anything
> | > with web searches. What I would like to do is configure what I believe is
> | > /dev/term/c ('ttyS3' in Linux) to run a getty or the OmniOS/Illumos/etc
> | > equivalent at 115200 baud.
> | 
> | Seems like there should be a command to change the baud rate.
> | I've always just edited the _pmtab file.
> | 
> | disable the port (assuming pmadm -l shows it enabled)
> | # pmadm -d -p zsmon -s ttyc
> 
>  I should clarify something here: the port isn't listed in pmadm -l at
> all (although it exists in /dev/term). I assume that this means that
> I need to create it with some arguments; the exact arguments needed to
> set things up right (or right enough that I can edit files from there)
> are one of the things that I'm lost about.

Again, I'd just edit /etc/saf/zsmon/_pmtab and do a copy and 
paste of the ttyb line and make the necessary changes. Probably
want to start with ux (disabled) in the second field.
Then restart the port monitor and then enable the port.


> (The examples I've found of using pmadm to configure things seem to
> leave a lot out of magic, and they're often old enough that I'm not
> sure if things have changed since then.)


-- 
Tim Rice				Multitalents
tim at multitalents.net


From mir at miras.org  Sat Mar  1 13:46:22 2014
From: mir at miras.org (Michael Rasmussen)
Date: Sat, 1 Mar 2014 14:46:22 +0100
Subject: [OmniOS-discuss] ZFS trim support
Message-ID: <20140301144622.12a79ac6@sleipner.datanom.net>

Hi all,

Anybody knows the current status for trim support in Illumos?

It seems two solutions (FreeBSD and tracking the metaslab allocator) are
suggested but no final date:
http://open-zfs.org/wiki/Features#TRIM_Support

In lack of trim how do you then handle SSD's?

I have just ordered some Corsair for use as log and cache so this will
soon be a headache of mine as well;-)

-- 
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael <at> rasmussen <dot> cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir <at> datanom <dot> net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir <at> miras <dot> org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
/usr/games/fortune -es says:
HOORAY, Ronald!!  Now YOU can marry LINDA RONSTADT too!!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
URL: <https://omniosce.org/ml-archive/attachments/20140301/e9031476/attachment.bin>

From esproul at omniti.com  Sat Mar  1 15:32:25 2014
From: esproul at omniti.com (Eric Sproul)
Date: Sat, 1 Mar 2014 10:32:25 -0500
Subject: [OmniOS-discuss] ZFS trim support
In-Reply-To: <20140301144622.12a79ac6@sleipner.datanom.net>
References: <20140301144622.12a79ac6@sleipner.datanom.net>
Message-ID: <CA+QY2RTbb1i-2fB9RcA_H3EN486Gu_qH8On42=v26Vnkd_5pkA@mail.gmail.com>

On Sat, Mar 1, 2014 at 8:46 AM, Michael Rasmussen <mir at miras.org> wrote:
> In lack of trim how do you then handle SSD's?

I just treat them as regular drives.  I've got some 600G Intel 320s
that have been in service as primary pool devices for about 2.5 years,
having over 20TB written and still the media wear indicator has barely
budged.  Here's a graph of one of them:

https://share.circonus.com/shared/graphs/ae29a8e6-167c-47e4-aea6-d96e51f8e35f/gPUfX5

Gather some data on your daily write totals and do the math, then look
at the endurance spec for your drives and you'll have an idea of what
kind of life to expect from them.

Eric

From lists at mcintyreweb.com  Sat Mar  1 19:46:02 2014
From: lists at mcintyreweb.com (Hugh McIntyre)
Date: Sat, 01 Mar 2014 11:46:02 -0800
Subject: [OmniOS-discuss] illumos power management...again...
In-Reply-To: <OF2E1EA8BF.107A851D-ONC1257C8D.002607E8-C1257C8D.0027B5B6@inse.com>
References: <OF2E1EA8BF.107A851D-ONC1257C8D.002607E8-C1257C8D.0027B5B6@inse.com>
Message-ID: <531238FA.5040409@mcintyreweb.com>


Others have answered the disk power saving part of the question.

But regardless of the power management issues, if your backup disks are 
in the same server as the server disks themselves then your backups are 
not very disaster proof.  What if the home server is stolen or goes 
wrong in ways that destroy the backup as well as live disks?

Maybe this setup is OK for you, since a backup server in the same house 
also carries risks.  But worth considering if this is really for 
valuable backups.

Hugh.


On 2/27/14 11:13 PM, Johan Kragsterman wrote:
> Hi!
>
> I remember a discussion on this theme last year. I've been reading up on that, but that didn't answer my questions.
>
> I'm in the process of building a new main home server, and I would of coarse like it to be energy efficient. I don't use much space normally, so my daily working environment doesn't need much space, which means I can use all SSD's for that.
>
> Though I would like a backup/nfs environment, with more space on spinning disks, and I got two different scenarious to choose from here:
>
> One is building a separate machine for backup/nfs, and only start it when it needs to be started( with wake on LAN).
>
> The other is to have the spinning disks in my main home server, and use illumos power management to take care of powering down the disks when they are not in use.
> This would of coarse be the easiest way, if the power management system is efficient enough.
>
> I don't know if the system can/do power down the disks if the nfs server is active and the shares are mounted? (I don't have any problems with latencies/delays here, since it isn't in regular use)
>
> If so, good!
> If not, I could umount the shares and turn off the nfs server, and export the pool, if that would help spinning down disks...
>
> Someone got any insight and/or suggestions here...?
>
>
>
> Best regards from/Med v?nliga h?lsningar fr?n
>
> Johan Kragsterman
>
> Capvert
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>


From johan.kragsterman at capvert.se  Sun Mar  2 10:25:10 2014
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Sun, 2 Mar 2014 11:25:10 +0100
Subject: [OmniOS-discuss] illumos power management...again...
In-Reply-To: <531238FA.5040409@mcintyreweb.com>
References: <531238FA.5040409@mcintyreweb.com>,
	<OF2E1EA8BF.107A851D-ONC1257C8D.002607E8-C1257C8D.0027B5B6@inse.com>
Message-ID: <OFD04E0D89.B40634FD-ONC1257C8F.00377458-C1257C8F.00393C91@inse.com>

Hi!

Thanks, Hugh, but that is not one of my concerns. I am not afraid of that the system in any way will mess up my backup/nfs pool, and I am not afraid of thieves currently.

The VERY, VERY important backups I might also replicate to other places, but they are really very small, so no problems to fit them into whatever...

No, I'm more interested in power savings...probably going to measure the electricity difference between when nfs server is running and not running, as well as imported or exported pool.

Best regards from/Med v?nliga h?lsningar fr?n

Johan Kragsterman

Capvert


-----"OmniOS-discuss" <omnios-discuss-bounces at lists.omniti.com> skrev: -----
Till: omnios-discuss at lists.omniti.com
Fr?n: Hugh McIntyre 
S?nt av: "OmniOS-discuss" 
Datum: 2014-03-01 20:47
?rende: Re: [OmniOS-discuss] illumos power management...again...

Others have answered the disk power saving part of the question.

But regardless of the power management issues, if your backup disks are 
in the same server as the server disks themselves then your backups are 
not very disaster proof. ?What if the home server is stolen or goes 
wrong in ways that destroy the backup as well as live disks?

Maybe this setup is OK for you, since a backup server in the same house 
also carries risks. ?But worth considering if this is really for 
valuable backups.

Hugh.


On 2/27/14 11:13 PM, Johan Kragsterman wrote:
> Hi!
>
> I remember a discussion on this theme last year. I've been reading up on that, but that didn't answer my questions.
>
> I'm in the process of building a new main home server, and I would of coarse like it to be energy efficient. I don't use much space normally, so my daily working environment doesn't need much space, which means I can use all SSD's for that.
>
> Though I would like a backup/nfs environment, with more space on spinning disks, and I got two different scenarious to choose from here:
>
> One is building a separate machine for backup/nfs, and only start it when it needs to be started( with wake on LAN).
>
> The other is to have the spinning disks in my main home server, and use illumos power management to take care of powering down the disks when they are not in use.
> This would of coarse be the easiest way, if the power management system is efficient enough.
>
> I don't know if the system can/do power down the disks if the nfs server is active and the shares are mounted? (I don't have any problems with latencies/delays here, since it isn't in regular use)
>
> If so, good!
> If not, I could umount the shares and turn off the nfs server, and export the pool, if that would help spinning down disks...
>
> Someone got any insight and/or suggestions here...?
>
>
>
> Best regards from/Med v?nliga h?lsningar fr?n
>
> Johan Kragsterman
>
> Capvert
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>

_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss at lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


From jimklimov at cos.ru  Sun Mar  2 11:40:10 2014
From: jimklimov at cos.ru (Jim Klimov)
Date: Sun, 02 Mar 2014 12:40:10 +0100
Subject: [OmniOS-discuss] ZFS trim support
In-Reply-To: <20140301144622.12a79ac6@sleipner.datanom.net>
References: <20140301144622.12a79ac6@sleipner.datanom.net>
Message-ID: <5313189A.5000908@cos.ru>

On 2014-03-01 14:46, Michael Rasmussen wrote:
> Hi all,
>
> Anybody knows the current status for trim support in Illumos?
>
> It seems two solutions (FreeBSD and tracking the metaslab allocator) are
> suggested but no final date:
> http://open-zfs.org/wiki/Features#TRIM_Support
>
> In lack of trim how do you then handle SSD's?
>
> I have just ordered some Corsair for use as log and cache so this will
> soon be a headache of mine as well;-)

I believe one common approach is under-allocating the drives.
For example, I did this on my rig, with little if any "scientific"
approach like testing the results, other that seeing how much it
would take to wear down my drives... which is kinda irreversible :)
As a rule of thumb I took that the same hardware device is branded
and sold as two models (only one available in our shops) - higher
volume (120Gb in lowest size) or higher speed/endurance (100Gb -
and MUCH higher specs in endurance). So I took the 120Gb one and
partitioned using 100Gb and retaining 20Gb free.

According to the datasheet for this model, the difference in
endurance is 10-fold (total drive rewrites) for the small model
and 5-fold for larger ones, for the cost of a 20% difference
in size... Peak 4K random writes are also about 3x faster on
smaller brothers. I think it is worth not-using that little extra:

http://www.seagate.com/files/www-content/product-content/pulsar-fam/pulsar/enterprise-sata-ssd/en-us/docs/enterprise-sata-ssd-ds1775-1-1301us.pdf

The idea with under-allocation, the way I get it, is that those
20Gb are "required" to contain zeroes. So when the other 100Gb
have used and freed some flash blocks, there is pressure on the
device to free some of them up so as to accommodate the 20Gb of
zeroes before it completely runs out of all 120Gb worth of
user-addressable blocks. In effect this causes early trimming
(if this works at all, which I am unsure how can be measured)
at the devices discretion (it chooses when it can do the trick
so as to stay within the bounds of "guaranteed" storage space).

Unfortunately, I have no idea whether partitioning the 120Gb
drive to use only 100Gb magically turns it into the 100Gb model
for the endurance or performance (i.e. there might well be some
differences in firmware as well, such as other allocation
goals and optimizations, etc.)

HTH,
//Jim Klimov


From richard.elling at richardelling.com  Sun Mar  2 23:26:55 2014
From: richard.elling at richardelling.com (Richard Elling)
Date: Sun, 2 Mar 2014 15:26:55 -0800
Subject: [OmniOS-discuss] Pliant/Sandisk SSD ZIL
In-Reply-To: <530FB59D.3000703@cos.ru>
References: <530291DA.6050009@umiacs.umd.edu>
	<C7FCD61D-B246-4D2F-9712-90DF031D3E43@RichardElling.com>
	<5302BC07.9070304@umiacs.umd.edu>
	<348BDDFA-3B48-422E-A166-8427417EF432@RichardElling.com>
	<5303A925.6060209@cos.ru>
	<6D195400-4AA6-47B4-A7D6-1CF77D92A119@RichardElling.com>
	<530F4ACB.40607@cos.ru>
	<17E70147-79EB-471C-A9F4-496F8C99BF77@richardelling.com>
	<530FB59D.3000703@cos.ru>
Message-ID: <6F378AFA-45E4-473D-9D7D-266AF754735F@richardelling.com>


On Feb 27, 2014, at 2:01 PM, Jim Klimov <jimklimov at cos.ru> wrote:

> On 2014-02-27 20:39, Richard Elling wrote:
>>> I hope, NFS cached-data syncs and locks, and ZFS write-syncs are
>>> not very related in this case (i.e. zfs sync=disabled does not
>>> influence co-ordination of NFS data between hosts), right?
>> 
>> Right. The file system is consistent. The NFS sync is for the case when
>> the server reboots. As long as your server isn't rebooting, everything
>> should be consistent (assuming the clients are configured appropriately)
> 
> And "appropriately" is just how much different from "default"?

It depends on the client. Some people prefer performance over correctness
and default to caching strategies on the client that can make multiclient
sync more difficult.

> I.e. if I set sharenfs on the server dataset, and walk in with
> autofs + nfs/client from the build host, with no special configs
> other than those in solaris 10 or OpenIndiana, is that appropriate?
> Or should some specific stuff be configured on clients and servers?

For Solaris derivatives, check the attribute cache settings, noac, and nocto
in mount_nfs(1m) to see if the defaults suit your needs.
 -- richard


From cks at cs.toronto.edu  Tue Mar  4 23:03:13 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Tue, 04 Mar 2014 18:03:13 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
Message-ID: <20140304230313.AE1AF1A02E9@apps0.cs.toronto.edu>

 I will ask my question to start with and then explain the background.
As far as I can tell from running truss on the 'zfs mount -a' in
/lib/svc/method/fs-local, this *does not* mount filesystems from pools
other than rpool. However the mounts are absent immediately before it
runs and present immediately afterwards. So: does anyone understand
how this works? I assume 'zfs mount -a' is doing some ZFS action that
activates non-rpool pools and causes them to magically mount their
filesystems?

 Thanks in advance if anyone knows this.

Background:
 I am having an extremely weird heisenbug problem where on boot[*] our
test OmniOS machine fails out at the ZFS mount stage with errors about:

	Reading ZFS config: done.
	Mounting ZFS filesystems: cannot mount 'fs3-test-01': mountmount or data is busy
	cannot mount '/fs3-test-02': directory is not empty
	cannot mount 'fs3-test-02/h/999': mountpoint or dataset is busy
	(20/20)
	svc:/system/filesystem/local:default: WARNING: /usr/sbin/zfs mount -a foiled: exit status 1
[failures go on]

The direct problem here is that as far as I can tell this is incorrect.
If I log in to the console after this failure, the pools and their
filesystems are present. If I hack up /lib/svc/method/fs-local to add
debugging stuff, all of the directories involved are empty (and unmounted)
before 'zfs mount -a' runs and magically present afterwards, even as 'zfs
mount -a' complains and errors out. That was when I started truss'ing
the 'zfs mount -a' itself and discovered that it normally doesn't mount
non-rpool filesystems. In fact, based on a truss trace I have during an
incident it appears that the problem happens exactly when 'zfs mount -a'
thinks that it *does* need to mount such a filesystem but finds that
the target directory already has things in it because the filesystem is
actually mounted already.

 Running truss on the 'zfs mount -a' seems to make this happen much less
frequently, especially a relatively verbose truss that is tracing calls
in libzfs as well as system calls. This makes me wonder if there is some
sort of a race involved.

	- cks
[*: the other problem is that the test OmniOS machine has stopped actually
    rebooting when I run 'reboot'; it hangs during shutdown and must be
    power cycled (and I have the magic fastboot settings turned off).
    Neither this nor the mount problem used to happen; both appeared this
    morning. No packages have been updated.
]

From mark at omniti.com  Tue Mar  4 23:29:42 2014
From: mark at omniti.com (Mark Harrison)
Date: Tue, 4 Mar 2014 18:29:42 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: <20140304230313.AE1AF1A02E9@apps0.cs.toronto.edu>
References: <20140304230313.AE1AF1A02E9@apps0.cs.toronto.edu>
Message-ID: <CABC3qWc6C1fp5jFVTYqu55gZNh4xEpJXrh6HEbKQYxctNSgtVA@mail.gmail.com>

You mention 'directories' being empty. Does /fs3-test-02 contain empty
directories before being mounted? If so, this will be why zfs thinks
it's isn't empty and then fail to mount it. However, the child
filesystems might still mount because their directories are empty,
giving the appearance of everything being mounted OK. I'm not sure why
you're not seeing truss show zfs trying to mount non-rpool
filesystems, but it should be doing so. My wild guess right now is
that it is due to zfs checking to see if the directory is empty first,
and only showing up that it's doing something in truss if the dir
isnt' empty.

We've had this happen before when someone runs mv on a directory that
is actually the root of a filesystem. When zfs remounts it on reboot,
it gets remounted at the old location, which may or may not have other
data in it at this point (this comes up a lot when doing something
like mv foo foo.old; mkdir foo; do_stuff_with foo). I've not tracked
down the exact pathology of this when it happens, but our solution
then has basically to be to unmount all affected filesystems, then run
rmdir on all the blank directories, move any non-blank directories
aside (keep them in case they have data that needs to be kept), then
run zfs mount -a to let it clean things up.


On Tue, Mar 4, 2014 at 6:03 PM, Chris Siebenmann <cks at cs.toronto.edu> wrote:
>  I will ask my question to start with and then explain the background.
> As far as I can tell from running truss on the 'zfs mount -a' in
> /lib/svc/method/fs-local, this *does not* mount filesystems from pools
> other than rpool. However the mounts are absent immediately before it
> runs and present immediately afterwards. So: does anyone understand
> how this works? I assume 'zfs mount -a' is doing some ZFS action that
> activates non-rpool pools and causes them to magically mount their
> filesystems?
>
>  Thanks in advance if anyone knows this.
>
> Background:
>  I am having an extremely weird heisenbug problem where on boot[*] our
> test OmniOS machine fails out at the ZFS mount stage with errors about:
>
>         Reading ZFS config: done.
>         Mounting ZFS filesystems: cannot mount 'fs3-test-01': mountmount or data is busy
>         cannot mount '/fs3-test-02': directory is not empty
>         cannot mount 'fs3-test-02/h/999': mountpoint or dataset is busy
>         (20/20)
>         svc:/system/filesystem/local:default: WARNING: /usr/sbin/zfs mount -a foiled: exit status 1
> [failures go on]
>
> The direct problem here is that as far as I can tell this is incorrect.
> If I log in to the console after this failure, the pools and their
> filesystems are present. If I hack up /lib/svc/method/fs-local to add
> debugging stuff, all of the directories involved are empty (and unmounted)
> before 'zfs mount -a' runs and magically present afterwards, even as 'zfs
> mount -a' complains and errors out. That was when I started truss'ing
> the 'zfs mount -a' itself and discovered that it normally doesn't mount
> non-rpool filesystems. In fact, based on a truss trace I have during an
> incident it appears that the problem happens exactly when 'zfs mount -a'
> thinks that it *does* need to mount such a filesystem but finds that
> the target directory already has things in it because the filesystem is
> actually mounted already.
>
>  Running truss on the 'zfs mount -a' seems to make this happen much less
> frequently, especially a relatively verbose truss that is tracing calls
> in libzfs as well as system calls. This makes me wonder if there is some
> sort of a race involved.
>
>         - cks
> [*: the other problem is that the test OmniOS machine has stopped actually
>     rebooting when I run 'reboot'; it hangs during shutdown and must be
>     power cycled (and I have the magic fastboot settings turned off).
>     Neither this nor the mount problem used to happen; both appeared this
>     morning. No packages have been updated.
> ]
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss


-- 
Mark Harrison
Lead Site Reliability Engineer
OmniTI

From mir at miras.org  Tue Mar  4 23:35:28 2014
From: mir at miras.org (Michael Rasmussen)
Date: Wed, 5 Mar 2014 00:35:28 +0100
Subject: [OmniOS-discuss] ZFS trim support
In-Reply-To: <20140301144622.12a79ac6@sleipner.datanom.net>
References: <20140301144622.12a79ac6@sleipner.datanom.net>
Message-ID: <20140305003528.1514e9be@sleipner.datanom.net>

On Sat, 1 Mar 2014 14:46:22 +0100
Michael Rasmussen <mir at miras.org> wrote:

Thanks for your answers. I have followed the advice of not partition
more than 80%.

For the part whether sequential writes have impact or not. From the
disk's point of view a write is a write and therefore any write will
impact the need for trim in one way or another.

-- 
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael <at> rasmussen <dot> cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir <at> datanom <dot> net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir <at> miras <dot> org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
/usr/games/fortune -es says:
Debian Hint #21: If your Debian box is behind a slow network connection,
but you have access to a fast one as well, check out the apt-zip
package.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
URL: <https://omniosce.org/ml-archive/attachments/20140305/beb2e346/attachment.bin>

From cks at cs.toronto.edu  Tue Mar  4 23:42:40 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Tue, 04 Mar 2014 18:42:40 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: mark's message of Tue, 04 Mar 2014 18:29:42 -0500.
	<CABC3qWc6C1fp5jFVTYqu55gZNh4xEpJXrh6HEbKQYxctNSgtVA@mail.gmail.com>
Message-ID: <20140304234240.766661A02E9@apps0.cs.toronto.edu>

| You mention 'directories' being empty. Does /fs3-test-02 contain empty
| directories before being mounted?

 It doesn't. All of /fs3-test-01, /fs3-test-02, /h/281, and /h/999
are empty before 'zfs mount -a' runs (I've verified this with ls's
immediately before the 'zfs mount -a' in /lib/svc/method/fs-local).

| I'm not sure why you're not seeing truss show zfs trying to mount
| non-rpool filesystems, but it should be doing so.

 My truss traces on successful boot are quite definitive about this.
It clearly looks to see if a lot of fs's are mounted and finds that
they are. I've put one captured trace up here, if people are
interested:

	http://www.cs.toronto.edu/~cks/t/fs-local-truss-good-boot.txt

Notice that calls to libzfs:zfs_is_mounted() return either 0 or 1.
Calls that return 0 are followed by a call to libzfs:zfs_mount() (and an
actual mount operation); calls that return 1 aren't. Clearly 'zfs mount
-a' is checking a bunch more filesystems than it actually is mounting.

(I don't know if there's a way to make truss dump the first argument
to libzfs:zfs_is_mounted() as a string so that one can see what mount
points are being checked.)

 A truss from a bad boot is
	http://www.cs.toronto.edu/~cks/t/fs-local-truss-bad-boot.txt

This doesn't have the libzfs trace information, just the syscalls, but
you can see a similar sequence of syscall level operations right up
to the point where it does getdents64() on /h/281 and finds it *not*
empty (a 232-byte return value instead of a 48-byte one).  Based on
the information from the good trace, this is a safety check inside
libzfs:zfs_mount().

	- cks

From jimklimov at cos.ru  Wed Mar  5 08:56:50 2014
From: jimklimov at cos.ru (Jim Klimov)
Date: Wed, 05 Mar 2014 09:56:50 +0100
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: <CABC3qWc6C1fp5jFVTYqu55gZNh4xEpJXrh6HEbKQYxctNSgtVA@mail.gmail.com>
References: <20140304230313.AE1AF1A02E9@apps0.cs.toronto.edu>
	<CABC3qWc6C1fp5jFVTYqu55gZNh4xEpJXrh6HEbKQYxctNSgtVA@mail.gmail.com>
Message-ID: <5316E6D2.7020404@cos.ru>

On 2014-03-05 00:29, Mark Harrison wrote:
> You mention 'directories' being empty. Does /fs3-test-02 contain empty
> directories before being mounted? If so, this will be why zfs thinks
> it's isn't empty and then fail to mount it. However, the child
> filesystems might still mount because their directories are empty,
> giving the appearance of everything being mounted OK.

Just in case, such cases my be verified with df which returns the
actual mounted filesystem which provides the tested directory or file:

# df -k /lib/libzfs.so /lib/libc.so /var/log/syslog
Filesystem            kbytes    used   avail capacity  Mounted on
rpool/ROOT/sol10u10  30707712 1826637 7105279    21%    /
rpool/ROOT/sol10u10/usr
                      30707712  508738 7105279     7%    /usr
rpool/SHARED/var/log 4194304    1491 3638955     1%    /var/log


This way you can test for example if a directory is "standalone"
or an actively used mountpoint of a ZFS POSIX dataset.

I think a "zpool list" can help in your debugging to see if the
pools in question are in fact imported before "zfs mount -a",
or if some unexpected magic happens and the "zfs" command does
indeed trigger the imports.

On 2014-03-05 00:03, Chris Siebenmann wrote:
 > As far as I can tell from running truss on the 'zfs mount -a' in
 > /lib/svc/method/fs-local, this *does not* mount filesystems from pools
 > other than rpool. However the mounts are absent immediately before it
 > runs and present immediately afterwards. So: does anyone understand
 > how this works? I assume 'zfs mount -a' is doing some ZFS action that
 > activates non-rpool pools and causes them to magically mount their
 > filesystems?

Regarding the "zfs mount -a" - I am not sure why it errors out
in your case, I can only think of some extended attributes being
in use, or overlay-mounts, or stuff like that - though such things
are likely to come up in "strange" runtime cases to mostly block
un-mounts, not in orderly startup scenarios...

Namely, one thing that may be a problem is if a directory in
question is a current-working-dir for some process, or if a file
has been created, used, deleted (while it remains open by some
process) which is quite possible for the likes of /var/tmp paths.
But even so, it is likely to block unmounts but not over-mounts
as long as the directory is (seems) empty.

Also, as at least a workaround, you can switch the mountpoint
to "legacy" and refer the dataset from /etc/vfstab including
the "-O" option for overlay-mount. Unfortunately there is no
equivalent dataset attribute at the moment, so it is not a very
convenient solution for possible trees of datasets - but may
be quite acceptable for leaf datasets where you don't need to
automate any sub-mounts.
Vote for https://www.illumos.org/issues/997 ;)

And finally, I also don't know where the pools get imported,
but "zfs mount -a" *should* only mount datasets with canmount=on
and zoned=off (if in global zone) and a valid mountpoint path,
picked from any pools imported at the moment. The mounts from
different pools may be done in parallel, so if you need some
specific order of mounts (i.e. rpool/export/home and then
datapool/export/home/user... okay, there is in fact no problem
with these - but just to give *some* viable example) you may
have to specify stuff in /etc/vfstab.

I can guess (but would need to grok the code) that something
like "zpool import -N -a" is done in some part of the root
environment preparation to prepare all pools referenced in
/etc/zfs/zpool.cache, perhaps some time after the rpool is
imported and the chosen root dataset is mounted explicitly
to anchor the running kernel.

As another workaround, you can export the pool which contains
your "problematic" datasets so it is un-cached from zpool.cache
and is not automatically imported nor mounted during the system
bootup - so that the system becomes able to boot successfully
to the point of being accessible over ssh for example. Then you
import and mount that other pool as an SMF service, upon which
your other services can depend to proceed, see here for ideas
and code snippets:

http://wiki.openindiana.org/oi/Advanced+-+ZFS+Pools+as+SMF+services+and+iSCSI+loopback+mounts

HTH,
//Jim Klimov


From cks at cs.toronto.edu  Wed Mar  5 15:50:51 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Wed, 05 Mar 2014 10:50:51 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: Your message of Wed, 05 Mar 2014 09:56:50 +0100.
	<5316E6D2.7020404@cos.ru>
Message-ID: <20140305155052.05A401A0826@apps0.cs.toronto.edu>

| I think a "zpool list" can help in your debugging to see if the pools
| in question are in fact imported before "zfs mount -a", or if some
| unexpected magic happens and the "zfs" command does indeed trigger the
| imports.

 Sorry for not mentioning this before: a 'zpool list' before the 'zfs
mount -a' lists the pools as visible, but both df and 'mount -v' do not
report any filesystems from the two additional pools (the ones that get
mount failures and so on).

| The mounts from different pools may be done in parallel, so if you
| need some specific order of mounts (i.e. rpool/export/home and then
| datapool/export/home/user... okay, there is in fact no problem with
| these - but just to give *some* viable example) you may have to
| specify stuff in /etc/vfstab.

 As far as I can tell from the Illumos code, this is not the case.
The code certainly seems to be single-threaded and it sorts the mount
list into order in a way that should put prerequisite mounts first
(eg you mount /a and then /a/b).

(This potential issue also doesn't apply to my case because all four
of the mounts from these pools are in the root filesystem, not in any
sub-filesystem.)

| I can guess (but would need to grok the code) that something
| like "zpool import -N -a" is done in some part of the root
| environment preparation to prepare all pools referenced in
| /etc/zfs/zpool.cache, perhaps some time after the rpool is
| imported and the chosen root dataset is mounted explicitly
| to anchor the running kernel.

 The last time I spelunked the OpenSolaris code some years ago, the
kernel read zpool.cache very early on but only sort of half-activated
pools then (eg it didn't check to see if all vdevs were present). Pools
were brought to full activation essentially as a side effect of doing
other operations with/to them.

 I don't know if this is still the state of affairs in Illumos/OmniOS
today and how such half-activated pools show up during early boot (eg
if they appear in 'zpool list', or even if simply running 'zpool list'
is enough to bring them to fully active status).

	- cks

From dswartz at druber.com  Wed Mar  5 16:29:56 2014
From: dswartz at druber.com (Dan Swartzendruber)
Date: Wed, 5 Mar 2014 11:29:56 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: <20140305155052.05A401A0826@apps0.cs.toronto.edu>
References: <20140305155052.05A401A0826@apps0.cs.toronto.edu>
Message-ID: <c3bead76526d1035e57ecfbc13a0bf24.squirrel@webmail.druber.com>


This is all very strange.  I saw stuff like this all the time when I was
using ZFS on Linux, due to timing where an HBA would not present devices
quickly enough, resulting in missing pools, missing/unmounted datasets,
etc, which would all get 'fixed' if you manually re-did them, but I've
never seen it in omniOS.


From bdha at mirrorshades.net  Wed Mar  5 17:17:27 2014
From: bdha at mirrorshades.net (Bryan Horstmann-Allen)
Date: Wed, 5 Mar 2014 12:17:27 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: <c3bead76526d1035e57ecfbc13a0bf24.squirrel@webmail.druber.com>
References: <20140305155052.05A401A0826@apps0.cs.toronto.edu>
	<c3bead76526d1035e57ecfbc13a0bf24.squirrel@webmail.druber.com>
Message-ID: <0635F3D6-C531-480C-8A79-83BE84D5DD79@mirrorshades.net>

I've seen that bug on SmartOS. Fixed in the last month or two. 
-- 
bdha

> On Mar 5, 2014, at 11:29, "Dan Swartzendruber" <dswartz at druber.com> wrote:
> 
> 
> This is all very strange.  I saw stuff like this all the time when I was
> using ZFS on Linux, due to timing where an HBA would not present devices
> quickly enough, resulting in missing pools, missing/unmounted datasets,
> etc, which would all get 'fixed' if you manually re-did them, but I've
> never seen it in omniOS.
> 
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

From dswartz at druber.com  Wed Mar  5 17:29:57 2014
From: dswartz at druber.com (Dan Swartzendruber)
Date: Wed, 5 Mar 2014 12:29:57 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: <0635F3D6-C531-480C-8A79-83BE84D5DD79@mirrorshades.net>
References: <20140305155052.05A401A0826@apps0.cs.toronto.edu>
	<c3bead76526d1035e57ecfbc13a0bf24.squirrel@webmail.druber.com>
	<0635F3D6-C531-480C-8A79-83BE84D5DD79@mirrorshades.net>
Message-ID: <f21efd818d815799f06515c3cf0d3743.squirrel@webmail.druber.com>

> I've seen that bug on SmartOS. Fixed in the last month or two.

Any explanation as to what was happening?


From bdha at mirrorshades.net  Wed Mar  5 17:46:43 2014
From: bdha at mirrorshades.net (Bryan Horstmann-Allen)
Date: Wed, 5 Mar 2014 12:46:43 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: <f21efd818d815799f06515c3cf0d3743.squirrel@webmail.druber.com>
References: <20140305155052.05A401A0826@apps0.cs.toronto.edu>
	<c3bead76526d1035e57ecfbc13a0bf24.squirrel@webmail.druber.com>
	<0635F3D6-C531-480C-8A79-83BE84D5DD79@mirrorshades.net>
	<f21efd818d815799f06515c3cf0d3743.squirrel@webmail.druber.com>
Message-ID: <20140305174643.GA16938@lab.pobox.com>

+------------------------------------------------------------------------------
| On 2014-03-05 12:29:57, Dan Swartzendruber wrote:
| 
| Any explanation as to what was happening?

This is the bug I was hitting: http://smartos.org/bugview/OS-2616

Devices wouldn't be available at boot, but would once the system was up.
-- 
bdha
cyberpunk is dead. long live cyberpunk.

From dswartz at druber.com  Wed Mar  5 17:53:02 2014
From: dswartz at druber.com (Dan Swartzendruber)
Date: Wed, 5 Mar 2014 12:53:02 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: <20140305174643.GA16938@lab.pobox.com>
References: <20140305155052.05A401A0826@apps0.cs.toronto.edu>
	<c3bead76526d1035e57ecfbc13a0bf24.squirrel@webmail.druber.com>
	<0635F3D6-C531-480C-8A79-83BE84D5DD79@mirrorshades.net>
	<f21efd818d815799f06515c3cf0d3743.squirrel@webmail.druber.com>
	<20140305174643.GA16938@lab.pobox.com>
Message-ID: <933d63e9b39154da4445a368a76d2279.squirrel@webmail.druber.com>

> +------------------------------------------------------------------------------
> | On 2014-03-05 12:29:57, Dan Swartzendruber wrote:
> |
> | Any explanation as to what was happening?
>
> This is the bug I was hitting: http://smartos.org/bugview/OS-2616
>
> Devices wouldn't be available at boot, but would once the system was up.

Interesting.  Thanks for posting this!


From danmcd at omniti.com  Wed Mar  5 17:53:32 2014
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 5 Mar 2014 12:53:32 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: <20140305174643.GA16938@lab.pobox.com>
References: <20140305155052.05A401A0826@apps0.cs.toronto.edu>
	<c3bead76526d1035e57ecfbc13a0bf24.squirrel@webmail.druber.com>
	<0635F3D6-C531-480C-8A79-83BE84D5DD79@mirrorshades.net>
	<f21efd818d815799f06515c3cf0d3743.squirrel@webmail.druber.com>
	<20140305174643.GA16938@lab.pobox.com>
Message-ID: <4D262C1E-E99B-4251-8DE0-C7FD955AC932@omniti.com>


On Mar 5, 2014, at 12:46 PM, Bryan Horstmann-Allen <bdha at mirrorshades.net> wrote:

> +------------------------------------------------------------------------------
> | On 2014-03-05 12:29:57, Dan Swartzendruber wrote:
> | 
> | Any explanation as to what was happening?
> 
> This is the bug I was hitting: http://smartos.org/bugview/OS-2616
> 
> Devices wouldn't be available at boot, but would once the system was up.

I believe that bugfix is in illumos-gate now as:

	https://www.illumos.org/issues/4500

which was fixed by this changeset:

	https://github.com/illumos/illumos-gate/commit/da5ab83fc888325fc812733d8a54bc5eab65c65c

and it *should* be in bloody now:

	https://github.com/omniti-labs/illumos-omnios/commit/da5ab83fc888325fc812733d8a54bc5eab65c65c

Dan


From ikaufman at eng.ucsd.edu  Wed Mar  5 18:03:12 2014
From: ikaufman at eng.ucsd.edu (Ian Kaufman)
Date: Wed, 5 Mar 2014 10:03:12 -0800
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: <20140304234240.766661A02E9@apps0.cs.toronto.edu>
References: <CABC3qWc6C1fp5jFVTYqu55gZNh4xEpJXrh6HEbKQYxctNSgtVA@mail.gmail.com>
	<20140304234240.766661A02E9@apps0.cs.toronto.edu>
Message-ID: <CAPJtH1idgHt5HfMspQJSYYYANEGiL4b+jYebr5TjzXuh32gRXw@mail.gmail.com>

>  It doesn't. All of /fs3-test-01, /fs3-test-02, /h/281, and /h/999
> are empty before 'zfs mount -a' runs (I've verified this with ls's
> immediately before the 'zfs mount -a' in /lib/svc/method/fs-local).
>

As a test, try renaming those "empty" directories and then reboot. We
saw this issue with Solaris 10, where on reboot, the filesystems did
not unmount cleanly, and failed to mount at boot.

Ian

-- 
Ian Kaufman
Research Systems Administrator
UC San Diego, Jacobs School of Engineering ikaufman AT ucsd DOT edu

From cks at cs.toronto.edu  Wed Mar  5 21:23:56 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Wed, 05 Mar 2014 16:23:56 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: danmcd's message of Wed, 05 Mar 2014 12:53:32 -0500.
	<4D262C1E-E99B-4251-8DE0-C7FD955AC932@omniti.com>
Message-ID: <20140305212356.EE1AA1A0826@apps0.cs.toronto.edu>

 With the aid of DTrace (and Illumos source) I have traced down what is
going on and where the race is. The short version is that the 'zfs mount
-a' in /lib/svc/method/fs-local is racing with syseventd's ZFS module.
I have a dtrace capture (well, several of them) that shows this clearly:

	http://www.cs.toronto.edu/~cks/t/fs-local-mounttrace.txt

(produced by http://www.cs.toronto.edu/~cks/t/mounttrace.d which I
started at the top of /lib/svc/method/fs-local.)

 Looking at various things suggests that this may be happening partly
because these additional pools are on iSCSI disks and the iSCSI disks
seem to be taking a bit of time to show up (I've never fully understood
how iSCSI disks are probed by Illumos). This may make it spiritually
related to the bug that Bryan Horstmann-Allen mentioned in that both
result in delayed device appearances.

 The following is a longer explanation of the race and assumes you
have some familiarity with Illumos ZFS kernel internals.

- pools present in /etc/zfs/zpool.cache are loaded into the kernel
  very early in boot, but they are not initialized and activated.
  This is done in spa_config_load(), calling spa_add(), which sets
  them to spa->spa_state = POOL_STATE_UNINITIALIZED.

- inactive pools are activated through spa_activate(), which is
  called (among other times) whenever you open a pool. By a chain
  of calls this happens any time you make a ZFS IOCTL that involves
  a pool name.
	zfsdev_ioctl() -> pool_status_check() -> spa_open() -> etc.

- 'zfs mount -a' of course does ZFS IOCTLs that involve pools
  because it wants to get pool configurations to find out what
  datasets it might have to mount. As such, it activate all
  additional pools present in zpool.cache when it runs (assuming
  that their vdev configuration is good, of course).

- when a pool is activated this way in our environment, some sort of
  events are delivered to syseventd. I don't know enough about syseventd
  to say exactly what sort of event it is and it may well be iSCSI disk
  'device appeared' messages. I have a very verbose syseventd debugging
  dump but I don't know enough to see anything useful in it.

- when syseventd gets these events, its ZFS module decides that it
  too should mount (aka 'activate') all datasets for the newly-active
  pools.

 At this point a multithreaded syseventd and 'zfs mount -a' are
racing to see who can mount all of the pool datasets, creating two
failure modes for 'zfs mount -a'. The first failure mode is simply
that syseventd wins the race and fully mounts a filesystem before 'zfs
mount -a' looks at it, triggering a safety check of 'directory is not
empty'. The second failure mode is that syseventd and 'zfs mount -a'
both call mount() on the same filesystem at the same time and syseventd
is the one that succeeds. In this case mount() itself will return an
error and 'zfs mount -a' will report:

	cannot mount 'fs3-test-02': mountpoint or dataset is busy

	- cks

From cks at cs.toronto.edu  Wed Mar  5 21:38:09 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Wed, 05 Mar 2014 16:38:09 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: cks's message of Wed, 05 Mar 2014 16:23:56 -0500.
	<20140305212356.EE1AA1A0826@apps0.cs.toronto.edu>
Message-ID: <20140305213809.3DC421A0826@apps0.cs.toronto.edu>

 It turns out that there is an unpleasant consequence to syseventd being
willing to mount ZFS filesystems for additional pools before the 'zfs
mount -a' has run: you can get unresolvable mount conflicts in some
situations.

 Suppose that you have /opt as a separate ZFS filesystem in your
root pool and you also have /opt/bigthing as a ZFS filesystem in
a second pool. You can set this up and everything looks right, but
if you reboot and syseventd beats 'zfs mount -a' for whatever reasons,
you get an explosion:

- we start with no additional filesystems mounted, including /opt
- syseventd grabs the second pool, starts mounting things, and
  mounts /opt/bigthing on the *bare* root filesystem, making /opt
  (if necessary) in the process.
- 'zfs mount -a' reaches /opt and attempts to mount it. However,
  because syseventd has already mounted /opt/bigthing, /opt is not
  empty. FAILURE.

As far as I can tell there is no particularly good cure for this.  To
me it really looks like syseventd should either not be started before
fs-local (although I don't know if anything breaks if its startup is
deferred) or that it should not be mounting ZFS filesystems (although I
can half-see the attraction of it doing so).

	- cks

From cks at cs.toronto.edu  Fri Mar  7 19:34:53 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Fri, 07 Mar 2014 14:34:53 -0500
Subject: [OmniOS-discuss] Reproducible r151008j kernel crash with ZFS pools
	on iSCSI
Message-ID: <20140307193454.0E2911A0488@apps0.cs.toronto.edu>

 I have a reproducible kernel crash with OmniOS r151008j. The situation:

 The basic setup is a ZFS pool on mirrored pairs of iSCSI disks. The
iSCSI disks come from two different iSCSI targets, and all
targets are multipathed over two 10G networks. The pool is set to
'failmode=continue'.  If I start a large streaming write to the pool and
then take down both iSCSI interfaces on both targets (making all disks
in the pool completely unavailable), OmniOS panics after a couple of
minutes. Fortunately this doesn't happen if only a single target becomes
inaccessible.

 I have crash dumps and can run commands against them and so on. Just
tell me what to look at/do/etc. Since this is a test environment I can
also reproduce this on demand and I'm willing test things freely.

 One panic produced the following:

Mar  7 10:20:50 sanjuan ^Mpanic[cpu3]/thread=ffffff007c4dbc40: 
  BAD TRAP: type=8 (#df Double fault) rp=ffffff114dca2f10 addr=0
  
  zpool-fs3-test-0: 
  #df Double fault
  pid=463, pc=0xfffffffff7903bb8, sp=0xffffff007c4d7000, eflags=0x10086
  cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 426f8<osxsav,vmxe,xmme,fxsr,pge,mce,pae,pse,de>
  cr2: ffffff007c4d6ff8
  cr3: bc00000
  cr8: 0
  
  	rdi: ffffff1157501a80 rsi:                5 rdx: ffffff007c4d70c0
  	rcx:                5  r8: ffffff1275342d58  r9:                1
  	rax:                3 rbx: ffffff1157501a80 rbp: ffffff007c4d7050
  	r10:                0 r11:         ffffffff r12:                5
  	r13: ffffff1142c55b48 r14: ffffff007c4d70c0 r15:                5
  	fsb:                0 gsb: ffffff1157501a80  ds:               4b
  	 es:               4b  fs:                0  gs:              1c3
  	trp:                8 err:                0 rip: fffffffff7903bb8
  	 cs:               30 rfl:            10086 rsp: ffffff007c4d7000
  	 ss:               38
  tss.tss_rsp0:	0xffffff007c4dbc40
  tss.tss_rsp1:	0x0
  tss.tss_rsp2:	0x0
  tss.tss_ist1:	0xffffff114dca3000
  tss.tss_ist2:	0x0
  tss.tss_ist3:	0x0
  tss.tss_ist4:	0x0
  tss.tss_ist5:	0x0
  tss.tss_ist6:	0x0
  tss.tss_ist7:	0x0
  
  ffffff114dca2df0 unix:real_mode_stop_cpu_stage2_end+9de3 ()
  ffffff114dca2f00 unix:trap+ca5 ()
  ffffff007c4d7050 unix:_patch_xrstorq_rbx+196 ()
  ffffff007c4d70b0 apix:apix_do_interrupt+372 ()
  ffffff007c4d70c0 unix:cmnint+ba ()
  ffffff007c4d7200 genunix:avl_remove+197 ()
  ffffff007c4d7240 zfs:vdev_queue_io_remove+54 ()
  ffffff007c4d7600 zfs:vdev_queue_io_to_issue+133 ()
  ffffff007c4d7640 zfs:vdev_queue_io_done+88 ()
  ffffff007c4d7680 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d76c0 zfs:zio_execute+88 ()
  ffffff007c4d7700 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d7740 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d7780 zfs:zio_execute+88 ()
  ffffff007c4d77c0 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d7800 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d7840 zfs:zio_execute+88 ()
  ffffff007c4d7880 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d78c0 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d7900 zfs:zio_execute+88 ()
  ffffff007c4d7940 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d7980 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d79c0 zfs:zio_execute+88 ()
  ffffff007c4d7a00 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d7a40 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d7a80 zfs:zio_execute+88 ()
  ffffff007c4d7ac0 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d7b00 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d7b40 zfs:zio_execute+88 ()
  ffffff007c4d7b80 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d7bc0 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d7c00 zfs:zio_execute+88 ()
  ffffff007c4d7c40 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d7c80 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d7cc0 zfs:zio_execute+88 ()
  ffffff007c4d7d00 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d7d40 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d7d80 zfs:zio_execute+88 ()
  ffffff007c4d7dc0 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d7e00 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d7e40 zfs:zio_execute+88 ()
  ffffff007c4d7e80 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d7ec0 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d7f00 zfs:zio_execute+88 ()
  ffffff007c4d7f40 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d7f80 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d7fc0 zfs:zio_execute+88 ()
  ffffff007c4d8000 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d8040 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d8080 zfs:zio_execute+88 ()
  ffffff007c4d80c0 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d8100 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d8140 zfs:zio_execute+88 ()
  ffffff007c4d8180 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d81c0 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d8200 zfs:zio_execute+88 ()
  ffffff007c4d8240 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d8280 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d82c0 zfs:zio_execute+88 ()
  ffffff007c4d8300 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d8340 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d8380 zfs:zio_execute+88 ()
  ffffff007c4d83c0 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d8400 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d8440 zfs:zio_execute+88 ()
  ffffff007c4d8480 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d84c0 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d8500 zfs:zio_execute+88 ()
  ffffff007c4d8540 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d8580 zfs:zio_vdev_io_done+80 ()
  ffffff007c4d85c0 zfs:zio_execute+88 ()
  ffffff007c4d8600 zfs:vdev_queue_io_done+78 ()
  ffffff007c4d8640 zfs:zio_vdev_io_done+80 ()
  Warning: stack in the dump buffer may be incomplete
  ffffff007c4d8680 zfs:zio_execute+88 ()
  Warning: stack in the dump buffer may be incomplete
  ffffff007c4d86c0 zfs:vdev_queue_io_done+78 ()
  Warning: stack in the dump buffer may be incomplete
  ffffff007c4d8700 zfs:zio_vdev_io_done+80 ()
  Warning: stack in the dump buffer may be incomplete
[... repeats a lot ...]
  ffffff007c4db930 zfs:vdev_queue_io_done+78 ()
  Warning: stack in the dump buffer may be incomplete
  ffffff007c4db970 zfs:zio_vdev_io_done+80 ()
  Warning: stack in the dump buffer may be incomplete
  ffffff007c4db9b0 zfs:zio_execute+88 ()
  Warning: stack in the dump buffer may be incomplete
  ffffff007c4db9f0 zfs:vdev_queue_io_done+78 ()
  Warning: stack in the dump buffer may be incomplete
  ffffff007c4dba30 zfs:zio_vdev_io_done+80 ()
  Warning: stack in the dump buffer may be incomplete
  ffffff007c4dba70 zfs:zio_execute+88 ()
  Warning: stack in the dump buffer may be incomplete
  ffffff007c4dbb30 genunix:taskq_thread+2d0 ()
  Warning: stack in the dump buffer may be incomplete
  ffffff007c4dbb40 unix:thread_start+8 ()
  Warning: stack in the dump buffer may be incomplete
  
  syncing file systems...
   done

A second crash has a very similar backtrace but the front is different:
  ffffff1157a61df0 unix:real_mode_stop_cpu_stage2_end+9de3 ()
  ffffff1157a61f00 unix:trap+ca5 ()
  ffffff007b7ce000 unix:_patch_xrstorq_rbx+196 ()
  ffffff007b7ce070 genunix:avl_find+72 ()
  ffffff007b7ce0b0 genunix:avl_add+27 ()
  ffffff007b7ce0f0 zfs:vdev_queue_pending_add+4b ()
  ffffff007b7ce4b0 zfs:vdev_queue_io_to_issue+153 ()
  ffffff007b7ce4f0 zfs:vdev_queue_io_done+88 ()
  ffffff007b7ce530 zfs:zio_vdev_io_done+80 ()
  ffffff007b7ce570 zfs:zio_execute+88 ()
  ffffff007b7ce5b0 zfs:vdev_queue_io_done+78 ()
  ffffff007b7ce5f0 zfs:zio_vdev_io_done+80 ()
  ffffff007b7ce630 zfs:zio_execute+88 ()
  ffffff007b7ce670 zfs:vdev_queue_io_done+78 ()
  ffffff007b7ce6b0 zfs:zio_vdev_io_done+80 ()
[... repeating pattern repeats ...]

	- cks

From zembower at criterion.com  Fri Mar  7 20:28:11 2014
From: zembower at criterion.com (Chris Zembower)
Date: Fri, 7 Mar 2014 15:28:11 -0500
Subject: [OmniOS-discuss] System hangs every few days
Message-ID: <CAKbxKA8SMiGEyC+UB4xdfKZV+KwuY1ebEJ2ysio0Z1v1wSNr9A@mail.gmail.com>

I was at about 6 months of uptime, then added some new SSD's for cache to
the motherboard SATA ports. They weren't hot-plug recognized, so I rebooted
over the weekend. Added the caches, all seemed good.

Five days later, the system was locked. No kernel panic, just a frozen
console and no network access. Not ping-able.

Looking through the logs, I saw mostly just the typical (and benign?)
netatalk messages:

------

mDNSResponder: [ID 702911 daemon.error] ERROR: getOptRdata - unknown opt 4

mDNSResponder: [ID 702911 daemon.error] Correcting TTL from 4500 to 3600
for  312 nexus

-------


Etc.

But also, something new right before the crash:


------

Mar  7 11:18:28 colossus mac: [ID 486395 kern.info] NOTICE: igb3 link down

Mar  7 11:18:28 colossus mac: [ID 486395 kern.info] NOTICE: igb4 link down

Mar  7 11:18:28 colossus mac: [ID 486395 kern.info] NOTICE: igb2 link down

Mar  7 11:18:28 colossus mac: [ID 486395 kern.info] NOTICE: igb5 link down

Mar  7 11:18:28 colossus mac: [ID 486395 kern.info] NOTICE: aggr1000 link
down

Mar  7 11:18:30 colossus mac: [ID 435574 kern.info] NOTICE: igb3 link up,
1000 Mbps, full duplex

Mar  7 11:18:30 colossus mac: [ID 435574 kern.info] NOTICE: aggr1000 link
up, 1000 Mbps, full duplex

Mar  7 11:18:30 colossus mac: [ID 435574 kern.info] NOTICE: igb2 link up,
1000 Mbps, full duplex

Mar  7 11:18:30 colossus mac: [ID 435574 kern.info] NOTICE: igb4 link up,
1000 Mbps, full duplex

Mar  7 11:18:30 colossus mac: [ID 435574 kern.info] NOTICE: igb5 link up,
1000 Mbps, full duplex

Mar  7 11:18:35 colossus mac: [ID 486395 kern.info] NOTICE: igb3 link down

Mar  7 11:18:35 colossus mac: [ID 486395 kern.info] NOTICE: igb4 link down

Mar  7 11:18:35 colossus mac: [ID 486395 kern.info] NOTICE: igb2 link down

Mar  7 11:18:36 colossus mac: [ID 486395 kern.info] NOTICE: igb5 link down

Mar  7 11:18:36 colossus mac: [ID 486395 kern.info] NOTICE: aggr1000 link
down

------


This goes on indefinitely, interfaces going down, coming up, over and over.
All of the igb interfaces listed here are part of an aggregate group
(although it's actually called aggr1, not aggr1000?). The other interfaces
(2 additional igb's and 4 ixgbe's) did not log error messages, but by this
point the server is unresponsive via ssh over the network and at the
console. Interesting however, is that established file-sharing connections
over the unaffected interfaces continue to function for quite a whole after
the lockup, all night in one case. This includes AFP, SMB, and iSCSI
(giving me enough time to shut down my virtual machines and log off some
key clients). In other words, the zpools are functional, and so are enough
services to keep that particular type of access alive. Establishing new
connections over those protocols after the incident doesn't appear to be
possible. A hard reboot is necessary to regain access to the console and
permit new connections.

My initial thought was that it could be an issue with the switch, but that
seems unlikely because I have other LACP groups that are unaffected. I'm
also thinking that it can't be a coincidence that this only started
happening right after that initial reboot?

Since that reboot, this crash has happened three times. The first, as I
noted, was five days after the reconfiguration, but now they seem to be
happening slightly more frequently, although they're always several days
apart.

I'm considering reverting to a base install and rebuilding the system
config this weekend, as it's very basic.. but still curious if anyone has
seen this type of behavior before.

Regards,

Chris
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140307/c4157b5f/attachment.html>

From cks at cs.toronto.edu  Fri Mar  7 20:49:44 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Fri, 07 Mar 2014 15:49:44 -0500
Subject: [OmniOS-discuss] Bug: OmniOS r151008j terminates iSCSI initiator
	too early in shutdown
Message-ID: <20140307204944.5BB0A1A0488@apps0.cs.toronto.edu>

 In at least OmniOS r151008j, the iSCSI initiator and thus any iSCSI
disks it has established are shut down relatively early during a shutdown
or reboot. In specific they are terminated before halt et al runs
'/sbin/bootadm -ea update_all' (in halt.c's do_archives_update()).
Under some circumstances this will cause system shutdown to hang.

 Suppose that you have ZFS pools that are hosted on iSCSI disks and
those pools are set to the default 'failmode=wait'. When the iSCSI disks
go away due to initiator shutdown, those pools enter a state where any
IO to them will stall. Unfortunately bootadm does such IO (or at least
does something that stalls in ZFS-land) and as such will itself stall,
which stalls the shutdown process.

 Presumably either bootadm should be run earlier or iSCSI initiator
shutdown should happen later or both.

	- cks

From jimklimov at cos.ru  Sat Mar  8 11:50:07 2014
From: jimklimov at cos.ru (Jim Klimov)
Date: Sat, 08 Mar 2014 12:50:07 +0100
Subject: [OmniOS-discuss] Bug: OmniOS r151008j terminates iSCSI
 initiator too early in shutdown
In-Reply-To: <20140307204944.5BB0A1A0488@apps0.cs.toronto.edu>
References: <20140307204944.5BB0A1A0488@apps0.cs.toronto.edu>
Message-ID: <531B03EF.7070501@cos.ru>

On 2014-03-07 21:49, Chris Siebenmann wrote:
>   In at least OmniOS r151008j, the iSCSI initiator and thus any iSCSI
> disks it has established are shut down relatively early during a shutdown
> or reboot. In specific they are terminated before halt et al runs
> '/sbin/bootadm -ea update_all' (in halt.c's do_archives_update()).
> Under some circumstances this will cause system shutdown to hang.
>
>   Suppose that you have ZFS pools that are hosted on iSCSI disks and
> those pools are set to the default 'failmode=wait'. When the iSCSI disks
> go away due to initiator shutdown, those pools enter a state where any
> IO to them will stall. Unfortunately bootadm does such IO (or at least
> does something that stalls in ZFS-land) and as such will itself stall,
> which stalls the shutdown process.
>
>   Presumably either bootadm should be run earlier or iSCSI initiator
> shutdown should happen later or both.

I guess you can control the order of shutdown procedures with
SMF dependencies. In particular, it might be helpful to ensure
that your system completely exports the remote-hosted pools
before disabling iSCSI (and possibly networking, etc.).

I hope that my write-up on the OI wiki would be relevant here:
http://wiki.openindiana.org/oi/Advanced+-+ZFS+Pools+as+SMF+services+and+iSCSI+loopback+mounts

Likewise, any of your services which need data from this pool
and can be wrapped into SMF (like VM's, zones, etc.) can also be
sure to stop properly before you export the pool and stop iSCSI.

http://wiki.openindiana.org/display/oi/Zones+as+SMF+services

I do mean to brush up those articles and code samples into a
more proper form (a package or something), but in the meanwhile
the articles can do with some manual work on the user's side ;)

HTH,
//Jim


From jimklimov at cos.ru  Sat Mar  8 17:11:36 2014
From: jimklimov at cos.ru (Jim Klimov)
Date: Sat, 08 Mar 2014 18:11:36 +0100
Subject: [OmniOS-discuss] Reproducible r151008j kernel crash with ZFS
	pools	on iSCSI
In-Reply-To: <20140307193454.0E2911A0488@apps0.cs.toronto.edu>
References: <20140307193454.0E2911A0488@apps0.cs.toronto.edu>
Message-ID: <531B4F48.7060005@cos.ru>

On 2014-03-07 20:34, Chris Siebenmann wrote:
>   I have a reproducible kernel crash with OmniOS r151008j. The situation:
>
>   The basic setup is a ZFS pool on mirrored pairs of iSCSI disks. The
> iSCSI disks come from two different iSCSI targets, and all
> targets are multipathed over two 10G networks. The pool is set to
> 'failmode=continue'.  If I start a large streaming write to the pool and
> then take down both iSCSI interfaces on both targets (making all disks
> in the pool completely unavailable), OmniOS panics after a couple of
> minutes. Fortunately this doesn't happen if only a single target becomes
> inaccessible.

By "pointing my finger into the sky" I might guesstimate that since you
have some streaming writes and they do go on, some buffer space becomes
exhausted (perhaps the hanging ZIOs waiting for the storage backends to
come back). I would expect the write()'s to not return and thus throttle
the clients from pushing more data, but perhaps there are enough client
threads trying to write that their maximum buffer spaces combined would
overwhelm the particular server.

In short: when reproducing the bug, try something like "vmstat 1" in a
separate SSH shell, to see if your available memory plummets when you
disconnect the devices and/or the "sr" (scanrate, search for swapping)
increases substantially.

HTH,
//Jim Klimov


From cks at cs.toronto.edu  Sat Mar  8 21:31:16 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Sat, 08 Mar 2014 16:31:16 -0500
Subject: [OmniOS-discuss] Bug: OmniOS r151008j terminates iSCSI
	initiator too early in shutdown
In-Reply-To: Your message of Sat, 08 Mar 2014 12:50:07 +0100.
	<531B03EF.7070501@cos.ru>
Message-ID: <20140308213116.740201A03A0@apps0.cs.toronto.edu>

| On 2014-03-07 21:49, Chris Siebenmann wrote:
| >   In at least OmniOS r151008j, the iSCSI initiator and thus any iSCSI
| > disks it has established are shut down relatively early during a shutdown
| > or reboot. In specific they are terminated before halt et al runs
| > '/sbin/bootadm -ea update_all' (in halt.c's do_archives_update()).
| > Under some circumstances this will cause system shutdown to hang.
[...]
| >   Presumably either bootadm should be run earlier or iSCSI initiator
| > shutdown should happen later or both.
| 
| I guess you can control the order of shutdown procedures with
| SMF dependencies. In particular, it might be helpful to ensure
| that your system completely exports the remote-hosted pools
| before disabling iSCSI (and possibly networking, etc.).

 Unfortunately exporting pools on shutdown is an ugly and potentially
fragile workaround with a number of side effects (and one that was not
necessary on Solaris 10).

 As far as I can tell from simply looking at things right now, even
an orderly shutdown on an OmniOS system will not avoid this. I don't see
anything that inactivates pools[*] or even unmounts ZFS filesystems even
in an orderly shutdown. And SMF shutdown procedures are deliberately
bypassed if you just run 'reboot' (it's in the manpage if you read all
the way to the end and ignore the fact that it doesn't talk about SMF).

	- cks
[*: partly because there is no user-level way to do this as far as I
    know. Explicitly exporting a pool is a different thing.]

From cks at cs.toronto.edu  Sat Mar  8 21:35:37 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Sat, 08 Mar 2014 16:35:37 -0500
Subject: [OmniOS-discuss] Bug: OmniOS r151008j terminates iSCSI
	initiator too early in shutdown
In-Reply-To: cks's message of Sat, 08 Mar 2014 16:31:16 -0500.
	<20140308213116.740201A03A0@apps0.cs.toronto.edu>
Message-ID: <20140308213537.A2A1C1A03A0@apps0.cs.toronto.edu>

I wrote:
|  As far as I can tell from simply looking at things right now, even
| an orderly shutdown on an OmniOS system will not avoid this.

 I should clarify that: 'on a normal, stock setup OmniOS system'.
You can of course add SMF jobs to import and export ZFS pools and then
shim them into the dependency order, but an out of the box OmniOS system
does not do this right.

(I'm not convinced it's ever possible to do it right without the
administrator having to explicitly configure things, but maybe there's a
way to make all of the magic work if the right SMF jobs were present by
default.)

	- cks

From richard.elling at richardelling.com  Sun Mar  9 02:28:21 2014
From: richard.elling at richardelling.com (Richard Elling)
Date: Sat, 8 Mar 2014 18:28:21 -0800
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: <20140305213809.3DC421A0826@apps0.cs.toronto.edu>
References: <20140305213809.3DC421A0826@apps0.cs.toronto.edu>
Message-ID: <15D16E14-A80D-47B9-94AB-427A2F9D7BBA@RichardElling.com>


On Mar 5, 2014, at 1:38 PM, Chris Siebenmann <cks at cs.toronto.edu> wrote:

> It turns out that there is an unpleasant consequence to syseventd being
> willing to mount ZFS filesystems for additional pools before the 'zfs
> mount -a' has run: you can get unresolvable mount conflicts in some
> situations.

The basic problem affects other file systems, too. The general best practice
has always been to keep your hierarchy flat. But...

> 
> Suppose that you have /opt as a separate ZFS filesystem in your
> root pool and you also have /opt/bigthing as a ZFS filesystem in
> a second pool. You can set this up and everything looks right, but
> if you reboot and syseventd beats 'zfs mount -a' for whatever reasons,
> you get an explosion:
> 
> - we start with no additional filesystems mounted, including /opt
> - syseventd grabs the second pool, starts mounting things, and
>  mounts /opt/bigthing on the *bare* root filesystem, making /opt
>  (if necessary) in the process.
> - 'zfs mount -a' reaches /opt and attempts to mount it. However,
>  because syseventd has already mounted /opt/bigthing, /opt is not
>  empty. FAILURE.
> 
> As far as I can tell there is no particularly good cure for this.  To
> me it really looks like syseventd should either not be started before
> fs-local (although I don't know if anything breaks if its startup is
> deferred) or that it should not be mounting ZFS filesystems (although I
> can half-see the attraction of it doing so).

... a fix would necessitate building a multi-pool dependency tree. Where 
would this live?

How about if we put it in /etc?

This is effectively what vfstab does, though in a more simplistic manner: it 
simply sorts the list of file systems and mounts the short path first. The difference
between vfstab and ZFS automatic mounts is that the former can be multi-pool
aware, even if it doesn't know anything about pools at all. Hence the "solution"
is ZFS mountpoint=legacy and use vfstab.
 -- richard

--

Richard.Elling at RichardElling.com
+1-760-896-4422


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140308/d9a95191/attachment-0001.html>

From cks at cs.toronto.edu  Sun Mar  9 02:56:34 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Sat, 08 Mar 2014 21:56:34 -0500
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: richard.elling's message of Sat, 08 Mar 2014 18:28:21 -0800.
	<15D16E14-A80D-47B9-94AB-427A2F9D7BBA@RichardElling.com>
Message-ID: <20140309025634.C81041A03A0@apps0.cs.toronto.edu>

| On Mar 5, 2014, at 1:38 PM, Chris Siebenmann <cks at cs.toronto.edu> wrote:
| > It turns out that there is an unpleasant consequence to syseventd
| > being willing to mount ZFS filesystems for additional pools before
| > the 'zfs mount -a' has run: you can get unresolvable mount conflicts
| > in some situations.
[...]
Richard Elling:
| ... a fix would necessitate building a multi-pool dependency
| tree. Where would this live?

 The thing is that ZFS already has a multi-pool dependency that works
perfectly well in this situation. 'zfs mount -a' processess all pools
at once and sorts the mount list so that /opt will be mounted before
/opt/bigthing. What makes this not work is that syseventd is willing
to mount filesystems from non-root pools before the rpool mounts have
completed (and also I believe to do pool mounts on a pool by pool basis).

 At a minimum I believe that syseventd should not be mounting
filesystems from non-rpool pools before all rpool mounts have
completed. I would prefer that syseventd not do mounts at all before
/system/filesystem/local finishes.

(You cannot in general defer syseventd until afterwards because there
are a number of dependencies in SMF today that I assume are there
for good reason. I have actually inventoried these in the process of
relocating syseventd to after fs-local so I can provide a list if people
want.[*])

	- cks
[*: This is where I wish SMF had a way to report the full dependency
    graph in one go in some format, so you did not have to play
    whack-a-mole when doing this sort of thing and also potentially
    blow up your system.]

From jimklimov at cos.ru  Mon Mar 10 15:13:06 2014
From: jimklimov at cos.ru (Jim Klimov)
Date: Mon, 10 Mar 2014 16:13:06 +0100
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: <20140309025634.C81041A03A0@apps0.cs.toronto.edu>
References: <20140309025634.C81041A03A0@apps0.cs.toronto.edu>
Message-ID: <531DD682.8000507@cos.ru>

On 2014-03-09 03:28, Richard Elling wrote:> The basic problem affects 
other file systems, too. The general best practice
 > has always been to keep your hierarchy flat. But...

That is a strange best practice, especially given that ZFS allows
and markets the ability of hierarchical datasets. But at least in
this case, this is irrelevant since Chris's setup used datasets
living just under the pool's root. Flatter than that is a private
pool per user, which is not quite the promoted ZFS way ;)

On 2014-03-09 03:56, Chris Siebenmann wrote:
> [*: This is where I wish SMF had a way to report the full dependency
>      graph in one go in some format, so you did not have to play
>      whack-a-mole when doing this sort of thing and also potentially
>      blow up your system.]

This one immediately came to mind:
"SMF Dependency Graph Generator"
https://java.net/projects/scfdot/pages/Home
https://java.net/projects/scfdot/sources/scfdot-src/show

I am not sure how alive or functional this project is today, and on
OmniOS (or any other non-Oracle distro) in particular. But IMHO it
is the best fit to your question (says so on the label ;) ).

//Jim


From richard.elling at richardelling.com  Mon Mar 10 15:56:45 2014
From: richard.elling at richardelling.com (Richard Elling)
Date: Mon, 10 Mar 2014 08:56:45 -0700
Subject: [OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?
In-Reply-To: <531DD682.8000507@cos.ru>
References: <20140309025634.C81041A03A0@apps0.cs.toronto.edu>
	<531DD682.8000507@cos.ru>
Message-ID: <7378CB27-D4A8-415C-A70A-C5668396C88A@RichardElling.com>


> On Mar 10, 2014, at 8:13 AM, Jim Klimov <jimklimov at cos.ru> wrote:
> 
> On 2014-03-09 03:28, Richard Elling wrote:> The basic problem affects other file systems, too. The general best practice
> > has always been to keep your hierarchy flat. But...
> 
> That is a strange best practice, especially given that ZFS allows
> and markets the ability of hierarchical datasets.

Hierarchial datasets work well. The problems occur with hierarchial pools.
 -- richard

> But at least in
> this case, this is irrelevant since Chris's setup used datasets
> living just under the pool's root. Flatter than that is a private
> pool per user, which is not quite the promoted ZFS way ;)
> 
>> On 2014-03-09 03:56, Chris Siebenmann wrote:
>> [*: This is where I wish SMF had a way to report the full dependency
>>     graph in one go in some format, so you did not have to play
>>     whack-a-mole when doing this sort of thing and also potentially
>>     blow up your system.]
> 
> This one immediately came to mind:
> "SMF Dependency Graph Generator"
> https://java.net/projects/scfdot/pages/Home
> https://java.net/projects/scfdot/sources/scfdot-src/show
> 
> I am not sure how alive or functional this project is today, and on
> OmniOS (or any other non-Oracle distro) in particular. But IMHO it
> is the best fit to your question (says so on the label ;) ).
> 
> //Jim
> 
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

From cks at cs.toronto.edu  Mon Mar 10 16:10:40 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Mon, 10 Mar 2014 12:10:40 -0400
Subject: [OmniOS-discuss] Reproducible r151008j kernel crash with ZFS
	pools on iSCSI
In-Reply-To: Your message of Sat, 08 Mar 2014 18:11:36 +0100.
	<531B4F48.7060005@cos.ru>
Message-ID: <20140310161040.9263E1A053B@apps0.cs.toronto.edu>

| In short: when reproducing the bug, try something like "vmstat 1" in a
| separate SSH shell, to see if your available memory plummets when you
| disconnect the devices and/or the "sr" (scanrate, search for swapping)
| increases substantially.

 'vmstat 1' shows no sign of this. sr is flatlined at zero all through
and free is basically frozen (with roughly 39 GB free[*]). I also have
live monitoring of the user-level write rate of the IO source and it
stalls relatively early on in the process.

 To the extent that I can see anything from the call stack in the
panics, it really looks to me as if something is overrunning a kernel
stack size limit for some reason.

	- cks
[*: this is a 64 GB machine.]

From jimklimov at cos.ru  Mon Mar 10 16:37:35 2014
From: jimklimov at cos.ru (Jim Klimov)
Date: Mon, 10 Mar 2014 17:37:35 +0100
Subject: [OmniOS-discuss] Reproducible r151008j kernel crash with ZFS
 pools on iSCSI
In-Reply-To: <20140310161040.9263E1A053B@apps0.cs.toronto.edu>
References: <20140310161040.9263E1A053B@apps0.cs.toronto.edu>
Message-ID: <531DEA4F.2000709@cos.ru>

On 2014-03-10 17:10, Chris Siebenmann wrote:
>   To the extent that I can see anything from the call stack in the
> panics, it really looks to me as if something is overrunning a kernel
> stack size limit for some reason.

Also, just to be sure, you don't do anything non-standard, like ZFS
blocks over 128KB in size? There were experiments toward this which
according to the lists could lead to some overflow.
Probably not your case, but popped out in my head by some analogy ;)

//Jim


From cks at cs.toronto.edu  Mon Mar 10 16:41:35 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Mon, 10 Mar 2014 12:41:35 -0400
Subject: [OmniOS-discuss] Reproducible r151008j kernel crash with ZFS
	pools on iSCSI
In-Reply-To: jimklimov's message of Mon, 10 Mar 2014 17:37:35 +0100.
	<531DEA4F.2000709@cos.ru>
Message-ID: <20140310164135.2E8861A053B@apps0.cs.toronto.edu>

| On 2014-03-10 17:10, Chris Siebenmann wrote:
| >   To the extent that I can see anything from the call stack in the
| > panics, it really looks to me as if something is overrunning a
| > kernel stack size limit for some reason.
|
| Also, just to be sure, you don't do anything non-standard, like ZFS
| blocks over 128KB in size? There were experiments toward this which
| according to the lists could lead to some overflow.  Probably not your
| case, but popped out in my head by some analogy ;)

 I've got nothing unusual in this way; the pool setup is plain and
ordinary. The pools are on 4k sector iSCSI disks (that are being
reported that way and the pool vdevs are ashift=12).

	- cks

From cj.keist at colostate.edu  Tue Mar 11 19:21:30 2014
From: cj.keist at colostate.edu (CJ Keist)
Date: Tue, 11 Mar 2014 13:21:30 -0600
Subject: [OmniOS-discuss] VirtIO drivers??
Message-ID: <531F623A.9080800@colostate.edu>

I saw question asked on this discussion list but no answer was given. Is 
there VirtIO driver for OmniOS? Wanting to run OmniOS on proxmox KVM 
with VirtIO for network driver.


-- 
C. J. Keist                     Email: cj.keist at colostate.edu
Systems Group Manager           Solaris 10 OS (SAI)
Engineering Network Services    Phone: 970-491-0630
College of Engineering, CSU     Fax:   970-491-5569
Ft. Collins, CO 80523-1301

All I want is a chance to prove 'Money can't buy happiness'

From danmcd at omniti.com  Tue Mar 11 19:43:22 2014
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 11 Mar 2014 15:43:22 -0400
Subject: [OmniOS-discuss] VirtIO drivers??
In-Reply-To: <531F623A.9080800@colostate.edu>
References: <531F623A.9080800@colostate.edu>
Message-ID: <190CFDCB-B118-422E-82D9-11A3ABE860EA@omniti.com>


On Mar 11, 2014, at 3:21 PM, CJ Keist <cj.keist at colostate.edu> wrote:

> I saw question asked on this discussion list but no answer was given. Is there VirtIO driver for OmniOS? Wanting to run OmniOS on proxmox KVM with VirtIO for network driver.

Check the illumos-nexenta repo:

	github.com/Nexenta/illumos-nexenta

They're further along on this front than upstream.  You'd be a hero if you tested it publically and upstreamed it!

Dan


From sim.ple at live.nl  Thu Mar 13 10:54:16 2014
From: sim.ple at live.nl (Randy S)
Date: Thu, 13 Mar 2014 11:54:16 +0100
Subject: [OmniOS-discuss] kayak problems
Message-ID: <DUB124-W1079E2B7FFAAC1E975A29EE0710@phx.gbl>

Hi,
I'm new to omnios and was testing to see how the kayak system works.
Just installed the latest stable omnios and followed the online standard
instructions to install kayak. Website reachable, everything seems fine.
Working image=r151008 en configuration file created. (Just to make my info complete).

I have noticed that more people have had problems with it and have I the idea that 
somehow they solved them with the help of your forum.
For me however, the suggestions didn't work for me ... yet.

The thing is that when I start a workstation to be installed by kayak, the process fails
because (I guess) the miniroot is missing libidn.so.11.6.11

What I did: (taken from http://comments.gmane.org/gmane.os.omnios.general/1660)
# gzip -d miniroot.gz
# cp miniroot /tmp
# mkdir /mnt/test
# mount -o nologging `lofiadm -a /tmp/miniroot` /mnt/test/

I then tried to copy the lib to the /mnt/test but could not because there is no space left
on the device. 
Can anybody please tell me how to solve this? 

Thanks in advance

Greetings

Randy
 		 	   		  
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140313/227aa96b/attachment-0001.html>

From doug at will.to  Thu Mar 13 14:07:02 2014
From: doug at will.to (Doug Hughes)
Date: Thu, 13 Mar 2014 10:07:02 -0400
Subject: [OmniOS-discuss] kayak problems
In-Reply-To: <DUB124-W1079E2B7FFAAC1E975A29EE0710@phx.gbl>
References: <DUB124-W1079E2B7FFAAC1E975A29EE0710@phx.gbl>
Message-ID: <92f4622d-02b8-419d-974d-8a65ee09bfae.maildroid@localhost>

Yes, it is missing a few key libraries, I reported that in a similar email back around december but it seems to have gone unfixed in the interim. If you find my posts, it indicates the things that I found and fixed. Luckily, it is easy to uncompress, mount and fix the miniroot with the missing libraries and symlinks.

Sent from my android device.

-----Original Message-----
From: Randy S <sim.ple at live.nl>
To: "omnios-discuss at lists.omniti.com" <omnios-discuss at lists.omniti.com>
Sent: Thu, 13 Mar 2014 7:00 AM
Subject: [OmniOS-discuss] kayak problems

Hi,
I'm new to omnios and was testing to see how the kayak system works.
Just installed the latest stable omnios and followed the online standard
instructions to install kayak. Website reachable, everything seems fine.
Working image=r151008 en configuration file created. (Just to make my info complete).

I have noticed that more people have had problems with it and have I the idea that 
somehow they solved them with the help of your forum.
For me however, the suggestions didn't work for me ... yet.

The thing is that when I start a workstation to be installed by kayak, the process fails
because (I guess) the miniroot is missing libidn.so.11.6.11

What I did: (taken from http://comments.gmane.org/gmane.os.omnios.general/1660)
# gzip -d miniroot.gz
# cp miniroot /tmp
# mkdir /mnt/test
# mount -o nologging `lofiadm -a /tmp/miniroot` /mnt/test/

I then tried to copy the lib to the /mnt/test but could not because there is no space left
on the device. 
Can anybody please tell me how to solve this? 

Thanks in advance

Greetings

Randy
 		 	   		  
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140313/5bf7fa5d/attachment.html>

From esproul at omniti.com  Thu Mar 13 14:16:56 2014
From: esproul at omniti.com (Eric Sproul)
Date: Thu, 13 Mar 2014 10:16:56 -0400
Subject: [OmniOS-discuss] kayak problems
In-Reply-To: <92f4622d-02b8-419d-974d-8a65ee09bfae.maildroid@localhost>
References: <DUB124-W1079E2B7FFAAC1E975A29EE0710@phx.gbl>
	<92f4622d-02b8-419d-974d-8a65ee09bfae.maildroid@localhost>
Message-ID: <CA+QY2RTsTeE2HD6A5R1g6dkJC+yhMtBCr2EPL56NiCJim7_ptg@mail.gmail.com>

On Thu, Mar 13, 2014 at 10:07 AM, Doug Hughes <doug at will.to> wrote:
> Yes, it is missing a few key libraries, I reported that in a similar email
> back around december but it seems to have gone unfixed in the interim. If
> you find my posts, it indicates the things that I found and fixed. Luckily,
> it is easy to uncompress, mount and fix the miniroot with the missing
> libraries and symlinks.

It's been fixed in the code.  Updated packages will likely be coming
soon, but if you're impatient, you can build it.

https://github.com/omniti-labs/kayak/commit/3eb2021a8bd5cd0e52fc34c7520ccf98a2ad6aa5

Eric

From doug at will.to  Thu Mar 13 14:16:48 2014
From: doug at will.to (Doug Hughes)
Date: Thu, 13 Mar 2014 10:16:48 -0400
Subject: [OmniOS-discuss] kayak problems
In-Reply-To: <CA+QY2RTsTeE2HD6A5R1g6dkJC+yhMtBCr2EPL56NiCJim7_ptg@mail.gmail.com>
References: <DUB124-W1079E2B7FFAAC1E975A29EE0710@phx.gbl>
	<92f4622d-02b8-419d-974d-8a65ee09bfae.maildroid@localhost>
	<CA+QY2RTsTeE2HD6A5R1g6dkJC+yhMtBCr2EPL56NiCJim7_ptg@mail.gmail.com>
Message-ID: <5fcb2063-606c-4542-8daa-88170197ecf4.maildroid@localhost>

Thanls, Eric!

Sent from my android device.

-----Original Message-----
From: Eric Sproul <esproul at omniti.com>
To: Doug Hughes <doug at will.to>
Cc: "omnios-discuss at lists.omniti.com" <omnios-discuss at lists.omniti.com>, Randy
 S <sim.ple at live.nl>
Sent: Thu, 13 Mar 2014 10:16 AM
Subject: Re: [OmniOS-discuss] kayak problems

On Thu, Mar 13, 2014 at 10:07 AM, Doug Hughes <doug at will.to> wrote:
> Yes, it is missing a few key libraries, I reported that in a similar email
> back around december but it seems to have gone unfixed in the interim. If
> you find my posts, it indicates the things that I found and fixed. Luckily,
> it is easy to uncompress, mount and fix the miniroot with the missing
> libraries and symlinks.

It's been fixed in the code.  Updated packages will likely be coming
soon, but if you're impatient, you can build it.

https://github.com/omniti-labs/kayak/commit/3eb2021a8bd5cd0e52fc34c7520ccf98a2ad6aa5

Eric
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140313/0619b9eb/attachment.html>

From alex.ranskis at gmail.com  Mon Mar 17 22:22:19 2014
From: alex.ranskis at gmail.com (Alex)
Date: Mon, 17 Mar 2014 23:22:19 +0100
Subject: [OmniOS-discuss] kayak problems
In-Reply-To: <CA+QY2RTsTeE2HD6A5R1g6dkJC+yhMtBCr2EPL56NiCJim7_ptg@mail.gmail.com>
References: <DUB124-W1079E2B7FFAAC1E975A29EE0710@phx.gbl>
	<92f4622d-02b8-419d-974d-8a65ee09bfae.maildroid@localhost>
	<CA+QY2RTsTeE2HD6A5R1g6dkJC+yhMtBCr2EPL56NiCJim7_ptg@mail.gmail.com>
Message-ID: <CA+VdLjC+X=vek95dR-Oj+FjYhc2EzRQMiBGx624OdOJf2z03Cw@mail.gmail.com>

On 13 March 2014 15:16, Eric Sproul <esproul at omniti.com> wrote:

> On Thu, Mar 13, 2014 at 10:07 AM, Doug Hughes <doug at will.to> wrote:
> > Yes, it is missing a few key libraries, I reported that in a similar
> email
> > back around december but it seems to have gone unfixed in the interim. If
> > you find my posts, it indicates the things that I found and fixed.
> Luckily,
> > it is easy to uncompress, mount and fix the miniroot with the missing
> > libraries and symlinks.
>
> It's been fixed in the code.  Updated packages will likely be coming
> soon, but if you're impatient, you can build it.
>
>
> https://github.com/omniti-labs/kayak/commit/3eb2021a8bd5cd0e52fc34c7520ccf98a2ad6aa5


I've also had issues with disk_help.sh, if using '<' or '>' to match for a
specific disk size.
Caused by :

    size=`prtvtoc $rdsk 2>/dev/null | awk '/bytes\/sector/{bps=$2}
/sectors\/cylinder/{bpc=bps*$2} /accessible sectors/{print
($2*bps)/1048576;} /accessible cylinders/{print int(($2*bpc)/1048576);}'`

awk will switch to scientific notation for large values and bash will fail
later while comparing that value to the one provided in the configuration.

switching from print to printf("%.0f", ..) fixed it.

Apologies if this has already been reported
Cheers,

-- 
alex
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140317/f1b81d2f/attachment.html>

From lotheac at iki.fi  Tue Mar 18 15:58:45 2014
From: lotheac at iki.fi (Lauri Tirkkonen)
Date: Tue, 18 Mar 2014 17:58:45 +0200
Subject: [OmniOS-discuss] pkgsend generate bug with spaces in file names
Message-ID: <20140318155845.GC21841@gutsman.lotheac.fi>

I ran into this while packaging Python 3.4.0 which apparently ships
files with space characters in them.

    % mkdir foo && touch 'foo/bar baz'
    % pkgsend generate foo | pkgmogrify
    pkgmogrify: File <stdin> line 1: Malformed action at position: 12:
        file bar baz group=bin mode=0644 owner=root path="bar baz"
                    ^

http://docs.oracle.com/cd/E26502_01/html/E21383/pkgcreate.html#gludq
suggests this has has possibly been fixed in Oracle's pkg.

-- 
Lauri Tirkkonen | +358 50 5341376 | lotheac @ IRCnet

From svavar at januar.is  Wed Mar 19 10:37:34 2014
From: svavar at januar.is (=?ISO-8859-1?Q?Svavar_=D6rn_Eysteinsson?=)
Date: Wed, 19 Mar 2014 10:37:34 +0000
Subject: [OmniOS-discuss] Trying to upgrade from r151006...
Message-ID: <CAEc-0iVry=u1547U50M00FJRAcUy3yrea_fTLvAhcfCr0+4eLg@mail.gmail.com>

Hello list.

I have a HP microserver that I have installed OmniOS on lately last year.
I havn't powered up the server for some time until yesterday.

Having some trouble upgrading the OS to the newest through the pkg command.

Current version is : OmniOS v11 r151006

pkg publishers configured :

omnios                                origin   online
http://pkg.omniti.com/omnios/release/

Now when I issue a pkg update -nv command I will receive the following
errors :


root at blackbox:~# pkg update -nv
Creating Plan |
pkg update: No solution was found to satisfy constraints
Plan Creation: Package solver has not found a solution to update to latest
available versions.
This may indicate an overly constrained set of packages are installed.

latest incorporations:

  pkg://omnios/consolidation/osnet/osnet-incorporation at 0.5.11
,5.11-0.151008:20131204T022427Z
  pkg://omnios/incorporation/jeos/omnios-userland at 11
,5.11-0.151008:20131206T160517Z
  pkg://omnios/entire at 11,5.11-0.151008:20131205T195242Z
  pkg://omnios/incorporation/jeos/illumos-gate at 11
,5.11-0.151008:20131204T024149Z

The following indicates why the system cannot update to the latest version:

  No suitable version of required package
pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151006:20140109T172403Z
found:
    Reject:  pkg://omnios/incorporation/jeos/omnios-userland at 11
,5.11-0.151006:20140109T172403Z
    Reason:  A version for 'incorporate' dependency on
pkg:/library/python-2/python-extra-26 at 0.5.11,5.11-0.151006 cannot be found
  No suitable version of required package
pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151006:20140113T224931Z
found:
    Reject:  pkg://omnios/incorporation/jeos/omnios-userland at 11
,5.11-0.151006:20140113T224931Z
    Reason:  A version for 'incorporate' dependency on
pkg:/library/python-2/python-extra-26 at 0.5.11,5.11-0.151006 cannot be found
  No suitable version of required package
pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151006:20140203T190027Z
found:
    Reject:  pkg://omnios/incorporation/jeos/omnios-userland at 11
,5.11-0.151006:20140203T190027Z
    Reason:  A version for 'incorporate' dependency on
pkg:/library/python-2/python-extra-26 at 0.5.11,5.11-0.151006 cannot be found


Does anyone know that the heck is going on ?

I have followed the procedures on :
http://omnios.omniti.com/wiki.php/Upgrade_r151006_r151008
but, surely when I issue pkg update command I will get these
errors/notification above.

Thanks in advance.

Best regards,

Svavar O - Reykjavik - Iceland
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140319/34553284/attachment-0001.html>

From svavar at januar.is  Wed Mar 19 16:53:11 2014
From: svavar at januar.is (=?ISO-8859-1?Q?Svavar_=D6rn_Eysteinsson?=)
Date: Wed, 19 Mar 2014 16:53:11 +0000
Subject: [OmniOS-discuss] Trying to upgrade from r151006...
In-Reply-To: <CAEc-0iVry=u1547U50M00FJRAcUy3yrea_fTLvAhcfCr0+4eLg@mail.gmail.com>
References: <CAEc-0iVry=u1547U50M00FJRAcUy3yrea_fTLvAhcfCr0+4eLg@mail.gmail.com>
Message-ID: <CAEc-0iU2qwrGyeG=ZB3egZL1FFS9nGwsHfOdMCec_JdCdnJQLg@mail.gmail.com>

I've managed to activate a older BE environment which is still a r151006
version.

When I issue a pkg update -nv it gives me the following packages to update :

root at blackbox:/tmp# pkg update -nv
            Packages to update:        11
     Estimated space available: 218.64 GB
Estimated space to be consumed: 153.88 MB
       Create boot environment:       Yes
     Activate boot environment:       Yes
Create backup boot environment:        No
          Rebuild boot archive:       Yes

Changed packages:
omnios
  developer/debug/mdb
    0.5.11,5.11-0.151006:20130731T194820Z ->
0.5.11,5.11-0.151006:20131019T183740Z
  driver/storage/mpt_sas
    0.5.11,5.11-0.151006:20130506T161108Z ->
0.5.11,5.11-0.151006:20130906T160306Z
  driver/storage/mr_sas
    0.5.11,5.11-0.151006:20130506T161108Z ->
0.5.11,5.11-0.151006:20130906T160306Z
  entire
    11,5.11-0.151006:20130507T204959Z -> 11,5.11-0.151006:20131210T224515Z
  incorporation/jeos/omnios-userland
    11,5.11-0.151006:20130716T202721Z -> 11,5.11-0.151006:20131030T205312Z
  library/python-2/python-extra-26
    0.5.11,5.11-0.151006:20130506T184813Z ->
0.5.11,5.11-0.151008:20131204T024250Z
  library/security/openssl
    1.0.1.5,5.11-0.151006:20130506T185419Z ->
1.0.1.6,5.11-0.151006:20140110T154549Z
  network/dns/bind
    9.9.2.2,5.11-0.151006:20130506T185915Z ->
9.9.3.2,5.11-0.151006:20130731T155125Z
  system/file-system/zfs
    0.5.11,5.11-0.151006:20130814T165834Z ->
0.5.11,5.11-0.151006:20131210T212000Z
  system/kernel
    0.5.11,5.11-0.151006:20130731T194843Z ->
0.5.11,5.11-0.151006:20131019T183804Z
perl.omniti.com
  omniti/perl/www-curl
    4.15,5.11-0.151002:20120807T165910Z ->
4.15,5.11-0.151006:20140312T201517Z


These are the publishers,

root at blackbox:/tmp# pkg publisher
PUBLISHER                             TYPE     STATUS   URI
omnios                                origin   online
http://pkg.omniti.com/omnios/release/
ms.omniti.com                         origin   online
http://pkg.omniti.com/omniti-ms/
perl.omniti.com                       origin   online
http://pkg.omniti.com/omniti-perl/


Is this the only updated files from 151006 to 151008 ?

Thanks in advance.


*SVAVAR ?RN EYSTEINSSON*Kerfisstj?ri
Gsm / mobile +354 862 1624
S?mi / tel +354 531 0101


*Jan?ar marka?sh?s*www.januar.is / Facebook<http://facebook.com/viderumjanuar>


On 19 March 2014 10:37, Svavar ?rn Eysteinsson <svavar at januar.is> wrote:

> Hello list.
>
> I have a HP microserver that I have installed OmniOS on lately last year.
> I havn't powered up the server for some time until yesterday.
>
> Having some trouble upgrading the OS to the newest through the pkg command.
>
> Current version is : OmniOS v11 r151006
>
> pkg publishers configured :
>
> omnios                                origin   online
> http://pkg.omniti.com/omnios/release/
>
> Now when I issue a pkg update -nv command I will receive the following
> errors :
>
>
> root at blackbox:~# pkg update -nv
> Creating Plan |
> pkg update: No solution was found to satisfy constraints
> Plan Creation: Package solver has not found a solution to update to latest
> available versions.
> This may indicate an overly constrained set of packages are installed.
>
> latest incorporations:
>
>   pkg://omnios/consolidation/osnet/osnet-incorporation at 0.5.11
> ,5.11-0.151008:20131204T022427Z
>   pkg://omnios/incorporation/jeos/omnios-userland at 11
> ,5.11-0.151008:20131206T160517Z
>   pkg://omnios/entire at 11,5.11-0.151008:20131205T195242Z
>   pkg://omnios/incorporation/jeos/illumos-gate at 11
> ,5.11-0.151008:20131204T024149Z
>
> The following indicates why the system cannot update to the latest version:
>
>   No suitable version of required package
> pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151006:20140109T172403Z
> found:
>     Reject:  pkg://omnios/incorporation/jeos/omnios-userland at 11
> ,5.11-0.151006:20140109T172403Z
>     Reason:  A version for 'incorporate' dependency on
> pkg:/library/python-2/python-extra-26 at 0.5.11,5.11-0.151006 cannot be found
>   No suitable version of required package
> pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151006:20140113T224931Z
> found:
>     Reject:  pkg://omnios/incorporation/jeos/omnios-userland at 11
> ,5.11-0.151006:20140113T224931Z
>     Reason:  A version for 'incorporate' dependency on
> pkg:/library/python-2/python-extra-26 at 0.5.11,5.11-0.151006 cannot be found
>   No suitable version of required package
> pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151006:20140203T190027Z
> found:
>     Reject:  pkg://omnios/incorporation/jeos/omnios-userland at 11
> ,5.11-0.151006:20140203T190027Z
>     Reason:  A version for 'incorporate' dependency on
> pkg:/library/python-2/python-extra-26 at 0.5.11,5.11-0.151006 cannot be found
>
>
> Does anyone know that the heck is going on ?
>
> I have followed the procedures on :
> http://omnios.omniti.com/wiki.php/Upgrade_r151006_r151008
> but, surely when I issue pkg update command I will get these
> errors/notification above.
>
> Thanks in advance.
>
> Best regards,
>
> Svavar O - Reykjavik - Iceland
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140319/58e2885f/attachment.html>

From mmabis at vmware.com  Fri Mar 21 15:46:27 2014
From: mmabis at vmware.com (Matthew Mabis)
Date: Fri, 21 Mar 2014 08:46:27 -0700 (PDT)
Subject: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in RaidZ2 or
 Create new Vol and clone
In-Reply-To: <CAEc-0iU2qwrGyeG=ZB3egZL1FFS9nGwsHfOdMCec_JdCdnJQLg@mail.gmail.com>
References: <CAEc-0iVry=u1547U50M00FJRAcUy3yrea_fTLvAhcfCr0+4eLg@mail.gmail.com>
	<CAEc-0iU2qwrGyeG=ZB3egZL1FFS9nGwsHfOdMCec_JdCdnJQLg@mail.gmail.com>
Message-ID: <1424115820.12054356.1395416787354.JavaMail.root@vmware.com>

Hey All, 

I am debating the idea of just swapping all my hard drives in my current 8x2TB RaidZ2 (all be it slowly) and let the environment resilver each drive than expand versus creating a new RaidZ2 on a different box and cloning the data over. 

Obviously i know of the Pros/Cons/Risks associated with that method. My question about debating deals with the new drives being 4K where as the old drives were 512b aligned My Current config is using (6x Hitachi HDS5C302 and 2x SAMSUNG HD203WI) where i will be switching over to ST4000VN000 drives all the way (purchased 4 already waiting a little time to see if i can purchase via a different batch [some ppl debate on this but to me its the way i have done it for a long time]) i don't wan't to us dissimilar models anymore as sometimes the samsung drives in this config went well lets call it NUTTY.... 

I use my environment for multiple things (Network Data Backups, NFS Backups for ESXi, Media Storage) my current environment is running down on space and with my projections ill run out of space within the next 6 months (~26% Free that includes the 1.08TB Reservation) so i am prepping for the transition. 

Just curious what you would do in my situation, replace the drives or build a new vDev and why? 

I have all the underlying hardware to handle it (SAS-2008 Controller, ECC, and a ZIL/SLOG. If needed i could use my infiniband backend to clone the data at 10Gb via IPoIB) 

Thanks 
Matt 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140321/205d8fa1/attachment.html>

From cks at cs.toronto.edu  Fri Mar 21 16:04:37 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Fri, 21 Mar 2014 12:04:37 -0400
Subject: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in
	RaidZ2 or Create new Vol and clone
In-Reply-To: mmabis's message of Fri, 21 Mar 2014 08:46:27 -0700.
	<1424115820.12054356.1395416787354.JavaMail.root@vmware.com>
Message-ID: <20140321160437.2C8B11A04E9@apps0.cs.toronto.edu>

| I am debating the idea of just swapping all my hard drives in my
| current 8x2TB RaidZ2 (all be it slowly) and let the environment
| resilver each drive than expand versus creating a new RaidZ2 on a
| different box and cloning the data over.
|
| Obviously i know of the Pros/Cons/Risks associated with that
| method. My question about debating deals with the new drives being 4K
| where as the old drives were 512b aligned [...]

 As far as I know there is no question here: you simply cannot put 4K
drives in a vdev originally created with 512b drives[*]. You need to
make a new pool with the 4K drives.

 Even if you could get them into the existing pool, the performance
effects would likely be relatively bad. ZFS does a lot of unaligned
writes.

	- cks
[*: If we're being technical, it's possible to force OmniOS to think
    that they're all 512b drives.
]

From mmabis at vmware.com  Fri Mar 21 16:34:18 2014
From: mmabis at vmware.com (Matthew Mabis)
Date: Fri, 21 Mar 2014 09:34:18 -0700 (PDT)
Subject: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in
 RaidZ2 or Create new Vol and clone
In-Reply-To: <20140321160437.2C8B11A04E9@apps0.cs.toronto.edu>
References: <20140321160437.2C8B11A04E9@apps0.cs.toronto.edu>
Message-ID: <996286416.12067302.1395419658589.JavaMail.root@vmware.com>

I know the drive itself does 512b emulation but i would rather run 4K if theres a performance increase!

thanks

Matt 


----- Original Message -----
From: "Chris Siebenmann" <cks at cs.toronto.edu>
To: "Matthew Mabis" <mmabis at vmware.com>
Cc: omnios-discuss at lists.omniti.com
Sent: Friday, March 21, 2014 9:04:37 AM
Subject: Re: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in RaidZ2 or Create new Vol and clone

| I am debating the idea of just swapping all my hard drives in my
| current 8x2TB RaidZ2 (all be it slowly) and let the environment
| resilver each drive than expand versus creating a new RaidZ2 on a
| different box and cloning the data over.
|
| Obviously i know of the Pros/Cons/Risks associated with that
| method. My question about debating deals with the new drives being 4K
| where as the old drives were 512b aligned [...]

 As far as I know there is no question here: you simply cannot put 4K
drives in a vdev originally created with 512b drives[*]. You need to
make a new pool with the 4K drives.

 Even if you could get them into the existing pool, the performance
effects would likely be relatively bad. ZFS does a lot of unaligned
writes.

	- cks
[*: If we're being technical, it's possible to force OmniOS to think
    that they're all 512b drives.
]

From tobi at oetiker.ch  Fri Mar 21 16:48:14 2014
From: tobi at oetiker.ch (Tobias Oetiker)
Date: Fri, 21 Mar 2014 17:48:14 +0100 (CET)
Subject: [OmniOS-discuss] zpool degraded while smart sais disks are OK
Message-ID: <alpine.DEB.2.02.1403211743260.3125@froburg.oetiker.ch>

a zpool on one of our boxes has been degraded with several disks
faulted ...

* the disks are all sas direct attached
* according to smartctl the offending disks have no faults.
* zfs decided to fault the disks after the events below.

I have now told the pool to clear the errors and it is resilvering the disks ... (in progress)

any idea what is happening here ?

Mar  2 22:21:51 foo scsi: [ID 243001 kern.warning] WARNING: /pci at 0,0/pci8086,3c04 at 2/pci1000,3020 at 0 (mpt_sas0):
Mar  2 22:21:51 foo         mptsas_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31170000
Mar  2 22:21:51 foo scsi: [ID 243001 kern.warning] WARNING: /pci at 0,0/pci8086,3c04 at 2/pci1000,3020 at 0 (mpt_sas0):
Mar  2 22:21:51 foo         mptsas_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31170000
Mar  2 22:21:51 foo scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,3c04 at 2/pci1000,3020 at 0 (mpt_sas0):
Mar  2 22:21:51 foo         Log info 0x31170000 received for target 11.
Mar  2 22:21:51 foo         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Mar  2 22:21:51 foo scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,3c04 at 2/pci1000,3020 at 0 (mpt_sas0):
Mar  2 22:21:51 foo         Log info 0x31170000 received for target 11.
Mar  2 22:21:51 foo         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc


Mar  5 02:20:53 foo scsi: [ID 243001 kern.warning] WARNING: /pci at 0,0/pci8086,3c06 at 2,2/pci1000,3020 at 0 (mpt_sas1):
Mar  5 02:20:53 foo         mptsas_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31170000
Mar  5 02:20:53 foo scsi: [ID 243001 kern.warning] WARNING: /pci at 0,0/pci8086,3c06 at 2,2/pci1000,3020 at 0 (mpt_sas1):
Mar  5 02:20:53 foo         mptsas_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31170000
Mar  5 02:20:53 foo scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,3c06 at 2,2/pci1000,3020 at 0 (mpt_sas1):
Mar  5 02:20:53 foo         Log info 0x31170000 received for target 10.
Mar  5 02:20:53 foo         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Mar  5 02:20:53 foo scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,3c06 at 2,2/pci1000,3020 at 0 (mpt_sas1):
Mar  5 02:20:53 foo         Log info 0x31170000 received for target 10.
Mar  5 02:20:53 foo         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
www.oetiker.ch tobi at oetiker.ch +41 62 775 9902
*** We are hiring IT staff: www.oetiker.ch/jobs ***

From cks at cs.toronto.edu  Fri Mar 21 17:04:50 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Fri, 21 Mar 2014 13:04:50 -0400
Subject: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in
	RaidZ2 or Create new Vol and clone
In-Reply-To: mmabis's message of Fri, 21 Mar 2014 09:34:18 -0700.
	<996286416.12067302.1395419658589.JavaMail.root@vmware.com>
Message-ID: <20140321170450.89A941A04E9@apps0.cs.toronto.edu>

| I know the drive itself does 512b emulation but i would rather run 4K
| if theres a performance increase!

 What matters for OmniOS is what the drive reports as. If it reports
honestly that it has a 4k physical sector size, ZFS will say 'nope!'
even if the drive will accept 512b reads and writes.

 This is a very unfortunate limitation these days since it's increasingly
hard to get drives that do not have 4k physical sector drives. But that's
life.

	- cks

From richard.elling at richardelling.com  Fri Mar 21 19:50:45 2014
From: richard.elling at richardelling.com (Richard Elling)
Date: Fri, 21 Mar 2014 12:50:45 -0700
Subject: [OmniOS-discuss] zpool degraded while smart sais disks are OK
In-Reply-To: <alpine.DEB.2.02.1403211743260.3125@froburg.oetiker.ch>
References: <alpine.DEB.2.02.1403211743260.3125@froburg.oetiker.ch>
Message-ID: <39B55A5A-AA04-4C56-8A74-5B9316861405@RichardElling.com>


On Mar 21, 2014, at 9:48 AM, Tobias Oetiker <tobi at oetiker.ch> wrote:

> a zpool on one of our boxes has been degraded with several disks
> faulted ...
> 
> * the disks are all sas direct attached
> * according to smartctl the offending disks have no faults.
> * zfs decided to fault the disks after the events below.
> 
> I have now told the pool to clear the errors and it is resilvering the disks ... (in progress)
> 
> any idea what is happening here ?
> 
> Mar  2 22:21:51 foo scsi: [ID 243001 kern.warning] WARNING: /pci at 0,0/pci8086,3c04 at 2/pci1000,3020 at 0 (mpt_sas0):
> Mar  2 22:21:51 foo         mptsas_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31170000
> Mar  2 22:21:51 foo scsi: [ID 243001 kern.warning] WARNING: /pci at 0,0/pci8086,3c04 at 2/pci1000,3020 at 0 (mpt_sas0):
> Mar  2 22:21:51 foo         mptsas_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31170000
> Mar  2 22:21:51 foo scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,3c04 at 2/pci1000,3020 at 0 (mpt_sas0):
> Mar  2 22:21:51 foo         Log info 0x31170000 received for target 11.
> Mar  2 22:21:51 foo         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Mar  2 22:21:51 foo scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,3c04 at 2/pci1000,3020 at 0 (mpt_sas0):
> Mar  2 22:21:51 foo         Log info 0x31170000 received for target 11.
> Mar  2 22:21:51 foo         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc

These are command aborted reports from the target device. You will see these every 60 seconds if the disk
is not responding and the subsequent reset of the disk aborts the commands that are not responding.
 -- richard

> 
> 
> Mar  5 02:20:53 foo scsi: [ID 243001 kern.warning] WARNING: /pci at 0,0/pci8086,3c06 at 2,2/pci1000,3020 at 0 (mpt_sas1):
> Mar  5 02:20:53 foo         mptsas_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31170000
> Mar  5 02:20:53 foo scsi: [ID 243001 kern.warning] WARNING: /pci at 0,0/pci8086,3c06 at 2,2/pci1000,3020 at 0 (mpt_sas1):
> Mar  5 02:20:53 foo         mptsas_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31170000
> Mar  5 02:20:53 foo scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,3c06 at 2,2/pci1000,3020 at 0 (mpt_sas1):
> Mar  5 02:20:53 foo         Log info 0x31170000 received for target 10.
> Mar  5 02:20:53 foo         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Mar  5 02:20:53 foo scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,3c06 at 2,2/pci1000,3020 at 0 (mpt_sas1):
> Mar  5 02:20:53 foo         Log info 0x31170000 received for target 10.
> Mar  5 02:20:53 foo         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> 
> -- 
> Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
> www.oetiker.ch tobi at oetiker.ch +41 62 775 9902
> *** We are hiring IT staff: www.oetiker.ch/jobs ***
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

--

Richard.Elling at RichardElling.com
+1-760-896-4422


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140321/64514169/attachment.html>

From zmalone at omniti.com  Fri Mar 21 20:23:40 2014
From: zmalone at omniti.com (Zach Malone)
Date: Fri, 21 Mar 2014 16:23:40 -0400
Subject: [OmniOS-discuss] zpool degraded while smart sais disks are OK
In-Reply-To: <39B55A5A-AA04-4C56-8A74-5B9316861405@RichardElling.com>
References: <alpine.DEB.2.02.1403211743260.3125@froburg.oetiker.ch>
	<39B55A5A-AA04-4C56-8A74-5B9316861405@RichardElling.com>
Message-ID: <CALLZ=vXbhBpBwvN5ninNy_ehYPujRa6WmFH3w_oze7pxX67Urw@mail.gmail.com>

On Fri, Mar 21, 2014 at 3:50 PM, Richard Elling
<richard.elling at richardelling.com> wrote:
>
> On Mar 21, 2014, at 9:48 AM, Tobias Oetiker <tobi at oetiker.ch> wrote:
>
> a zpool on one of our boxes has been degraded with several disks
> faulted ...
>
> * the disks are all sas direct attached
> * according to smartctl the offending disks have no faults.
> * zfs decided to fault the disks after the events below.
>
> I have now told the pool to clear the errors and it is resilvering the disks
> ... (in progress)
>
> any idea what is happening here ?

...

Did all the disks fault at the same time, or was it spread out over a
longer period?  I'd suspect your power supply or disk controller.
What are your zpool errors?

From tobi at oetiker.ch  Fri Mar 21 22:23:28 2014
From: tobi at oetiker.ch (Tobias Oetiker)
Date: Fri, 21 Mar 2014 23:23:28 +0100 (CET)
Subject: [OmniOS-discuss] zpool degraded while smart sais disks are OK
In-Reply-To: <CALLZ=vXbhBpBwvN5ninNy_ehYPujRa6WmFH3w_oze7pxX67Urw@mail.gmail.com>
References: <alpine.DEB.2.02.1403211743260.3125@froburg.oetiker.ch>
	<39B55A5A-AA04-4C56-8A74-5B9316861405@RichardElling.com>
	<CALLZ=vXbhBpBwvN5ninNy_ehYPujRa6WmFH3w_oze7pxX67Urw@mail.gmail.com>
Message-ID: <alpine.DEB.2.02.1403212322540.3125@froburg.oetiker.ch>

Today Zach Malone wrote:

> On Fri, Mar 21, 2014 at 3:50 PM, Richard Elling
> <richard.elling at richardelling.com> wrote:
> >
> > On Mar 21, 2014, at 9:48 AM, Tobias Oetiker <tobi at oetiker.ch> wrote:
> >
> > a zpool on one of our boxes has been degraded with several disks
> > faulted ...
> >
> > * the disks are all sas direct attached
> > * according to smartctl the offending disks have no faults.
> > * zfs decided to fault the disks after the events below.
> >
> > I have now told the pool to clear the errors and it is resilvering the disks
> > ... (in progress)
> >
> > any idea what is happening here ?
>
> ...
>
> Did all the disks fault at the same time, or was it spread out over a
> longer period?  I'd suspect your power supply or disk controller.
> What are your zpool errors?

it happened over time as you can see from the timestamps in the
log. The errors from zfs's point of view were 1 read and about 30 write

but according to smart the disks are without flaw

cheers
tobi
>
>

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
www.oetiker.ch tobi at oetiker.ch +41 62 775 9902
*** We are hiring IT staff: www.oetiker.ch/jobs ***

From richard.elling at richardelling.com  Fri Mar 21 23:37:50 2014
From: richard.elling at richardelling.com (Richard Elling)
Date: Fri, 21 Mar 2014 16:37:50 -0700
Subject: [OmniOS-discuss] zpool degraded while smart sais disks are OK
In-Reply-To: <alpine.DEB.2.02.1403212322540.3125@froburg.oetiker.ch>
References: <alpine.DEB.2.02.1403211743260.3125@froburg.oetiker.ch>
	<39B55A5A-AA04-4C56-8A74-5B9316861405@RichardElling.com>
	<CALLZ=vXbhBpBwvN5ninNy_ehYPujRa6WmFH3w_oze7pxX67Urw@mail.gmail.com>
	<alpine.DEB.2.02.1403212322540.3125@froburg.oetiker.ch>
Message-ID: <0D51CBC0-D049-4A12-A733-7DDB6320BD82@richardelling.com>


On Mar 21, 2014, at 3:23 PM, Tobias Oetiker <tobi at oetiker.ch> wrote:

> Today Zach Malone wrote:
> 
>> On Fri, Mar 21, 2014 at 3:50 PM, Richard Elling
>> <richard.elling at richardelling.com> wrote:
>>> 
>>> On Mar 21, 2014, at 9:48 AM, Tobias Oetiker <tobi at oetiker.ch> wrote:
>>> 
>>> a zpool on one of our boxes has been degraded with several disks
>>> faulted ...
>>> 
>>> * the disks are all sas direct attached
>>> * according to smartctl the offending disks have no faults.
>>> * zfs decided to fault the disks after the events below.
>>> 
>>> I have now told the pool to clear the errors and it is resilvering the disks
>>> ... (in progress)
>>> 
>>> any idea what is happening here ?
>> 
>> ...
>> 
>> Did all the disks fault at the same time, or was it spread out over a
>> longer period?  I'd suspect your power supply or disk controller.
>> What are your zpool errors?
> 
> it happened over time as you can see from the timestamps in the
> log. The errors from zfs's point of view were 1 read and about 30 write
> 
> but according to smart the disks are without flaw

Actually, SMART is pretty dumb. In most cases, it only looks for uncorrectable
errors that are related to media or heads. For a clue to more permanent errors,
you will want to look at the read/write error reports for errors that are 
corrected with possible delays. You can also look at the grown defects list.

This behaviour is expected for drives with errors that are not being quickly 
corrected or have firmware bugs (horrors!) and where the disk does not do TLER
(or its vendor's equivalent)
 -- richard


From tobi at oetiker.ch  Sat Mar 22 05:13:25 2014
From: tobi at oetiker.ch (Tobias Oetiker)
Date: Sat, 22 Mar 2014 06:13:25 +0100 (CET)
Subject: [OmniOS-discuss] zpool degraded while smart sais disks are OK
In-Reply-To: <0D51CBC0-D049-4A12-A733-7DDB6320BD82@richardelling.com>
References: <alpine.DEB.2.02.1403211743260.3125@froburg.oetiker.ch>
	<39B55A5A-AA04-4C56-8A74-5B9316861405@RichardElling.com>
	<CALLZ=vXbhBpBwvN5ninNy_ehYPujRa6WmFH3w_oze7pxX67Urw@mail.gmail.com>
	<alpine.DEB.2.02.1403212322540.3125@froburg.oetiker.ch>
	<0D51CBC0-D049-4A12-A733-7DDB6320BD82@richardelling.com>
Message-ID: <alpine.DEB.2.02.1403220606480.3125@froburg.oetiker.ch>

Yesterday Richard Elling wrote:

>
> On Mar 21, 2014, at 3:23 PM, Tobias Oetiker <tobi at oetiker.ch> wrote:

[...]
> >
> > it happened over time as you can see from the timestamps in the
> > log. The errors from zfs's point of view were 1 read and about 30 write
> >
> > but according to smart the disks are without flaw
>
> Actually, SMART is pretty dumb. In most cases, it only looks for uncorrectable
> errors that are related to media or heads. For a clue to more permanent errors,
> you will want to look at the read/write error reports for errors that are
> corrected with possible delays. You can also look at the grown defects list.
>
> This behaviour is expected for drives with errors that are not being quickly
> corrected or have firmware bugs (horrors!) and where the disk does not do TLER
> (or its vendor's equivalent)
>  -- richard

the error counters look like this:


Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:       3494        0         0      3494      44904        530.879           0
write:         0        0         0         0      39111       1793.323           0
verify:        0        0         0         0       8133          0.000           0

the disk vendor is HGST in case anyone has further ideas ... the system has 20 of these disks and the problems occured with
three of them. The system has been running fine for two months previously.

Vendor:               HGST
Product:              HUS724030ALS640
Revision:             A152
User Capacity:        3,000,592,982,016 bytes [3.00 TB]
Logical block size:   512 bytes
Serial number:        P8J20SNV
Device type:          disk
Transport protocol:   SAS

cheers
tobi
>
>

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
www.oetiker.ch tobi at oetiker.ch +41 62 775 9902
*** We are hiring IT staff: www.oetiker.ch/jobs ***

From bfriesen at simple.dallas.tx.us  Sat Mar 22 15:15:05 2014
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Sat, 22 Mar 2014 10:15:05 -0500 (CDT)
Subject: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in
 RaidZ2 or Create new Vol and clone
In-Reply-To: <996286416.12067302.1395419658589.JavaMail.root@vmware.com>
References: <20140321160437.2C8B11A04E9@apps0.cs.toronto.edu>
	<996286416.12067302.1395419658589.JavaMail.root@vmware.com>
Message-ID: <alpine.GSO.2.01.1403221005410.1735@freddy.simplesystems.org>

On Fri, 21 Mar 2014, Matthew Mabis wrote:

> I know the drive itself does 512b emulation but i would rather run 4K if theres a performance increase!

Does Illumos really have a "4k" path?  It is my impression that 
knowledge of "4k" influences offsets and allocated block sizes but 
that otherwise things are really still done in terms of 512 byte 
sectors.

A drive which can only support I/O in 4k sectors would not be very 
usable on most systems.

Regardless, I can not imagine why someone would want to replace 2TB 
drives with 4TB drives.  The resilver rate is no better with the 4TB 
drive than with the 2TB drive so the time to resilver is doubled and 
there are limits to what is tolerable.  I/O performance would not 
improve and in fact it may diminish with the larger drives.  It is 
much better to add more spindles to the pool (i.e. another raidz2 
vdev).

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From jimklimov at cos.ru  Sun Mar 23 14:49:11 2014
From: jimklimov at cos.ru (Jim Klimov)
Date: Sun, 23 Mar 2014 15:49:11 +0100
Subject: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in
 RaidZ2 or Create new Vol and clone
In-Reply-To: <alpine.GSO.2.01.1403221005410.1735@freddy.simplesystems.org>
References: <20140321160437.2C8B11A04E9@apps0.cs.toronto.edu>
	<996286416.12067302.1395419658589.JavaMail.root@vmware.com>
	<alpine.GSO.2.01.1403221005410.1735@freddy.simplesystems.org>
Message-ID: <fb98ca14-06fc-40d1-8e40-4095c7a6c694@email.android.com>

22 ????? 2014??. 16:15:05 CET, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> ?????:
>On Fri, 21 Mar 2014, Matthew Mabis wrote:
>
>> I know the drive itself does 512b emulation but i would rather run 4K
>if theres a performance increase!
>
>Does Illumos really have a "4k" path?  It is my impression that 
>knowledge of "4k" influences offsets and allocated block sizes but 
>that otherwise things are really still done in terms of 512 byte 
>sectors.
>
>A drive which can only support I/O in 4k sectors would not be very 
>usable on most systems.


Alas (or not), that's what does happen with "honest 4k native" disks - the minimal logical io request is 4k as well as the hardware sector size, unlike the 512e drives including those which do and don't honestly report the hardware sector size which can be used i.e. to influence better alignment of system data (fs headers, etc.)

In this 4k-native case, minimal zfs block size is 4k, with some consequences in slack data overheads, fragmentation, metadata-to-data ratios, etc. There may be more visible drawbacks to such allocation on raidz than on mirrors.

In case of 512e drives, the 512b sized blocks may be used, but writes cause RMW cycles in hardware, which may reduce reliability (theoretically - just another failure mode and bug nest in logical paths; no statistics to prove practical weaknesses) and performance (once said a 30% hit for random io).

Since many OSes and FSes use 4k clusters or blocks anyway, given proper alignment to avoid RMW, they don't care or notice - they long haven't used the smaller io sizes anyway.

>
>Regardless, I can not imagine why someone would want to replace 2TB 
>drives with 4TB drives.

Limited number of disk bays? ;)

>  The resilver rate is no better with the 4TB 
>drive than with the 2TB drive so the time to resilver is doubled and 
>there are limits to what is tolerable.  I/O performance would not 
>improve and in fact it may diminish with the larger drives.  It is 
>much better to add more spindles to the pool (i.e. another raidz2 
>vdev).
>
>Bob


--
Typos courtesy of K-9 Mail on my Samsung Android

From bfriesen at simple.dallas.tx.us  Sun Mar 23 15:53:38 2014
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Sun, 23 Mar 2014 10:53:38 -0500 (CDT)
Subject: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in
 RaidZ2 or Create new Vol and clone
In-Reply-To: <fb98ca14-06fc-40d1-8e40-4095c7a6c694@email.android.com>
References: <20140321160437.2C8B11A04E9@apps0.cs.toronto.edu>
	<996286416.12067302.1395419658589.JavaMail.root@vmware.com>
	<alpine.GSO.2.01.1403221005410.1735@freddy.simplesystems.org>
	<fb98ca14-06fc-40d1-8e40-4095c7a6c694@email.android.com>
Message-ID: <alpine.GSO.2.01.1403231049030.1735@freddy.simplesystems.org>

On Sun, 23 Mar 2014, Jim Klimov wrote:
>>
>> Regardless, I can not imagine why someone would want to replace 2TB
>> drives with 4TB drives.
>
> Limited number of disk bays? ;)

That would be the only reason.  The cost of replacing existing 2TB 
drives with 4TB drives seems pretty high.  Performace would only go 
down, and if the physical block size increases, then storage 
efficiency would decrease.  Disk bays are not necessarily all that 
expensive as long as there is a place nearby to put it.

An existing disk chassis could be replaced with one which supports 
more slots and the existing drives re-used as long as they are 
physically compatible with the new chassis.

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From doug at will.to  Sun Mar 23 17:02:54 2014
From: doug at will.to (Doug Hughes)
Date: Sun, 23 Mar 2014 13:02:54 -0400
Subject: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in
 RaidZ2 or Create new Vol and clone
In-Reply-To: <alpine.GSO.2.01.1403231049030.1735@freddy.simplesystems.org>
References: <20140321160437.2C8B11A04E9@apps0.cs.toronto.edu>
	<996286416.12067302.1395419658589.JavaMail.root@vmware.com>
	<alpine.GSO.2.01.1403221005410.1735@freddy.simplesystems.org>
	<fb98ca14-06fc-40d1-8e40-4095c7a6c694@email.android.com>
	<alpine.GSO.2.01.1403231049030.1735@freddy.simplesystems.org>
Message-ID: <532F13BE.4010204@will.to>

On 3/23/2014 11:53 AM, Bob Friesenhahn wrote:
> On Sun, 23 Mar 2014, Jim Klimov wrote:
>>>
>>> Regardless, I can not imagine why someone would want to replace 2TB
>>> drives with 4TB drives.
>>
>> Limited number of disk bays? ;)
>
> That would be the only reason.  The cost of replacing existing 2TB
> drives with 4TB drives seems pretty high.  Performace would only go
> down, and if the physical block size increases, then storage efficiency
> would decrease.  Disk bays are not necessarily all that expensive as
> long as there is a place nearby to put it.
>
> An existing disk chassis could be replaced with one which supports more
> slots and the existing drives re-used as long as they are physically
> compatible with the new chassis.
>
> Bob

or just they need a lot of capacity for e.g. video or audio media. The 
modest loss of performance would go roughly unnoticed. (home storage, 
for instance)

From jimklimov at cos.ru  Sun Mar 23 17:09:03 2014
From: jimklimov at cos.ru (Jim Klimov)
Date: Sun, 23 Mar 2014 18:09:03 +0100
Subject: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in
 RaidZ2 or Create new Vol and clone
In-Reply-To: <alpine.GSO.2.01.1403231049030.1735@freddy.simplesystems.org>
References: <20140321160437.2C8B11A04E9@apps0.cs.toronto.edu>
	<996286416.12067302.1395419658589.JavaMail.root@vmware.com>
	<alpine.GSO.2.01.1403221005410.1735@freddy.simplesystems.org>
	<fb98ca14-06fc-40d1-8e40-4095c7a6c694@email.android.com>
	<alpine.GSO.2.01.1403231049030.1735@freddy.simplesystems.org>
Message-ID: <b8e5749d-2666-4025-90b9-112328582e56@email.android.com>

23 ????? 2014??. 16:53:38 CET, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> ?????:
>On Sun, 23 Mar 2014, Jim Klimov wrote:
>>>
>>> Regardless, I can not imagine why someone would want to replace 2TB
>>> drives with 4TB drives.
>>
>> Limited number of disk bays? ;)
>
>That would be the only reason.  The cost of replacing existing 2TB 
>drives with 4TB drives seems pretty high.  Performace would only go 
>down, and if the physical block size increases, then storage 
>efficiency would decrease.  Disk bays are not necessarily all that 
>expensive as long as there is a place nearby to put it.
>
>An existing disk chassis could be replaced with one which supports 
>more slots and the existing drives re-used as long as they are 
>physically compatible with the new chassis.
>
>Bob

Engineering is a matter of compromise. Something good for one usecase is not suitable for another.

Consider the users of the popular HP Microserver series limited by 4-5 data disks. Consider the low-power rigs where more spindles might soon double the power draw (think of TCO over time vs. raw price of purchase). For a home nas peak performance might matter less than available volume, and even a "less efficient" storage in terms of slack space might be more efficient for mechanical performance by enforcing smaller fragmentation.

So... YMMV ;)

//Jim
--
Typos courtesy of K-9 Mail on my Samsung Android

From richard.elling at richardelling.com  Sun Mar 23 23:32:15 2014
From: richard.elling at richardelling.com (Richard Elling)
Date: Sun, 23 Mar 2014 16:32:15 -0700
Subject: [OmniOS-discuss] zpool degraded while smart sais disks are OK
In-Reply-To: <alpine.DEB.2.02.1403220606480.3125@froburg.oetiker.ch>
References: <alpine.DEB.2.02.1403211743260.3125@froburg.oetiker.ch>
	<39B55A5A-AA04-4C56-8A74-5B9316861405@RichardElling.com>
	<CALLZ=vXbhBpBwvN5ninNy_ehYPujRa6WmFH3w_oze7pxX67Urw@mail.gmail.com>
	<alpine.DEB.2.02.1403212322540.3125@froburg.oetiker.ch>
	<0D51CBC0-D049-4A12-A733-7DDB6320BD82@richardelling.com>
	<alpine.DEB.2.02.1403220606480.3125@froburg.oetiker.ch>
Message-ID: <DF7112ED-1B47-437F-98C7-5C72CB794349@richardelling.com>


On Mar 21, 2014, at 10:13 PM, Tobias Oetiker <tobi at oetiker.ch> wrote:

> Yesterday Richard Elling wrote:
> 
>> 
>> On Mar 21, 2014, at 3:23 PM, Tobias Oetiker <tobi at oetiker.ch> wrote:
> 
> [...]
>>> 
>>> it happened over time as you can see from the timestamps in the
>>> log. The errors from zfs's point of view were 1 read and about 30 write
>>> 
>>> but according to smart the disks are without flaw
>> 
>> Actually, SMART is pretty dumb. In most cases, it only looks for uncorrectable
>> errors that are related to media or heads. For a clue to more permanent errors,
>> you will want to look at the read/write error reports for errors that are
>> corrected with possible delays. You can also look at the grown defects list.
>> 
>> This behaviour is expected for drives with errors that are not being quickly
>> corrected or have firmware bugs (horrors!) and where the disk does not do TLER
>> (or its vendor's equivalent)
>> -- richard
> 
> the error counters look like this:
> 
> 
> Error counter log:
>           Errors Corrected by           Total   Correction     Gigabytes    Total
>               ECC          rereads/    errors   algorithm      processed    uncorrected
>           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
> read:       3494        0         0      3494      44904        530.879           0
> write:         0        0         0         0      39111       1793.323           0
> verify:        0        0         0         0       8133          0.000           0

Errors corrected without delay looks good. The problem lies elsewhere.

> 
> the disk vendor is HGST in case anyone has further ideas ... the system has 20 of these disks and the problems occured with
> three of them. The system has been running fine for two months previously.

...and yet there are aborted commands, likely due to a reset after a timeout.
Resets aren't issued without cause.

There are two different resets issued by the sd driver: LU and bus. If the
LU reset doesn't work, the resets are escalated to bus. This is, of course,
tunable, but is rarely tuned. A bus reset for SAS is a questionable practice,
since SAS is a fabric, not a bus. But the effect of a device in the fabric
being reset could be seen as aborted commands by more than one target. To
troubleshoot these cases, you need to look at all of the devices in the data
path and map the common causes: HBAs, expanders, enclosures, etc. Traverse
the devices looking for errors, as you did with the disks. Useful tools:
sasinfo, lsiutil/sas2ircu, smp_utils, sg3_utils, mpathadm, fmtopo.
 -- richard


> 
> Vendor:               HGST
> Product:              HUS724030ALS640
> Revision:             A152
> User Capacity:        3,000,592,982,016 bytes [3.00 TB]
> Logical block size:   512 bytes
> Serial number:        P8J20SNV
> Device type:          disk
> Transport protocol:   SAS
> 
> cheers
> tobi
>> 
>> 
> 
> -- 
> Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
> www.oetiker.ch tobi at oetiker.ch +41 62 775 9902
> *** We are hiring IT staff: www.oetiker.ch/jobs ***


From geoffn at gnaa.net  Mon Mar 24 23:13:13 2014
From: geoffn at gnaa.net (Geoff Nordli)
Date: Mon, 24 Mar 2014 16:13:13 -0700
Subject: [OmniOS-discuss] anyone using SaltStack
Message-ID: <5330BC09.1080500@gnaa.net>

Is anyone is using SaltStack (http://www.saltstack.com/) on OmniOS.

If so, how you are getting it installed?

thanks,

Geoff

From bfriesen at simple.dallas.tx.us  Tue Mar 25 01:45:20 2014
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Mon, 24 Mar 2014 20:45:20 -0500 (CDT)
Subject: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in
 RaidZ2 or Create new Vol and clone
In-Reply-To: <1424115820.12054356.1395416787354.JavaMail.root@vmware.com>
References: <CAEc-0iVry=u1547U50M00FJRAcUy3yrea_fTLvAhcfCr0+4eLg@mail.gmail.com>
	<CAEc-0iU2qwrGyeG=ZB3egZL1FFS9nGwsHfOdMCec_JdCdnJQLg@mail.gmail.com>
	<1424115820.12054356.1395416787354.JavaMail.root@vmware.com>
Message-ID: <alpine.GSO.2.01.1403242037050.1735@freddy.simplesystems.org>

On Fri, 21 Mar 2014, Matthew Mabis wrote:
> 
> Just curious what you would do in my situation, replace the drives or build a new vDev and why?

I would add a new vdev for three reasons:

  1) It is usually best to let sleeping dogs lie.
  2) Takes a whole lot less time.
  3) About twice the total performance.

Drawbacks are:

  1) More cost (but time is money).
  2) More hardware.
  3) More physical space.
  4) More power consumption.
  5) Imbalanced vdevs (in terms of space).

The imbalanced vdevs might be helped by a number of sends/receives to 
the same pool but this depends on how the filesystems are organized.

Bob
-- 
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From Rob at Logan.com  Tue Mar 25 04:01:32 2014
From: Rob at Logan.com (Rob Logan)
Date: Tue, 25 Mar 2014 00:01:32 -0400
Subject: [OmniOS-discuss] Debating Swapping 2TB with 4TB drives in
	RaidZ2 or Create new Vol and clone
In-Reply-To: <alpine.GSO.2.01.1403242037050.1735@freddy.simplesystems.org>
References: <CAEc-0iVry=u1547U50M00FJRAcUy3yrea_fTLvAhcfCr0+4eLg@mail.gmail.com>
	<CAEc-0iU2qwrGyeG=ZB3egZL1FFS9nGwsHfOdMCec_JdCdnJQLg@mail.gmail.com>
	<1424115820.12054356.1395416787354.JavaMail.root@vmware.com>
	<alpine.GSO.2.01.1403242037050.1735@freddy.simplesystems.org>
Message-ID: <A0C3C057-6ECD-4F2A-973C-CFAFD8A9FAF9@Logan.com>


>> Just curious what you would do in my situation, replace the drives or build a new vDev and why?

I attach both 4T drives to the 2T mirrored pair, 
when its done resilvering, I detach the org 2T drives and 
add them to the 1T mirrored pair, when that?s done, I use
to add the smallest disk pair as a new vdev, but now I toss
them in the trash to minimize the number of devices.

This takes way too much wall clock time, but only a few mins
every night for three nights. my 512e drives are living fine 
in the 512n ashift=9 pool.

rob at nas:~# zpool history z | head
History for 'z':
2012-01-04.21:26:14 zpool create zz c4t7d0 c4t2d0 c5t5d0 c4t0d0 c7t2d0
2012-01-04.21:27:10 zfs set compression=on zz
2012-01-04.21:27:39 zfs recv -vd zz
2012-01-04.21:31:37 zfs recv -vd zz
2012-01-04.21:38:35 zfs recv -vd zz
rob at nas:~# zpool iostat -v
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rpool       7.30G   132G      0      0    756  1.72K
  mirror    7.30G   132G      0      0    756  1.72K
    c7t0d0s0      -      -      0      0    663  1.73K
    c4t6d0s0      -      -      0      0    664  1.73K
----------  -----  -----  -----  -----  -----  -----
z           7.28T  2.69T     12      3  1.18M   115K
  mirror    1.94T  1.69T      3      0   302K  25.1K
    c7t1d0      -      -      0      3  17.6K   338K
    c7t3d0      -      -      0      3  17.6K   338K
  mirror    1.80T  10.8G      1      0   164K  16.3K
    c5t2d0      -      -      1      0   150K  16.3K
    c4t2d0      -      -      1      0   150K  16.3K
  mirror    1.06T   772G      2      0   220K  18.3K
    c4t7d0      -      -      0      1  11.1K   189K
    c5t7d0      -      -      0      1  11.1K   189K
  mirror     925G  3.14G      1      0  78.7K  6.02K
    c4t0d0      -      -      0      0  73.6K  6.04K
    c5t0d0      -      -      0      0  73.7K  6.04K
  mirror    1.58T   237G      4      1   444K  49.3K
    c4t5d0      -      -      2      0   316K  49.0K
    c5t5d0      -      -      0      2  24.5K   286K
----------  -----  -----  -----  -----  -----  -----

rob at nas:~# hd
                                                                         
Device    Serial        Vendor   Model             Rev  Temperature     
------    ------        ------   -----             ---- -----------     
c4t0d0p0  WMATV00864xx  ATA      WDC WD1001FALS-0  0K05 36 C (96 F)
c4t2d0p0  H7JR0C7006xx  ATA      SAMSUNG HD204UI   0001 28 C (82 F)
c4t5d0p0  WMC3005796xx  ATA      WDC WD20EFRX-68A  0A80 29 C (84 F)
c4t6d0p0  WCAMR24049xx  ATA      WDC WD3200JD-00K  5J08 0 C (32 F)
c4t7d0p0  H7J90B9245xx  ATA      SAMSUNG HD204UI   0001 27 C (80 F)
c5t0d0p0  WMATV00783xx  ATA      WDC WD1001FALS-0  0K05 36 C (96 F)
c5t2d0p0  H7JD5B1029xx  ATA      SAMSUNG HD204UI   0001 27 C (80 F)
c5t5d0p0  WMC3005605xx  ATA      WDC WD20EFRX-68A  0A80 29 C (84 F)
c5t7d0p0  H7JD5B1029xx  ATA      SAMSUNG HD204UI   0001 25 C (77 F)
c7t0d0p0  WMAP419728xx  ATA      WDC WD1500AHFD-0  7QR5 33 C (91 F)
c7t1d0p0  WCC4E03420xx  ATA      WDC WD40EFRX-68W  0A80 30 C (86 F)
c7t3d0p0  WCC4E04257xx  ATA      WDC WD40EFRX-68W  0A80 30 C (86 F)
c7t4d0p0  H7J9HBA008xx  ATA      SAMSUNG HD204UI   0001 25 C (77 F)

rob at nas:~# zdb | grep ashift
            ashift: 9
            ashift: 9
            ashift: 9
            ashift: 9
            ashift: 9
            ashift: 9


From henk at hlangeveld.nl  Tue Mar 25 07:39:02 2014
From: henk at hlangeveld.nl (Henk Langeveld)
Date: Tue, 25 Mar 2014 08:39:02 +0100
Subject: [OmniOS-discuss] anyone using SaltStack
In-Reply-To: <5330BC09.1080500@gnaa.net>
References: <5330BC09.1080500@gnaa.net>
Message-ID: <53313296.2050508@hlangeveld.nl>

On 03/25/2014 12:13 AM, Geoff Nordli wrote:
> Is anyone is using SaltStack (http://www.saltstack.com/) on OmniOS.
>
> If so, how you are getting it installed?

Hi Geoff,

I'm not using Salt, but according to the installation guide
(http://docs.saltstack.com/en/latest/topics/installation/index.html)
various versions of Solaris are supported.

In addition, the salt-bootstrap.sh script 
(https://github.com/saltstack/salt-bootstrap) appears to support SmartOS.

What do you need (minion or master), and what have you tried?

Cheers,
Henk


From geoffn at gnaa.net  Tue Mar 25 15:51:30 2014
From: geoffn at gnaa.net (Geoff Nordli)
Date: Tue, 25 Mar 2014 08:51:30 -0700
Subject: [OmniOS-discuss] anyone using SaltStack
In-Reply-To: <53313296.2050508@hlangeveld.nl>
References: <5330BC09.1080500@gnaa.net> <53313296.2050508@hlangeveld.nl>
Message-ID: <5331A602.8070302@gnaa.net>

On 14-03-25 12:39 AM, Henk Langeveld wrote:
> On 03/25/2014 12:13 AM, Geoff Nordli wrote:
>> Is anyone is using SaltStack (http://www.saltstack.com/) on OmniOS.
>>
>> If so, how you are getting it installed?
>
> Hi Geoff,
>
> I'm not using Salt, but according to the installation guide
> (http://docs.saltstack.com/en/latest/topics/installation/index.html)
> various versions of Solaris are supported.
>
> In addition, the salt-bootstrap.sh script 
> (https://github.com/saltstack/salt-bootstrap) appears to support SmartOS.
>
> What do you need (minion or master), and what have you tried?
>

Hi Henk.

I tried running the bootstrap which failed, because 0mq failed to 
compiled due to a missing header file.

I looked at the bootstrap source file and smartOS uses pkgsrc to install 
the dependencies.

The saltstack docs for solaris have you use opencsw.

When I look at the options most likely I am going to go down the pkgsrc 
path.

Just wondering what others have done.

thanks,

Geoff


From carlb at flamewarestudios.com  Tue Mar 25 16:27:01 2014
From: carlb at flamewarestudios.com (Carl Brunning)
Date: Tue, 25 Mar 2014 16:27:01 +0000
Subject: [OmniOS-discuss] Problem with using omnios-build
Message-ID: <29D2E00A4E2C9B4893E610929CA990A966A6F515@srv01.cblinux.co.uk>

HI
am playing with the omnios-build
but when trying to do pkg build i find on the git clone it having a problem
the line is this
logcmd $GIT clone -b $PKG_BRANCH src at src.omniti.com:~omnios/core/pkg<mailto:src at src.omniti.com:~omnios/core/pkg>

the branch is r151006

this is wanting a password but I don't know what it is
i did see on the latest version of the build you have changed to this

logcmd  $GIT clone -b omni anon at src.omniti.com:~omnios/core/illumos-omni-os<mailto:anon at src.omniti.com:~omnios/core/illumos-omni-os>

but this is just not finding anything

Cloning into 'illumos-omni-os'...
Non-existant repo core/illumos-omni-os
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.


so why it asking for a password for the first one lol
and what is it
and i hope you fix it all

thanks

Carl Brunning


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140325/27cfd749/attachment.html>

From danmcd at omniti.com  Tue Mar 25 17:15:13 2014
From: danmcd at omniti.com (Dan McDonald)
Date: Tue, 25 Mar 2014 13:15:13 -0400
Subject: [OmniOS-discuss] Problem with using omnios-build
In-Reply-To: <29D2E00A4E2C9B4893E610929CA990A966A6F515@srv01.cblinux.co.uk>
References: <29D2E00A4E2C9B4893E610929CA990A966A6F515@srv01.cblinux.co.uk>
Message-ID: <77965151-05DF-41EA-992A-6C48992AFBBE@omniti.com>


On Mar 25, 2014, at 12:27 PM, Carl Brunning <carlb at flamewarestudios.com> wrote:

> HI
> am playing with the omnios-build 
> but when trying to do pkg build i find on the git clone it having a problem
> the line is this 
> logcmd $GIT clone -b $PKG_BRANCH src at src.omniti.com:~omnios/core/pkg
> 
> the branch is r151006

I fixed some things in "master" to this.  It should've just swapped out "src@" for "anon@".

> this is wanting a password but I don't know what it is
> i did see on the latest version of the build you have changed to this 
> 
> logcmd  $GIT clone -b omni anon at src.omniti.com:~omnios/core/illumos-omni-os 

Hmmm.  This is a very old artifact.  The "master" version has this changeset:

osdev2(build/pkg)[0]% git show --stat 85f25c28
commit 85f25c28969c859fb9bc838a6779dd3a70286896
Author: Theo Schlossnagle <jesus at omniti.com>
Date:   Sat Mar 24 20:06:12 2012 +0000

    move pkg to git and make it work

 build/pkg/build.sh | 47 ++++++++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 19 deletions(-)
osdev2(build/pkg)[0]% 

that eliminates the need for the clone to happen.

Generally speaking, though, it's illumos-omnios:

	anon at src.omniti.com:~omnios/core/illumos-omnios

Perhaps fixing that will work?

> so why it asking for a password for the first one lol

That's a bug.  I didn't fix it in anywhere other than master, though.

> and what is it 
> and i hope you fix it all 

I'm curious if pkg can be re-rewhacked to use headers from a completed omnios build?  I'm introducing new features into omnios-build to make it completely fire-and-forget.  One new feature not yet back is the PREBUILT_OMNIOS feature, that can point packages that depend on a populated illumos-omnios proto area.  Perhaps I should include PREBUILT_OMNIOS support in pkg as well?!

Pardon any slowness on my part.  I'm new at pkg, and at the build outside of illumos-omnios itself.

Thanks,
Dan


From carlb at flamewarestudios.com  Wed Mar 26 11:13:16 2014
From: carlb at flamewarestudios.com (Carl Brunning)
Date: Wed, 26 Mar 2014 11:13:16 +0000
Subject: [OmniOS-discuss] Problem with using omnios-build
In-Reply-To: <77965151-05DF-41EA-992A-6C48992AFBBE@omniti.com>
References: <29D2E00A4E2C9B4893E610929CA990A966A6F515@srv01.cblinux.co.uk>
	<77965151-05DF-41EA-992A-6C48992AFBBE@omniti.com>
Message-ID: <29D2E00A4E2C9B4893E610929CA990A966AC38EC@srv01.cblinux.co.uk>

Thanks for that yes that help me a little more
Now I just got to work out why it has a problem uploading to the repo
This is the error I get 

PATH=/tmp/build_admin/pkg-1.0/pkg/src/pkg/../../proto/root_i386/usr/bin:/usr/sbin:/usr/bin  PYTHONPATH=/tmp/build_admin/pkg-1.0/pkg/src/pkg/../../proto/root_i386/usr/lib/python2.6/vendor-packages pkgsend -s http://repo.flamewarestudios.com:10000/ publish -d /tmp/build_admin/pkg-1.0/pkg/src/pkg/../../proto/root_i386 \
        -d license_files -T \*.py --fmri-in-manifest  \
        pkgtmp/SUNWipkg-brand.dep.res
pkgsend: Transfer from 'http://repo.removeed failed: api_errors.InvalidP5IFile:. (happened 4 times)

the repo is a openindiana system 
could this be my problem now
I have got most of the other package compile and up load with no problems 
Just pkg one is the problem

Thanks

Carl Brunning
-----Original Message-----
From: Dan McDonald [mailto:danmcd at omniti.com] 
Sent: 25 March 2014 17:15
To: Carl Brunning
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] Problem with using omnios-build


On Mar 25, 2014, at 12:27 PM, Carl Brunning <carlb at flamewarestudios.com> wrote:

> HI
> am playing with the omnios-build
> but when trying to do pkg build i find on the git clone it having a 
> problem the line is this logcmd $GIT clone -b $PKG_BRANCH 
> src at src.omniti.com:~omnios/core/pkg
> 
> the branch is r151006

I fixed some things in "master" to this.  It should've just swapped out "src@" for "anon@".

> this is wanting a password but I don't know what it is i did see on 
> the latest version of the build you have changed to this
> 
> logcmd  $GIT clone -b omni 
> anon at src.omniti.com:~omnios/core/illumos-omni-os

Hmmm.  This is a very old artifact.  The "master" version has this changeset:

osdev2(build/pkg)[0]% git show --stat 85f25c28 commit 85f25c28969c859fb9bc838a6779dd3a70286896
Author: Theo Schlossnagle <jesus at omniti.com>
Date:   Sat Mar 24 20:06:12 2012 +0000

    move pkg to git and make it work

 build/pkg/build.sh | 47 ++++++++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 19 deletions(-) osdev2(build/pkg)[0]% 

that eliminates the need for the clone to happen.

Generally speaking, though, it's illumos-omnios:

	anon at src.omniti.com:~omnios/core/illumos-omnios

Perhaps fixing that will work?

> so why it asking for a password for the first one lol

That's a bug.  I didn't fix it in anywhere other than master, though.

> and what is it
> and i hope you fix it all

I'm curious if pkg can be re-rewhacked to use headers from a completed omnios build?  I'm introducing new features into omnios-build to make it completely fire-and-forget.  One new feature not yet back is the PREBUILT_OMNIOS feature, that can point packages that depend on a populated illumos-omnios proto area.  Perhaps I should include PREBUILT_OMNIOS support in pkg as well?!

Pardon any slowness on my part.  I'm new at pkg, and at the build outside of illumos-omnios itself.

Thanks,
Dan


From nitram at konsortit.se  Wed Mar 26 13:05:07 2014
From: nitram at konsortit.se (Nitram Grebredna)
Date: Wed, 26 Mar 2014 14:05:07 +0100
Subject: [OmniOS-discuss] Multipathing,
 only one path visible - there ought to be two, what am i doing wrong?
Message-ID: <CAEPZ3ejqKfxXH7w+a-dcnmURBEQdj5r759Ga0bkXUVvmt0BG5w@mail.gmail.com>

Hi!

I'm having issues with multipathing, and i cant seem to figure out what is
wrong.

The setup is a Supermicro 24 disk-box with 3 controllers (1 pcs internal
SAS2308, two 9207i-cards, same firmware on all units), identified as
follows:

Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------

0  SAS2308_2(D1)   18.00.00.00    11.00.00.05    07.33.00.00     00:01:00:00
1  SAS2308_2(D1)   18.00.00.00    11.00.00.05    07.33.00.00     00:02:00:00
2  SAS2308_2(D1)   18.00.00.00    11.00.00.05    07.33.00.00     00:03:00:00

The machine has 12 ST4000NM0023 (seagate 4TB DP) disks in it and a couple
or bootdisks. The controllers are connected via 2 cables per controller to
the backplane/expander. I've Installed latest stable omnios on it.

Excerpt from dmesg:

[...]

genunix: [ID 936769 kern.info] mpt_sas2 is /pci at 0,0/pci8086,e06 at 2
,2/pci1000,3020 at 0
scsi: [ID 583861 kern.info] mpt_sas7 at mpt_sas2: scsi-iport 4
genunix: [ID 936769 kern.info] mpt_sas7 is /pci at 0,0/pci8086,e06 at 2
,2/pci1000,3020 at 0/iport at 4
genunix: [ID 408114 kern.info] /pci at 0,0/pci8086,e06 at 2,2/pci1000,3020 at 0
/iport at 4 (mpt_sas7) online
scsi: [ID 583861 kern.info] sd11 at scsi_vhci0: unit-address
g5000c50057c1fce3: conf f_sym
genunix: [ID 936769 kern.info] sd11 is /scsi_vhci/disk at g5000c50057c1fce3
genunix: [ID 408114 kern.info] /scsi_vhci/disk at g5000c50057c1fce3 (sd11)
online
genunix: [ID 483743 kern.info] /scsi_vhci/disk at g5000c50057c1fce3 (sd11)
multipath status: degraded: path 4 mpt_sas15/disk at w5000c50057c1fce1,0 is
online
scsi: [ID 583861 kern.info] mpt_sas11 at mpt_sas1: scsi-iport 2
genunix: [ID 936769 kern.info] mpt_sas11 is /pci at 0,0/pci8086,e04 at 2
/pci1000,3020 at 0/iport at 2
genunix: [ID 408114 kern.info]
/pci at 0,0/pci8086,e04 at 2/pci1000,3020 at 0/iport at 2(mpt_sas11) online
scsi: [ID 583861 kern.info] sd2 at scsi_vhci0: unit-address
g5000c50057ca74bb: conf f_sym
genunix: [ID 936769 kern.info] sd2 is /scsi_vhci/disk at g5000c50057ca74bb
genunix: [ID 408114 kern.info] /scsi_vhci/disk at g5000c50057ca74bb (sd2)
online

[...]

Excerpt from mpathadm:

# mpathadm list lu
        /dev/rdsk/c1t5000C5006C0BF63Fd0s2
                Total Path Count: 1
                Operational Path Count: 1
        /dev/rdsk/c1t5000C5006C0B29C7d0s2
                Total Path Count: 1
                Operational Path Count: 1
        /dev/rdsk/c1t5000C50057C1F5BFd0s2
                Total Path Count: 1

[...]


Excerpt from format:

# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c1t5000C5006C0B29C7d0 <SEAGATE-ST3300657SS-0008 cyl 36469 alt 2
hd 255 sec 63>
          /scsi_vhci/disk at g5000c5006c0b29c7
       1. c1t5000C5006C0BF63Fd0 <SEAGATE-ST3300657SS-0008 cyl 36469 alt 2
hd 255 sec 63>
          /scsi_vhci/disk at g5000c5006c0bf63f
       2. c1t5000C50057C1F5BFd0 <SEAGATE-ST4000NM0023-0003-3.64TB>
          /scsi_vhci/disk at g5000c50057c1f5bf
       3. c1t5000C50057C1FCE3d0 <SEAGATE-ST4000NM0023-0003-3.64TB>
          /scsi_vhci/disk at g5000c50057c1fce3

[...]

If i set mpxio-disable="yes"; in mpt_sas.conf the error above obviously
dissapears and also i can see the 'real' device/controller id's when
issuing the format command.

If things were working correctly i assume i would see a total path count of
2 per disk, and the multipath status wouldn't be set as degraded in the
log? What am i doing wrong? I've asked google and since they dont know the
answer to the question i'd thought i'd try a post here ;)

Thanks in advance for any help,

Martin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140326/7e0e5f87/attachment.html>

From svavar at januar.is  Wed Mar 26 15:47:51 2014
From: svavar at januar.is (=?ISO-8859-1?Q?Svavar_=D6rn_Eysteinsson?=)
Date: Wed, 26 Mar 2014 15:47:51 +0000
Subject: [OmniOS-discuss] Networking Performance Tips on HP Microserver N40L
	?
Message-ID: <CAEc-0iU6B=JDE5zA5omkEbWMqhdZau=HpnA1rzAzyLkSGhXiDg@mail.gmail.com>

Hello people.
I recently installed my first true NAS box at home, which is a HP
Microserver N40L
with 16GB in RAM, 1x250GB for OS and 4x 2TB Enterprise SATA disks provided
by HP in a RAIDZ.

I'm using the newest/updated OmniOS v11 r151008 and also Napp-it and other
services.
What I would like to know is, have there been any issues/problems and do
people
have some performance tuning tips regarding networking issues on the BC5723
controller provided
by the HP Microserver ? It's the bge module/driver ?

Sometimes I find the speeds to the BOX will rock up & down. I haven't
configured
a gigabit network, thats on the plan this weekend. I have full-duplex and
flowctrl enabled.
For an example, I noticed after building my small ipf firewall rules and
enabled the firewall
the speed did go down, specially with CIFS and NFS(didn't test the AFP).

So, any performance tips out there ?

Thanks in advance.

Best regards,

Svavar O - Reykjavik - Iceland
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140326/f078c12e/attachment.html>

From danmcd at omniti.com  Wed Mar 26 16:01:34 2014
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 26 Mar 2014 12:01:34 -0400
Subject: [OmniOS-discuss] Networking Performance Tips on HP Microserver
	N40L ?
In-Reply-To: <CAEc-0iU6B=JDE5zA5omkEbWMqhdZau=HpnA1rzAzyLkSGhXiDg@mail.gmail.com>
References: <CAEc-0iU6B=JDE5zA5omkEbWMqhdZau=HpnA1rzAzyLkSGhXiDg@mail.gmail.com>
Message-ID: <EE11D7CE-01D8-46B5-AD42-5CC3303668F9@omniti.com>


On Mar 26, 2014, at 11:47 AM, Svavar ?rn Eysteinsson <svavar at januar.is> wrote:

> Hello people.
> I recently installed my first true NAS box at home, which is a HP Microserver N40L
> with 16GB in RAM, 1x250GB for OS and 4x 2TB Enterprise SATA disks provided by HP in a RAIDZ.
> 
> I'm using the newest/updated OmniOS v11 r151008 and also Napp-it and other services.
> What I would like to know is, have there been any issues/problems and do people
> have some performance tuning tips regarding networking issues on the BC5723 controller provided
> by the HP Microserver ? It's the bge module/driver ?
> 
> Sometimes I find the speeds to the BOX will rock up & down. I haven't configured
> a gigabit network, thats on the plan this weekend. I have full-duplex and flowctrl enabled.
> For an example, I noticed after building my small ipf firewall rules and enabled the firewall
> the speed did go down, specially with CIFS and NFS(didn't test the AFP).

Was performance okay pre-ipf?  If so, it's probably ipf that's tripping you up.

> So, any performance tips out there ?

I have to ask, are you using ipf to protect the box?  Or for NAT?  If just to protect the box, you may be able to use something NOT ipf to help you out, depending on the problem(s) you're trying to solve.

Dan


From svavar at januar.is  Wed Mar 26 17:12:45 2014
From: svavar at januar.is (=?ISO-8859-1?Q?Svavar_=D6rn_Eysteinsson?=)
Date: Wed, 26 Mar 2014 17:12:45 +0000
Subject: [OmniOS-discuss] Networking Performance Tips on HP Microserver
 N40L ?
In-Reply-To: <EE11D7CE-01D8-46B5-AD42-5CC3303668F9@omniti.com>
References: <CAEc-0iU6B=JDE5zA5omkEbWMqhdZau=HpnA1rzAzyLkSGhXiDg@mail.gmail.com>
	<EE11D7CE-01D8-46B5-AD42-5CC3303668F9@omniti.com>
Message-ID: <CAEc-0iU=+9LoFyESj9YmcLZhp-aGxJcyN5_XY76mJ-OE8cnSVQ@mail.gmail.com>

No, the performance was a little shaky before, and after the ipf activation.
So I just disabled the firewall part.

The reason I activated the firewall is not for NAT, just to protect the box.
As I have configured my router to portmap some ports into the HP server,
and I use ipf to deny/accept by source.
As my stupid router firewall configuration never works.

The rules I used where :

# my HP server is 192.168.1.1
# anti spoofing rule
block in   log  quick on bge0  from 192.168.1.1  to any
#
# Allow everything on loopbak
# Rule  1 (lo0)
pass  in  quick on lo0 proto icmp  from any  to any keep state
pass  in  quick on lo0 proto tcp  from any  to any keep state
pass  in  quick on lo0 proto udp  from any  to any keep state
pass  in  quick on lo0  from any  to any
pass  out quick on lo0 proto icmp  from any  to any keep state
pass  out quick on lo0 proto tcp  from any  to any keep state
pass  out quick on lo0 proto udp  from any  to any keep state
pass  out quick on lo0  from any  to any
#
# Rule  2 (global)
# SSH Access to the host; useful ICMP
# types; ping request
pass  in  quick proto icmp  from any  to 192.168.1.1 icmp-type 3  keep state
pass  in  quick proto icmp  from any  to 192.168.1.1 icmp-type 0 code 0
 keep state
pass  in  quick proto icmp  from any  to 192.168.1.1 icmp-type 8 code 0
 keep state
pass  in  quick proto icmp  from any  to 192.168.1.1 icmp-type 11 code 0
 keep state
pass  in  quick proto icmp  from any  to 192.168.1.1 icmp-type 11 code 1
 keep state
#
# Rule  4 (global)
# Allow everything from these management hosts.
# blackbox:Policy:4: warning: Changing rule direction due to self reference
pass  in  quick proto icmp  from MANAGENETWORK_1  to 192.168.1.1 keep state
pass  in  quick proto icmp  from MANAGENETWORK_2  to 192.168.1.1 keep state
pass  in  quick proto icmp  from MANAGEHOST_1  to 192.168.1.1 keep state
pass  in  quick proto tcp  from MANAGENETWORK_1   to 192.168.1.1 keep state
pass  in  quick proto tcp  from MANAGENETWORK_2  to 192.168.1.1 keep state
pass  in  quick proto tcp  from MANAGEHOST_1  to 192.168.1.1 keep state
pass  in  quick proto udp  from MANAGENETWORK_1   to 192.168.1.1 keep state
pass  in  quick proto udp  from MANAGENETWORK_2  to 192.168.1.1 keep state
pass  in  quick proto udp  from MANAGEHOST_1  to 192.168.1.1 keep state
pass  in  quick  from MANAGENETWORK_1  to 192.168.1.1
pass  in  quick  from MANAGENETWORK_2  to 192.168.1.1
pass  in  quick  from MANAGEHOST_1  to 192.168.1.1
#
# Rule  5 (global)
# Allow everything from the HP Server itself
# blackbox:Policy:5: warning: Changing rule direction due to self reference
pass  out quick proto icmp  from 192.168.1.1  to any keep state
pass  out quick proto tcp  from 192.168.1.1  to any keep state
pass  out quick proto udp  from 192.168.1.1  to any keep state
pass  out quick  from 192.168.1.1  to any
#
# Rule  6 (global)
block in   log  quick  from any  to any
block out  log  quick  from any  to any
#
# Rule  fallback rule
#    fallback rule
block in  quick  from any  to any
block out quick  from any  to any


*SVAVAR ?RN EYSTEINSSON*Kerfisstj?ri
Gsm / mobile +354 862 1624
S?mi / tel +354 531 0101


*Jan?ar marka?sh?s*www.januar.is / Facebook<http://facebook.com/viderumjanuar>


On 26 March 2014 16:01, Dan McDonald <danmcd at omniti.com> wrote:

>
> On Mar 26, 2014, at 11:47 AM, Svavar ?rn Eysteinsson <svavar at januar.is>
> wrote:
>
> > Hello people.
> > I recently installed my first true NAS box at home, which is a HP
> Microserver N40L
> > with 16GB in RAM, 1x250GB for OS and 4x 2TB Enterprise SATA disks
> provided by HP in a RAIDZ.
> >
> > I'm using the newest/updated OmniOS v11 r151008 and also Napp-it and
> other services.
> > What I would like to know is, have there been any issues/problems and do
> people
> > have some performance tuning tips regarding networking issues on the
> BC5723 controller provided
> > by the HP Microserver ? It's the bge module/driver ?
> >
> > Sometimes I find the speeds to the BOX will rock up & down. I haven't
> configured
> > a gigabit network, thats on the plan this weekend. I have full-duplex
> and flowctrl enabled.
> > For an example, I noticed after building my small ipf firewall rules and
> enabled the firewall
> > the speed did go down, specially with CIFS and NFS(didn't test the AFP).
>
> Was performance okay pre-ipf?  If so, it's probably ipf that's tripping
> you up.
>
> > So, any performance tips out there ?
>
> I have to ask, are you using ipf to protect the box?  Or for NAT?  If just
> to protect the box, you may be able to use something NOT ipf to help you
> out, depending on the problem(s) you're trying to solve.
>
> Dan
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140326/99fd8f41/attachment-0001.html>

From danmcd at omniti.com  Wed Mar 26 17:19:42 2014
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 26 Mar 2014 13:19:42 -0400
Subject: [OmniOS-discuss] Networking Performance Tips on HP Microserver
	N40L ?
In-Reply-To: <CAEc-0iU=+9LoFyESj9YmcLZhp-aGxJcyN5_XY76mJ-OE8cnSVQ@mail.gmail.com>
References: <CAEc-0iU6B=JDE5zA5omkEbWMqhdZau=HpnA1rzAzyLkSGhXiDg@mail.gmail.com>
	<EE11D7CE-01D8-46B5-AD42-5CC3303668F9@omniti.com>
	<CAEc-0iU=+9LoFyESj9YmcLZhp-aGxJcyN5_XY76mJ-OE8cnSVQ@mail.gmail.com>
Message-ID: <CF705284-9C44-4739-AC30-8F90A9145C02@omniti.com>


On Mar 26, 2014, at 1:12 PM, Svavar ?rn Eysteinsson <svavar at januar.is> wrote:

> No, the performance was a little shaky before, and after the ipf activation.
> So I just disabled the firewall part.
> 
> The reason I activated the firewall is not for NAT, just to protect the box.
> As I have configured my router to portmap some ports into the HP server, and I use ipf to deny/accept by source.
> As my stupid router firewall configuration never works.

Okay.  Just checking.  I tend to use ipsecconf(1M) and drop actions for this sort of thing, but it's stateless, and it appears some of your FW rules are stateful.

Yes, bge is the driver for what you have.  I do know that bge needs some updating, but nobody's been contributing in the Illumos community on that front.

Sorry I can't be of more immediate assistance,
Dan


From cf at ferebee.net  Wed Mar 26 17:33:17 2014
From: cf at ferebee.net (Chris Ferebee)
Date: Wed, 26 Mar 2014 18:33:17 +0100
Subject: [OmniOS-discuss] Multipathing,
	only one path visible - there ought to be two,
	what am i doing wrong?
In-Reply-To: <CAEPZ3ejqKfxXH7w+a-dcnmURBEQdj5r759Ga0bkXUVvmt0BG5w@mail.gmail.com>
References: <CAEPZ3ejqKfxXH7w+a-dcnmURBEQdj5r759Ga0bkXUVvmt0BG5w@mail.gmail.com>
Message-ID: <9A1D4D35-8F4D-4175-BFF7-A887846FCEBA@ferebee.net>

Martin,

Are you sure you have SAS expanders in your backplane? Supermicro will sell you the same chassis with or without expanders, with almost identical model numbers.

You?ve described a typical JBOD configuration (i. e., no expanders): Each LSI 2308 has 8 SAS/SATA ports, 4 ports on each of 2 Mini-SAS SFF8087 connectors. Thus, with 3 controllers you are running 3 x 8 = 24 SAS ports to the backplane.

Do you have the exact model number of the chassis or backplane?

Best,
Chris

> Am 26.03.2014 um 14:05 schrieb Nitram Grebredna <nitram at konsortit.se>:
> 
> Hi!
> 
> I'm having issues with multipathing, and i cant seem to figure out what is wrong. 
> 
> The setup is a Supermicro 24 disk-box with 3 controllers (1 pcs internal SAS2308, two 9207i-cards, same firmware on all units), identified as follows:
> 
> Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
> ----------------------------------------------------------------------------
> 
> 0  SAS2308_2(D1)   18.00.00.00    11.00.00.05    07.33.00.00     00:01:00:00
> 1  SAS2308_2(D1)   18.00.00.00    11.00.00.05    07.33.00.00     00:02:00:00
> 2  SAS2308_2(D1)   18.00.00.00    11.00.00.05    07.33.00.00     00:03:00:00
> 
> The machine has 12 ST4000NM0023 (seagate 4TB DP) disks in it and a couple or bootdisks. The controllers are connected via 2 cables per controller to the backplane/expander. I've Installed latest stable omnios on it.
> 
> Excerpt from dmesg:
> 
> [...]
> 
> genunix: [ID 936769 kern.info] mpt_sas2 is /pci at 0,0/pci8086,e06 at 2,2/pci1000,3020 at 0
> scsi: [ID 583861 kern.info] mpt_sas7 at mpt_sas2: scsi-iport 4
> genunix: [ID 936769 kern.info] mpt_sas7 is /pci at 0,0/pci8086,e06 at 2,2/pci1000,3020 at 0/iport at 4
> genunix: [ID 408114 kern.info] /pci at 0,0/pci8086,e06 at 2,2/pci1000,3020 at 0/iport at 4 (mpt_sas7) online
> scsi: [ID 583861 kern.info] sd11 at scsi_vhci0: unit-address g5000c50057c1fce3: conf f_sym
> genunix: [ID 936769 kern.info] sd11 is /scsi_vhci/disk at g5000c50057c1fce3
> genunix: [ID 408114 kern.info] /scsi_vhci/disk at g5000c50057c1fce3 (sd11) online
> genunix: [ID 483743 kern.info] /scsi_vhci/disk at g5000c50057c1fce3 (sd11) multipath status: degraded: path 4 mpt_sas15/disk at w5000c50057c1fce1,0 is online
> scsi: [ID 583861 kern.info] mpt_sas11 at mpt_sas1: scsi-iport 2
> genunix: [ID 936769 kern.info] mpt_sas11 is /pci at 0,0/pci8086,e04 at 2/pci1000,3020 at 0/iport at 2
> genunix: [ID 408114 kern.info] /pci at 0,0/pci8086,e04 at 2/pci1000,3020 at 0/iport at 2 (mpt_sas11) online
> scsi: [ID 583861 kern.info] sd2 at scsi_vhci0: unit-address g5000c50057ca74bb: conf f_sym
> genunix: [ID 936769 kern.info] sd2 is /scsi_vhci/disk at g5000c50057ca74bb
> genunix: [ID 408114 kern.info] /scsi_vhci/disk at g5000c50057ca74bb (sd2) online
> 
> [...]
> 
> Excerpt from mpathadm:
> 
> # mpathadm list lu
>         /dev/rdsk/c1t5000C5006C0BF63Fd0s2
>                 Total Path Count: 1
>                 Operational Path Count: 1
>         /dev/rdsk/c1t5000C5006C0B29C7d0s2
>                 Total Path Count: 1
>                 Operational Path Count: 1
>         /dev/rdsk/c1t5000C50057C1F5BFd0s2
>                 Total Path Count: 1
> 
> [...]
> 
> 
> Excerpt from format:
> 
> # format
> Searching for disks...done
> 
> 
> AVAILABLE DISK SELECTIONS:
>        0. c1t5000C5006C0B29C7d0 <SEAGATE-ST3300657SS-0008 cyl 36469 alt 2 hd 255 sec 63>
>           /scsi_vhci/disk at g5000c5006c0b29c7
>        1. c1t5000C5006C0BF63Fd0 <SEAGATE-ST3300657SS-0008 cyl 36469 alt 2 hd 255 sec 63>
>           /scsi_vhci/disk at g5000c5006c0bf63f
>        2. c1t5000C50057C1F5BFd0 <SEAGATE-ST4000NM0023-0003-3.64TB>
>           /scsi_vhci/disk at g5000c50057c1f5bf
>        3. c1t5000C50057C1FCE3d0 <SEAGATE-ST4000NM0023-0003-3.64TB>
>           /scsi_vhci/disk at g5000c50057c1fce3
> 
> [...]
> 
> If i set mpxio-disable="yes"; in mpt_sas.conf the error above obviously dissapears and also i can see the 'real' device/controller id's when issuing the format command. 
> 
> If things were working correctly i assume i would see a total path count of 2 per disk, and the multipath status wouldn't be set as degraded in the log? What am i doing wrong? I've asked google and since they dont know the answer to the question i'd thought i'd try a post here ;)
> 
> Thanks in advance for any help,
> 
> Martin
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140326/86360971/attachment.html>

From russhan at new-swankton.net  Wed Mar 26 19:02:23 2014
From: russhan at new-swankton.net (Russell Hansen)
Date: Wed, 26 Mar 2014 19:02:23 +0000
Subject: [OmniOS-discuss] Multipathing,
 only one path visible - there ought to be two, what am i doing wrong?
In-Reply-To: <9A1D4D35-8F4D-4175-BFF7-A887846FCEBA@ferebee.net>
References: <CAEPZ3ejqKfxXH7w+a-dcnmURBEQdj5r759Ga0bkXUVvmt0BG5w@mail.gmail.com>,
	<9A1D4D35-8F4D-4175-BFF7-A887846FCEBA@ferebee.net>
Message-ID: <0AE3E26797567E4AAB5C53C304D024455DA1A23D@ns-ex2010.new-swankton.lan>

Because those disks don't have Sun/Oracle firmware I believe you need to update /kernel/drv/scsi_vhci.conf

scsi-vhci-failover-override =
"SEAGATE ST3300657SS", "f_sym",
"SEAGATE ST4000NM0023", "f_sym";

You can double-check the VID/PID string by running format -> disk# -> inquiry

-Russ


From: OmniOS-discuss [omnios-discuss-bounces at lists.omniti.com] on behalf of Chris Ferebee [cf at ferebee.net]

Sent: Wednesday, March 26, 2014 10:33 AM

To: omnios-discuss at lists.omniti.com

Subject: Re: [OmniOS-discuss] Multipathing, only one path visible - there ought to be two, what am i doing wrong?


Martin,


Are you sure you have SAS expanders in your backplane? Supermicro will sell you the same chassis with or without expanders, with almost identical model numbers.


You?ve described a typical JBOD configuration (i. e., no expanders): Each LSI 2308 has 8 SAS/SATA ports, 4 ports on each of 2 Mini-SAS SFF8087 connectors. Thus, with 3 controllers you are running 3 x 8 = 24 SAS ports to the backplane.


Do you have the exact model number of the chassis or backplane?


Best,
Chris


Am 26.03.2014 um 14:05 schrieb Nitram Grebredna <nitram at konsortit.se>:


Hi!


I'm having issues with multipathing, and i cant seem to figure out what is wrong.


The setup is a Supermicro 24 disk-box with 3 controllers (1 pcs internal SAS2308, two 9207i-cards, same firmware on all units), identified as follows:


Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr

----------------------------------------------------------------------------


0  SAS2308_2(D1)   18.00.00.00    11.00.00.05    07.33.00.00     00:01:00:00

1  SAS2308_2(D1)   18.00.00.00    11.00.00.05    07.33.00.00     00:02:00:00

2  SAS2308_2(D1)   18.00.00.00    11.00.00.05    07.33.00.00     00:03:00:00


The machine has 12 ST4000NM0023 (seagate 4TB DP) disks in it and a couple or bootdisks. The controllers are connected via 2 cables per controller to the backplane/expander. I've Installed latest stable omnios on it.


Excerpt from dmesg:


[...]


genunix: [ID 936769 
kern.info] mpt_sas2 is /pci at 0,0/pci8086,e06 at 2,2/pci1000,3020 at 0

scsi: [ID 583861 
kern.info] mpt_sas7 at mpt_sas2: scsi-iport 4

genunix: [ID 936769 
kern.info] mpt_sas7 is /pci at 0,0/pci8086,e06 at 2,2/pci1000,3020 at 0/iport at 4

genunix: [ID 408114 
kern.info] /pci at 0,0/pci8086,e06 at 2,2/pci1000,3020 at 0/iport at 4 (mpt_sas7) online

scsi: [ID 583861 
kern.info] sd11 at scsi_vhci0: unit-address g5000c50057c1fce3: conf f_sym

genunix: [ID 936769 
kern.info] sd11 is /scsi_vhci/disk at g5000c50057c1fce3

genunix: [ID 408114 
kern.info] /scsi_vhci/disk at g5000c50057c1fce3 (sd11) online

genunix: [ID 483743 
kern.info] /scsi_vhci/disk at g5000c50057c1fce3 (sd11) multipath status: degraded: path 4 mpt_sas15/disk at w5000c50057c1fce1,0 is online

scsi: [ID 583861 
kern.info] mpt_sas11 at mpt_sas1: scsi-iport 2

genunix: [ID 936769 
kern.info] mpt_sas11 is /pci at 0,0/pci8086,e04 at 2/pci1000,3020 at 0/iport at 2

genunix: [ID 408114 
kern.info] /pci at 0,0/pci8086,e04 at 2/pci1000,3020 at 0/iport at 2 (mpt_sas11) online

scsi: [ID 583861 
kern.info] sd2 at scsi_vhci0: unit-address g5000c50057ca74bb: conf f_sym

genunix: [ID 936769 
kern.info] sd2 is /scsi_vhci/disk at g5000c50057ca74bb

genunix: [ID 408114 
kern.info] /scsi_vhci/disk at g5000c50057ca74bb (sd2) online


[...]


Excerpt from mpathadm:


# mpathadm list lu

        /dev/rdsk/c1t5000C5006C0BF63Fd0s2

                Total Path Count: 1

                Operational Path Count: 1

        /dev/rdsk/c1t5000C5006C0B29C7d0s2

                Total Path Count: 1

                Operational Path Count: 1

        /dev/rdsk/c1t5000C50057C1F5BFd0s2

                Total Path Count: 1


[...]


Excerpt from format:


# format

Searching for disks...done


AVAILABLE DISK SELECTIONS:

       0. c1t5000C5006C0B29C7d0 <SEAGATE-ST3300657SS-0008 cyl 36469 alt 2 hd 255 sec 63>

          /scsi_vhci/disk at g5000c5006c0b29c7

       1. c1t5000C5006C0BF63Fd0 <SEAGATE-ST3300657SS-0008 cyl 36469 alt 2 hd 255 sec 63>

          /scsi_vhci/disk at g5000c5006c0bf63f

       2. c1t5000C50057C1F5BFd0 <SEAGATE-ST4000NM0023-0003-3.64TB>

          /scsi_vhci/disk at g5000c50057c1f5bf

       3. c1t5000C50057C1FCE3d0 <SEAGATE-ST4000NM0023-0003-3.64TB>

          /scsi_vhci/disk at g5000c50057c1fce3


[...]


If i set mpxio-disable="yes"; in mpt_sas.conf the error above obviously dissapears and also i can see the 'real' device/controller id's when issuing the format command.


If things were working correctly i assume i would see a total path count of 2 per disk, and the multipath status wouldn't be set as degraded in the log? What am i doing wrong? I've asked google and since they dont know the answer to the question i'd thought
 i'd try a post here ;)


Thanks in advance for any help,


Martin


_______________________________________________

OmniOS-discuss mailing list

OmniOS-discuss at lists.omniti.com

http://lists.omniti.com/mailman/listinfo/omnios-discuss


From carlb at flamewarestudios.com  Wed Mar 26 22:45:49 2014
From: carlb at flamewarestudios.com (Carl Brunning)
Date: Wed, 26 Mar 2014 22:45:49 +0000
Subject: [OmniOS-discuss] Problem with using omnios-build
In-Reply-To: <29D2E00A4E2C9B4893E610929CA990A966AC38EC@srv01.cblinux.co.uk>
References: <29D2E00A4E2C9B4893E610929CA990A966A6F515@srv01.cblinux.co.uk>
	<77965151-05DF-41EA-992A-6C48992AFBBE@omniti.com>
	<29D2E00A4E2C9B4893E610929CA990A966AC38EC@srv01.cblinux.co.uk>
Message-ID: <29D2E00A4E2C9B4893E610929CA990A966AC817F@srv01.cblinux.co.uk>

Anyone 
Sorry if am pain just want to learn from this 
And why it not work 

Thanks
carl brunning

-----Original Message-----
From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Carl Brunning

Sent: 26 March 2014 11:13
To: Dan McDonald
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] Problem with using omnios-build

Thanks for that yes that help me a little more Now I just got to work out why it has a problem uploading to the repo This is the error I get 

PATH=/tmp/build_admin/pkg-1.0/pkg/src/pkg/../../proto/root_i386/usr/bin:/usr/sbin:/usr/bin  PYTHONPATH=/tmp/build_admin/pkg-1.0/pkg/src/pkg/../../proto/root_i386/usr/lib/python2.6/vendor-packages pkgsend -s http://repo.flamewarestudios.com:10000/ publish -d /tmp/build_admin/pkg-1.0/pkg/src/pkg/../../proto/root_i386 \

        -d license_files -T \*.py --fmri-in-manifest  \
        pkgtmp/SUNWipkg-brand.dep.res
pkgsend: Transfer from 'http://repo.removeed failed: api_errors.InvalidP5IFile:. (happened 4 times)

the repo is a openindiana system
could this be my problem now
I have got most of the other package compile and up load with no problems Just pkg one is the problem

Thanks

Carl Brunning
-----Original Message-----
From: Dan McDonald [mailto:danmcd at omniti.com]
Sent: 25 March 2014 17:15
To: Carl Brunning
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] Problem with using omnios-build


On Mar 25, 2014, at 12:27 PM, Carl Brunning <carlb at flamewarestudios.com> wrote:

> HI
> am playing with the omnios-build
> but when trying to do pkg build i find on the git clone it having a 
> problem the line is this logcmd $GIT clone -b $PKG_BRANCH 
> src at src.omniti.com:~omnios/core/pkg
> 
> the branch is r151006

I fixed some things in "master" to this.  It should've just swapped out "src@" for "anon@".

> this is wanting a password but I don't know what it is i did see on 
> the latest version of the build you have changed to this
> 
> logcmd  $GIT clone -b omni 
> anon at src.omniti.com:~omnios/core/illumos-omni-os

Hmmm.  This is a very old artifact.  The "master" version has this changeset:

osdev2(build/pkg)[0]% git show --stat 85f25c28 commit 85f25c28969c859fb9bc838a6779dd3a70286896
Author: Theo Schlossnagle <jesus at omniti.com>
Date:   Sat Mar 24 20:06:12 2012 +0000

    move pkg to git and make it work

 build/pkg/build.sh | 47 ++++++++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 19 deletions(-) osdev2(build/pkg)[0]% 

that eliminates the need for the clone to happen.

Generally speaking, though, it's illumos-omnios:

	anon at src.omniti.com:~omnios/core/illumos-omnios

Perhaps fixing that will work?

> so why it asking for a password for the first one lol

That's a bug.  I didn't fix it in anywhere other than master, though.

> and what is it
> and i hope you fix it all

I'm curious if pkg can be re-rewhacked to use headers from a completed omnios build?  I'm introducing new features into omnios-build to make it completely fire-and-forget.  One new feature not yet back is the PREBUILT_OMNIOS feature, that can point packages that depend on a populated illumos-omnios proto area.  Perhaps I should include PREBUILT_OMNIOS support in pkg as well?!

Pardon any slowness on my part.  I'm new at pkg, and at the build outside of illumos-omnios itself.

Thanks,
Dan


_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss at lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


From carlb at flamewarestudios.com  Thu Mar 27 16:56:35 2014
From: carlb at flamewarestudios.com (Carl Brunning)
Date: Thu, 27 Mar 2014 16:56:35 +0000
Subject: [OmniOS-discuss] Problem with using omnios-build
In-Reply-To: <29D2E00A4E2C9B4893E610929CA990A966AC817F@srv01.cblinux.co.uk>
References: <29D2E00A4E2C9B4893E610929CA990A966A6F515@srv01.cblinux.co.uk>
	<77965151-05DF-41EA-992A-6C48992AFBBE@omniti.com>
	<29D2E00A4E2C9B4893E610929CA990A966AC38EC@srv01.cblinux.co.uk>,
	<29D2E00A4E2C9B4893E610929CA990A966AC817F@srv01.cblinux.co.uk>
Message-ID: <29D2E00A4E2C9B4893E610929CA990A966ACA4E6@srv01.cblinux.co.uk>

HI just to say I've fixed it
so it all good now

thanks
Carl Brunning
________________________________________
From: Carl Brunning
Sent: 26 March 2014 22:45
To: Carl Brunning; Dan McDonald
Cc: omnios-discuss at lists.omniti.com
Subject: RE: [OmniOS-discuss] Problem with using omnios-build

Anyone
Sorry if am pain just want to learn from this
And why it not work

Thanks
carl brunning

-----Original Message-----
From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Carl Brunning
Sent: 26 March 2014 11:13
To: Dan McDonald
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] Problem with using omnios-build

Thanks for that yes that help me a little more Now I just got to work out why it has a problem uploading to the repo This is the error I get

PATH=/tmp/build_admin/pkg-1.0/pkg/src/pkg/../../proto/root_i386/usr/bin:/usr/sbin:/usr/bin  PYTHONPATH=/tmp/build_admin/pkg-1.0/pkg/src/pkg/../../proto/root_i386/usr/lib/python2.6/vendor-packages pkgsend -s http://repo.flamewarestudios.com:10000/ publish -d /tmp/build_admin/pkg-1.0/pkg/src/pkg/../../proto/root_i386 \
        -d license_files -T \*.py --fmri-in-manifest  \
        pkgtmp/SUNWipkg-brand.dep.res
pkgsend: Transfer from 'http://repo.removeed failed: api_errors.InvalidP5IFile:. (happened 4 times)

the repo is a openindiana system
could this be my problem now
I have got most of the other package compile and up load with no problems Just pkg one is the problem

Thanks

Carl Brunning
-----Original Message-----
From: Dan McDonald [mailto:danmcd at omniti.com]
Sent: 25 March 2014 17:15
To: Carl Brunning
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] Problem with using omnios-build


On Mar 25, 2014, at 12:27 PM, Carl Brunning <carlb at flamewarestudios.com> wrote:

> HI
> am playing with the omnios-build
> but when trying to do pkg build i find on the git clone it having a
> problem the line is this logcmd $GIT clone -b $PKG_BRANCH
> src at src.omniti.com:~omnios/core/pkg
>
> the branch is r151006

I fixed some things in "master" to this.  It should've just swapped out "src@" for "anon@".

> this is wanting a password but I don't know what it is i did see on
> the latest version of the build you have changed to this
>
> logcmd  $GIT clone -b omni
> anon at src.omniti.com:~omnios/core/illumos-omni-os

Hmmm.  This is a very old artifact.  The "master" version has this changeset:

osdev2(build/pkg)[0]% git show --stat 85f25c28 commit 85f25c28969c859fb9bc838a6779dd3a70286896
Author: Theo Schlossnagle <jesus at omniti.com>
Date:   Sat Mar 24 20:06:12 2012 +0000

    move pkg to git and make it work

 build/pkg/build.sh | 47 ++++++++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 19 deletions(-) osdev2(build/pkg)[0]%

that eliminates the need for the clone to happen.

Generally speaking, though, it's illumos-omnios:

        anon at src.omniti.com:~omnios/core/illumos-omnios

Perhaps fixing that will work?

> so why it asking for a password for the first one lol

That's a bug.  I didn't fix it in anywhere other than master, though.

> and what is it
> and i hope you fix it all

I'm curious if pkg can be re-rewhacked to use headers from a completed omnios build?  I'm introducing new features into omnios-build to make it completely fire-and-forget.  One new feature not yet back is the PREBUILT_OMNIOS feature, that can point packages that depend on a populated illumos-omnios proto area.  Perhaps I should include PREBUILT_OMNIOS support in pkg as well?!

Pardon any slowness on my part.  I'm new at pkg, and at the build outside of illumos-omnios itself.

Thanks,
Dan


_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss at lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


From danmcd at omniti.com  Thu Mar 27 17:10:46 2014
From: danmcd at omniti.com (Dan McDonald)
Date: Thu, 27 Mar 2014 13:10:46 -0400
Subject: [OmniOS-discuss] Problem with using omnios-build
In-Reply-To: <29D2E00A4E2C9B4893E610929CA990A966ACA4E6@srv01.cblinux.co.uk>
References: <29D2E00A4E2C9B4893E610929CA990A966A6F515@srv01.cblinux.co.uk>
	<77965151-05DF-41EA-992A-6C48992AFBBE@omniti.com>
	<29D2E00A4E2C9B4893E610929CA990A966AC38EC@srv01.cblinux.co.uk>,
	<29D2E00A4E2C9B4893E610929CA990A966AC817F@srv01.cblinux.co.uk>
	<29D2E00A4E2C9B4893E610929CA990A966ACA4E6@srv01.cblinux.co.uk>
Message-ID: <9B51789D-9DB2-4406-B57C-B694AE245C82@omniti.com>


On Mar 27, 2014, at 12:56 PM, Carl Brunning <carlb at flamewarestudios.com> wrote:

> HI just to say I've fixed it
> so it all good now

Good!  Sorry I didn't respond earlier.  A recent push into omnios-build has cause one of my works-in-progress some problems, so I've been debugging that.

How did you fix your problem?  Is it something we need to fix properly?  Do you want to contribute if it is?

Thanks,
Dan


From carlb at flamewarestudios.com  Thu Mar 27 17:38:48 2014
From: carlb at flamewarestudios.com (Carl Brunning)
Date: Thu, 27 Mar 2014 17:38:48 +0000
Subject: [OmniOS-discuss] Problem with using omnios-build
In-Reply-To: <9B51789D-9DB2-4406-B57C-B694AE245C82@omniti.com>
References: <29D2E00A4E2C9B4893E610929CA990A966A6F515@srv01.cblinux.co.uk>
	<77965151-05DF-41EA-992A-6C48992AFBBE@omniti.com>
	<29D2E00A4E2C9B4893E610929CA990A966AC38EC@srv01.cblinux.co.uk>,
	<29D2E00A4E2C9B4893E610929CA990A966AC817F@srv01.cblinux.co.uk>
	<29D2E00A4E2C9B4893E610929CA990A966ACA4E6@srv01.cblinux.co.uk>
	<9B51789D-9DB2-4406-B57C-B694AE245C82@omniti.com>
Message-ID: <29D2E00A4E2C9B4893E610929CA990A966ACB534@srv01.cblinux.co.uk>

Hay not a problem
What I found I had bad clock skew
So reset the clock on the build machine and on the repo machine so far has fixed that problem

Did not know bad clock can casue so much build problems lol
Even caused my illumos build problems as well

Anyway am all good now
Just to to see what my next problem is and fix it 

Keep up the good work 
I like the build scripts, have to see if they can be used for other os build lol

Thanks
Carl Brunning 

-----Original Message-----
From: Dan McDonald [mailto:danmcd at omniti.com] 
Sent: 27 March 2014 17:11
To: Carl Brunning
Cc: omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] Problem with using omnios-build


On Mar 27, 2014, at 12:56 PM, Carl Brunning <carlb at flamewarestudios.com> wrote:

> HI just to say I've fixed it
> so it all good now

Good!  Sorry I didn't respond earlier.  A recent push into omnios-build has cause one of my works-in-progress some problems, so I've been debugging that.

How did you fix your problem?  Is it something we need to fix properly?  Do you want to contribute if it is?

Thanks,
Dan


From nitram at konsortit.se  Fri Mar 28 15:33:17 2014
From: nitram at konsortit.se (Nitram Grebredna)
Date: Fri, 28 Mar 2014 16:33:17 +0100
Subject: [OmniOS-discuss] Multipathing,
 only one path visible - there ought to be two, what am i doing wrong?
In-Reply-To: <0AE3E26797567E4AAB5C53C304D024455DA1A23D@ns-ex2010.new-swankton.lan>
References: <CAEPZ3ejqKfxXH7w+a-dcnmURBEQdj5r759Ga0bkXUVvmt0BG5w@mail.gmail.com>
	<9A1D4D35-8F4D-4175-BFF7-A887846FCEBA@ferebee.net>
	<0AE3E26797567E4AAB5C53C304D024455DA1A23D@ns-ex2010.new-swankton.lan>
Message-ID: <CAEPZ3eiRx06DgWPp6fc3HfROowsLyPz=nQ8EAKzoVMsYacxL-g@mail.gmail.com>

Hi guys!

Thanks for the input. I've had a second look at the backplane and you were
right Chris, it's not dual port. Should i set mpxio-disable="yes" to
disable mpxio on the sas driver or should i leave it as is? My guess would
be to disable mpxio to actually see which controller holds which disk, this
to be able to spread the mirror-sets across multiple controllers. Would you
agree?

Best regards,

Martin


On Wed, Mar 26, 2014 at 8:02 PM, Russell Hansen <russhan at new-swankton.net>wrote:

> Because those disks don't have Sun/Oracle firmware I believe you need to
> update /kernel/drv/scsi_vhci.conf
>
> scsi-vhci-failover-override =
> "SEAGATE ST3300657SS", "f_sym",
> "SEAGATE ST4000NM0023", "f_sym";
>
> You can double-check the VID/PID string by running format -> disk# ->
> inquiry
>
> -Russ
>
>
>
> From: OmniOS-discuss [omnios-discuss-bounces at lists.omniti.com] on behalf
> of Chris Ferebee [cf at ferebee.net]
>
> Sent: Wednesday, March 26, 2014 10:33 AM
>
> To: omnios-discuss at lists.omniti.com
>
> Subject: Re: [OmniOS-discuss] Multipathing, only one path visible - there
> ought to be two, what am i doing wrong?
>
>
>
>
>
>
> Martin,
>
>
>
> Are you sure you have SAS expanders in your backplane? Supermicro will
> sell you the same chassis with or without expanders, with almost identical
> model numbers.
>
>
>
> You've described a typical JBOD configuration (i. e., no expanders): Each
> LSI 2308 has 8 SAS/SATA ports, 4 ports on each of 2 Mini-SAS SFF8087
> connectors. Thus, with 3 controllers you are running 3 x 8 = 24 SAS ports
> to the backplane.
>
>
>
> Do you have the exact model number of the chassis or backplane?
>
>
>
> Best,
> Chris
>
>
> Am 26.03.2014 um 14:05 schrieb Nitram Grebredna <nitram at konsortit.se>:
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> Hi!
>
>
>
>
> I'm having issues with multipathing, and i cant seem to figure out what is
> wrong.
>
>
>
>
>
> The setup is a Supermicro 24 disk-box with 3 controllers (1 pcs internal
> SAS2308, two 9207i-cards, same firmware on all units), identified as
> follows:
>
>
>
> Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
>
>
> ----------------------------------------------------------------------------
>
>
>
> 0  SAS2308_2(D1)   18.00.00.00    11.00.00.05    07.33.00.00
> 00:01:00:00
>
> 1  SAS2308_2(D1)   18.00.00.00    11.00.00.05    07.33.00.00
> 00:02:00:00
>
> 2  SAS2308_2(D1)   18.00.00.00    11.00.00.05    07.33.00.00
> 00:03:00:00
>
>
>
>
> The machine has 12 ST4000NM0023 (seagate 4TB DP) disks in it and a couple
> or bootdisks. The controllers are connected via 2 cables per controller to
> the backplane/expander. I've Installed latest stable omnios on it.
>
>
>
>
> Excerpt from dmesg:
>
>
>
> [...]
>
>
>
> genunix: [ID 936769
> kern.info] mpt_sas2 is /pci at 0,0/pci8086,e06 at 2,2/pci1000,3020 at 0
>
> scsi: [ID 583861
> kern.info] mpt_sas7 at mpt_sas2: scsi-iport 4
>
> genunix: [ID 936769
> kern.info] mpt_sas7 is /pci at 0,0/pci8086,e06 at 2,2/pci1000,3020 at 0/iport at 4
>
> genunix: [ID 408114
> kern.info] /pci at 0,0/pci8086,e06 at 2,2/pci1000,3020 at 0/iport at 4 (mpt_sas7)
> online
>
> scsi: [ID 583861
> kern.info] sd11 at scsi_vhci0: unit-address g5000c50057c1fce3: conf f_sym
>
> genunix: [ID 936769
> kern.info] sd11 is /scsi_vhci/disk at g5000c50057c1fce3
>
> genunix: [ID 408114
> kern.info] /scsi_vhci/disk at g5000c50057c1fce3 (sd11) online
>
> genunix: [ID 483743
> kern.info] /scsi_vhci/disk at g5000c50057c1fce3 (sd11) multipath status:
> degraded: path 4 mpt_sas15/disk at w5000c50057c1fce1,0 is online
>
> scsi: [ID 583861
> kern.info] mpt_sas11 at mpt_sas1: scsi-iport 2
>
> genunix: [ID 936769
> kern.info] mpt_sas11 is /pci at 0,0/pci8086,e04 at 2/pci1000,3020 at 0/iport at 2
>
> genunix: [ID 408114
> kern.info] /pci at 0,0/pci8086,e04 at 2/pci1000,3020 at 0/iport at 2 (mpt_sas11)
> online
>
> scsi: [ID 583861
> kern.info] sd2 at scsi_vhci0: unit-address g5000c50057ca74bb: conf f_sym
>
> genunix: [ID 936769
> kern.info] sd2 is /scsi_vhci/disk at g5000c50057ca74bb
>
> genunix: [ID 408114
> kern.info] /scsi_vhci/disk at g5000c50057ca74bb (sd2) online
>
>
>
> [...]
>
>
>
>
> Excerpt from mpathadm:
>
>
>
> # mpathadm list lu
>
>         /dev/rdsk/c1t5000C5006C0BF63Fd0s2
>
>                 Total Path Count: 1
>
>                 Operational Path Count: 1
>
>         /dev/rdsk/c1t5000C5006C0B29C7d0s2
>
>                 Total Path Count: 1
>
>                 Operational Path Count: 1
>
>         /dev/rdsk/c1t5000C50057C1F5BFd0s2
>
>                 Total Path Count: 1
>
>
>
> [...]
>
>
>
>
>
>
> Excerpt from format:
>
>
>
> # format
>
> Searching for disks...done
>
>
>
>
>
> AVAILABLE DISK SELECTIONS:
>
>        0. c1t5000C5006C0B29C7d0 <SEAGATE-ST3300657SS-0008 cyl 36469 alt 2
> hd 255 sec 63>
>
>           /scsi_vhci/disk at g5000c5006c0b29c7
>
>        1. c1t5000C5006C0BF63Fd0 <SEAGATE-ST3300657SS-0008 cyl 36469 alt 2
> hd 255 sec 63>
>
>           /scsi_vhci/disk at g5000c5006c0bf63f
>
>        2. c1t5000C50057C1F5BFd0 <SEAGATE-ST4000NM0023-0003-3.64TB>
>
>           /scsi_vhci/disk at g5000c50057c1f5bf
>
>        3. c1t5000C50057C1FCE3d0 <SEAGATE-ST4000NM0023-0003-3.64TB>
>
>           /scsi_vhci/disk at g5000c50057c1fce3
>
>
>
> [...]
>
>
>
>
> If i set mpxio-disable="yes"; in mpt_sas.conf the error above obviously
> dissapears and also i can see the 'real' device/controller id's when
> issuing the format command.
>
>
>
>
>
> If things were working correctly i assume i would see a total path count
> of 2 per disk, and the multipath status wouldn't be set as degraded in the
> log? What am i doing wrong? I've asked google and since they dont know the
> answer to the question i'd thought
>  i'd try a post here ;)
>
>
>
>
> Thanks in advance for any help,
>
>
>
>
> Martin
>
>
>
>
>
> _______________________________________________
>
> OmniOS-discuss mailing list
>
> OmniOS-discuss at lists.omniti.com
>
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
>
>
>
>
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140328/5291af4c/attachment-0001.html>

From 4omnios at nccg.de  Sat Mar 29 21:27:49 2014
From: 4omnios at nccg.de (4omnios at nccg.de)
Date: Sat, 29 Mar 2014 22:27:49 +0100
Subject: [OmniOS-discuss] (r151008) iscsi target, LU etc missing after reboot
Message-ID: <!&!AAAAAAAAAAAYAAAAAAAAAK8PMVSgEr9Jia+gY4035vvCnQAAEAAAAKzrpcEmC2hHj0oGAhKjFYMBAAAAAA==@nccg.de>

=== error symptoms
system with iscsi and SRP target. works great ... until reboot
 
then only SRP target shows up
  Target: eui.001A4BFFFF0C3218
but no iscsi target (iqn...), no LU. Neither do configured TG and HG exist
anymore.
svcs export/import stmf does not help
 
=== software versions
omnios-6de5e81, OmniOS v11 r151008
  * srpt pck was already installed, nothing else added
  * napp-it running version : 0.9e1_nightly Jan.25.2014
 
thx for any ideas how to solve
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20140329/ad53bf30/attachment.html>

From johan.kragsterman at capvert.se  Sun Mar 30 11:42:12 2014
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Sun, 30 Mar 2014 13:42:12 +0200
Subject: [OmniOS-discuss] Again: Infiniband, OVUF, subnet manager,
	etc. Garret?
Message-ID: <OF850FA80E.B7CAC43E-ONC1257CAB.0032F3B3-C1257CAB.00404A1B@inse.com>

Hi!


Back again with some infiniband questions:

I've been trying to read up around the "IB subnet manager" in omnios/illumos, and what I don't understand so far is, if it is included in OVUF or not...? There are very few information sources out there regarding this...

What I also THINK I understand, is that OVUF is included in the ofk package.

Or is this all libraries, no tools...? If there are tools around, pls inform me...

Anyone with information on the subject? Garret?

So why am I nagging about this subject in eternity...?

It is because I would like to implement a dual storage head failover solution with zil on IB. So I'd prefer it to be direct attach, and would like to avoid IB switches.


Best regards from/Med v?nliga h?lsningar fr?n

Johan Kragsterman

Capvert


From tobi at oetiker.ch  Mon Mar 31 14:16:08 2014
From: tobi at oetiker.ch (Tobias Oetiker)
Date: Mon, 31 Mar 2014 16:16:08 +0200 (CEST)
Subject: [OmniOS-discuss] zpool degraded while smart sais disks are OK
In-Reply-To: <DF7112ED-1B47-437F-98C7-5C72CB794349@richardelling.com>
References: <alpine.DEB.2.02.1403211743260.3125@froburg.oetiker.ch>
	<39B55A5A-AA04-4C56-8A74-5B9316861405@RichardElling.com>
	<CALLZ=vXbhBpBwvN5ninNy_ehYPujRa6WmFH3w_oze7pxX67Urw@mail.gmail.com>
	<alpine.DEB.2.02.1403212322540.3125@froburg.oetiker.ch>
	<0D51CBC0-D049-4A12-A733-7DDB6320BD82@richardelling.com>
	<alpine.DEB.2.02.1403220606480.3125@froburg.oetiker.ch>
	<DF7112ED-1B47-437F-98C7-5C72CB794349@richardelling.com>
Message-ID: <alpine.DEB.2.02.1403311613040.3125@froburg.oetiker.ch>

Hi Richard,

Mar 23 Richard Elling wrote:

>
> On Mar 21, 2014, at 10:13 PM, Tobias Oetiker <tobi at oetiker.ch> wrote:
>
> > Yesterday Richard Elling wrote:
> >
> >>
> >> On Mar 21, 2014, at 3:23 PM, Tobias Oetiker <tobi at oetiker.ch> wrote:
> >
> > [...]
> >>>
> >>> it happened over time as you can see from the timestamps in the
> >>> log. The errors from zfs's point of view were 1 read and about 30 write
> >>>
> >>> but according to smart the disks are without flaw
> >>
> >> Actually, SMART is pretty dumb. In most cases, it only looks for uncorrectable
> >> errors that are related to media or heads. For a clue to more permanent errors,
> >> you will want to look at the read/write error reports for errors that are
> >> corrected with possible delays. You can also look at the grown defects list.
> >>
> >> This behaviour is expected for drives with errors that are not being quickly
> >> corrected or have firmware bugs (horrors!) and where the disk does not do TLER
> >> (or its vendor's equivalent)
> >> -- richard
> >
> > the error counters look like this:
> >
> >
> > Error counter log:
> >           Errors Corrected by           Total   Correction     Gigabytes    Total
> >               ECC          rereads/    errors   algorithm      processed    uncorrected
> >           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
> > read:       3494        0         0      3494      44904        530.879           0
> > write:         0        0         0         0      39111       1793.323           0
> > verify:        0        0         0         0       8133          0.000           0
>
> Errors corrected without delay looks good. The problem lies elsewhere.
>
> >
> > the disk vendor is HGST in case anyone has further ideas ... the system has 20 of these disks and the problems occured with
> > three of them. The system has been running fine for two months previously.
>
> ...and yet there are aborted commands, likely due to a reset after a timeout.
> Resets aren't issued without cause.
>
> There are two different resets issued by the sd driver: LU and bus. If the
> LU reset doesn't work, the resets are escalated to bus. This is, of course,
> tunable, but is rarely tuned. A bus reset for SAS is a questionable practice,
> since SAS is a fabric, not a bus. But the effect of a device in the fabric
> being reset could be seen as aborted commands by more than one target. To
> troubleshoot these cases, you need to look at all of the devices in the data
> path and map the common causes: HBAs, expanders, enclosures, etc. Traverse
> the devices looking for errors, as you did with the disks. Useful tools:
> sasinfo, lsiutil/sas2ircu, smp_utils, sg3_utils, mpathadm, fmtopo.

thanks for the hints ... after detatching/attaching the 'failed'
disks, they got resilvered and a subsequent scrub did not detect
any errors ...

all a bit mysterious ... will keep an eye on the box to see how it
fares on the future ...

cheers
tobi


-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
www.oetiker.ch tobi at oetiker.ch +41 62 775 9902
*** We are hiring IT staff: www.oetiker.ch/jobs ***

From steve at linuxsuite.org  Mon Mar 31 19:04:10 2014
From: steve at linuxsuite.org (steve at linuxsuite.org)
Date: Mon, 31 Mar 2014 15:04:10 -0400
Subject: [OmniOS-discuss] How to disable ata module / driver at boot
Message-ID: <7409d33d8efc08eccda1cecdc31bd7ea.squirrel@emailmg.netfirms.com>


      Howdy!

         I have omnios running on Dell R710, and get these warnings for
device ata0. This device is a TEAC DVD ROM.

kern.warning<4>: Nov 11 09:30:05 dfs2 #011timeout: reset target, target=0
lun=0
kern.warning<4>: Nov 11 09:30:05 dfs2 scsi: [ID 107833 kern.warning]
WARNING: /pci at 0,0/pci-ide at 1f,2/ide at 0 (ata0):
kern.warning<4>: Nov 11 09:30:05 dfs2 #011timeout: reset bus, target=0 lun=0
kern.info<6>: Nov 11 09:35:56 dfs2 pci_autoconfig: [ID 595143 kern.info]
NOTICE: add io-range on subtractive ppb[0/1e/0]: 0x3000 ~ 0x3fff

         Then system hangs and needs to be power cycled..

kern.info<6>: Nov 11 09:35:56 dfs2 genunix: [ID 936769 kern.info] pseudo0
is /pseudo

          May not be related, but I would like to reboot so that OmniOS
does not
see the device by not loading the driver / module. I do not need the
device after
system install..

         What is the best way to do this?

         thanx - steve


From jdg117 at elvis.arl.psu.edu  Mon Mar 31 23:31:24 2014
From: jdg117 at elvis.arl.psu.edu (John D Groenveld)
Date: Mon, 31 Mar 2014 19:31:24 -0400
Subject: [OmniOS-discuss] How to disable ata module / driver at boot
In-Reply-To: Your message of "Mon, 31 Mar 2014 15:04:10 EDT."
	<7409d33d8efc08eccda1cecdc31bd7ea.squirrel@emailmg.netfirms.com> 
References: <7409d33d8efc08eccda1cecdc31bd7ea.squirrel@emailmg.netfirms.com> 
Message-ID: <201403312331.s2VNVOIW011926@elvis.arl.psu.edu>

In message <7409d33d8efc08eccda1cecdc31bd7ea.squirrel at emailmg.netfirms.com>, st
eve at linuxsuite.org writes:
>          May not be related, but I would like to reboot so that OmniOS
>does not
>see the device by not loading the driver / module. I do not need the
>device after
>system install..

disable-ata=true
<URL:http://permalink.gmane.org/gmane.os.solaris.opensolaris.indiana/8851>

John
groenveld at acm.org