[OmniOS-discuss] zpool upgrade
Hafiz Rafibeyli
rafibeyli at gmail.com
Tue Nov 19 15:40:53 UTC 2013
Hello,
are there any side effects from upgrading a ZFS pool to feature flags?
I tested it on my test server, but is it safe for production ZFS pools?
# zpool upgrade ydkpool
This system supports ZFS pool feature flags.
Successfully upgraded 'ydkpool' from version 28 to feature flags.
Enabled the following features on 'ydkpool':
async_destroy
empty_bpobj
lz4_compress
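Before touching a production pool, it is worth previewing what the upgrade will do. A minimal dry-run sketch (it only prints the commands, nothing is executed; `ydkpool` is the pool name from the output above):

```shell
# Dry-run sketch: print the checks worth running before a production
# 'zpool upgrade'. It only echoes the commands; nothing touches the pool.
preview_upgrade() {
  pool="$1"
  echo "zpool status -x $pool"    # confirm the pool is healthy first
  echo "zpool upgrade -v"         # list the feature flags this system supports
  echo "zpool get version $pool"  # current on-disk version (28 in the post)
  echo "zpool upgrade $pool"      # one-way: v28-only software can't import afterward
}

preview_upgrade ydkpool
```

The main caveat for production: the upgrade itself is safe and online, but it cannot be rolled back, so any failover host or recovery environment that only understands pool version 28 will no longer be able to import the pool afterward.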
----- Original Message -----
From: omnios-discuss-request at lists.omniti.com
To: omnios-discuss at lists.omniti.com
Sent: Tuesday, 19 November 2013 0:42:32
Subject: OmniOS-discuss Digest, Vol 20, Issue 19
Send OmniOS-discuss mailing list submissions to
omnios-discuss at lists.omniti.com
To subscribe or unsubscribe via the World Wide Web, visit
http://lists.omniti.com/mailman/listinfo/omnios-discuss
or, via email, send a message with subject or body 'help' to
omnios-discuss-request at lists.omniti.com
You can reach the person managing the list at
omnios-discuss-owner at lists.omniti.com
When replying, please edit your Subject line so it is more specific
than "Re: Contents of OmniOS-discuss digest..."
Today's Topics:
1. Re: kernel panic - anon_decref (wuffers)
2. Re: kernel panic - anon_decref (wuffers)
----------------------------------------------------------------------
Message: 1
Date: Sat, 16 Nov 2013 02:48:43 -0500
From: wuffers <moo at wuffers.net>
To: Saso Kiselkov <skiselkov.ml at gmail.com>
Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
Subject: Re: [OmniOS-discuss] kernel panic - anon_decref
Message-ID:
<CA+tR_KyemANkTZyKs-1ejp+vzSGkZsDoBc=hhMqh-Qcu8AsvDA at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
When it rains, it pours. With r151006y, I had two kernel panics in quick
succession while trying to create some eager zeroed thick disks (4 at the
same time) in ESXi. They are now "kernel heap corruption detected" instead
of anon_decref.
Kernel panic 2 (dump info:
https://drive.google.com/file/d/0B7mCJnZUzJPKMHhqZHJnaDEzYkk)
http://i.imgur.com/eIssxmc.png?1
http://i.imgur.com/MXJy4zP.png?1
TIME UUID
SUNW-MSG-ID
Nov 16 2013 00:51:24.912170000 5998ba1e-3aa5-ccac-e885-be4897cfcfe8
SUNOS-8000-KL
TIME CLASS ENA
Nov 16 00:51:24.8638 ireport.os.sunos.panic.dump_available
0x0000000000000000
Nov 16 00:49:58.8671 ireport.os.sunos.panic.dump_pending_on_device
0x0000000000000000
nvlist version: 0
version = 0x0
class = list.suspect
uuid = 5998ba1e-3aa5-ccac-e885-be4897cfcfe8
code = SUNOS-8000-KL
diag-time = 1384581084 866703
de = fmd:///module/software-diagnosis
fault-list-sz = 0x1
fault-list = (array of embedded nvlists)
(start fault-list[0])
nvlist version: 0
version = 0x0
class = defect.sunos.kernel.panic
certainty = 0x64
asru =
sw:///:path=/var/crash/unknown/.5998ba1e-3aa5-ccac-e885-be4897cfcfe8
resource =
sw:///:path=/var/crash/unknown/.5998ba1e-3aa5-ccac-e885-be4897cfcfe8
savecore-succcess = 1
dump-dir = /var/crash/unknown
dump-files = vmdump.1
os-instance-uuid = 5998ba1e-3aa5-ccac-e885-be4897cfcfe8
panicstr = kernel heap corruption detected
panicstack = fffffffffba49c04 () |
genunix:kmem_slab_free+c1 () | genunix:kmem_magazine_destroy+6e () |
genunix:kmem_depot_ws_reap+5d () | genunix:kmem_cache_magazine_purge+118 ()
| genunix:kmem_cache_magazine_resize+40 () | genunix:taskq_thread+2d0 () |
unix:thread_start+8 () |
crashtime = 1384577735
panic-time = Fri Nov 15 23:55:35 2013 EST
(end fault-list[0])
fault-status = 0x1
severity = Major
__ttl = 0x1
__tod = 0x528707dc 0x365e9c10
Kernel panic 3 (dump info:
https://drive.google.com/file/d/0B7mCJnZUzJPKbnZIeWZzQjhUOTQ):
(looked the same, no screenshots)
TIME UUID
SUNW-MSG-ID
Nov 16 2013 01:44:43.327489000 a6592c60-199f-ead5-9586-ff013bf5ab2d
SUNOS-8000-KL
TIME CLASS ENA
Nov 16 01:44:43.2941 ireport.os.sunos.panic.dump_available
0x0000000000000000
Nov 16 01:44:03.5356 ireport.os.sunos.panic.dump_pending_on_device
0x0000000000000000
nvlist version: 0
version = 0x0
class = list.suspect
uuid = a6592c60-199f-ead5-9586-ff013bf5ab2d
code = SUNOS-8000-KL
diag-time = 1384584283 296816
de = fmd:///module/software-diagnosis
fault-list-sz = 0x1
fault-list = (array of embedded nvlists)
(start fault-list[0])
nvlist version: 0
version = 0x0
class = defect.sunos.kernel.panic
certainty = 0x64
asru =
sw:///:path=/var/crash/unknown/.a6592c60-199f-ead5-9586-ff013bf5ab2d
resource =
sw:///:path=/var/crash/unknown/.a6592c60-199f-ead5-9586-ff013bf5ab2d
savecore-succcess = 1
dump-dir = /var/crash/unknown
dump-files = vmdump.2
os-instance-uuid = a6592c60-199f-ead5-9586-ff013bf5ab2d
panicstr = kernel heap corruption detected
panicstack = fffffffffba49c04 () |
genunix:kmem_slab_free+c1 () | genunix:kmem_magazine_destroy+6e () |
genunix:kmem_cache_magazine_purge+dc () |
genunix:kmem_cache_magazine_resize+40 () | genunix:taskq_thread+2d0 () |
unix:thread_start+8 () |
crashtime = 1384582658
panic-time = Sat Nov 16 01:17:38 2013 EST
(end fault-list[0])
fault-status = 0x1
severity = Major
__ttl = 0x1
__tod = 0x5287145b 0x138515e8
---
Now, having looked through all 3, I can see in the first two there were
some warnings:
WARNING: /pci@0,0/pci8086,3c08@3/pci1000,3030@0 (mpt_sas1):
mptsas_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120303
The /var/adm/messages file also had a sprinkling of these:
Nov 15 23:36:43 san1 scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci8086,3c08@3/pci1000,3030@0 (mpt_sas1):
Nov 15 23:36:43 san1 mptsas_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31120303
Nov 15 23:36:43 san1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,3c08@3/pci1000,3030@0 (mpt_sas1):
Nov 15 23:36:43 san1 Log info 0x31120303 received for target 10.
Nov 15 23:36:43 san1 scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
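For what it's worth, a quick way to see how often each target is throwing these events is to tally them straight out of the log. A small sketch, assuming the exact "Log info ... received for target N." format quoted above:

```shell
# Tally mpt_sas "Log info ... received for target N." warnings per target.
count_mptsas_targets() {
  grep "received for target" "$1" |
    sed 's/\.$//' |            # drop the trailing period after the target number
    awk '{print "target", $NF}' |
    sort | uniq -c | sort -rn  # most frequently logged targets first
}

# usage: count_mptsas_targets /var/adm/messages
```

If one target dominates the count, that is the disk to map back to a physical device and check first.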
Following this
http://lists.omniti.com/pipermail/omnios-discuss/2013-March/000544.html to
map the target disk, it's my STEC ZeusRAM ZIL drive, which is configured as
a mirror (if I've done it right). I didn't see these errors in the 3rd dump,
so I don't know whether it's contributing. I may run a memtest on the
system tomorrow just in case it's a hardware issue.
My zpool status shows all my drives okay with no known data errors.
Not sure how to proceed from here; my Hyper-V hosts have been using the
SAN with no issues for the 2+ months it's been up and configured, using
SRP over IB. I'd expect the VM hosts to crash before my SAN does.
Of course, I can make the vmdump.x files available to anyone who wants to
look at them (7GB, 8GB, 4GB).
------------------------------
Message: 2
Date: Mon, 18 Nov 2013 17:42:31 -0500
From: wuffers <moo at wuffers.net>
To: Saso Kiselkov <skiselkov.ml at gmail.com>
Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
Subject: Re: [OmniOS-discuss] kernel panic - anon_decref
Message-ID:
<CA+tR_Kx_hShMmt9mxEuje=Wy+CxnPoqBBt=SWNdTH=ttfnK1EA at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
Just to add to this, I had a 4th kernel panic, of a 3rd distinct type. I
ran a memtest on the unit after this last panic, and it completed
successfully (24+ hours). I'm skeptical that it's memory, or something to
do with the IOCLogInfo=0x31120303 error (the last 2 panics didn't have
that; I may start another thread on it), as I've been running this config
with Hyper-V hosts just fine. Adding an ESXi host (just one for now) into
the mix seems to make things unstable.
Should I file an issue in the illumos bug tracker (
https://www.illumos.org/projects/illumos-gate/issues/new), and if so, just
one report or one for each panic type?
List of kernel panics so far:
Panic 1: anon_decref: slot count 0
Panic 2-3: kernel heap corruption detected
Panic 4: BAD TRAP: type=e (#pf Page fault) rp=ffffff01e97d7a70 addr=1500010
occurred in module "genunix" due to an illegal access to a user address
Latest crash file here:
https://drive.google.com/file/d/0B7mCJnZUzJPKWW83TFBhVHpVajQ
TIME UUID
SUNW-MSG-ID
Nov 17 2013 09:22:20.799446000 9d55f532-d39f-4dea-8f57-d3b24c8e9dff
SUNOS-8000-KL
TIME CLASS ENA
Nov 17 09:22:20.7654 ireport.os.sunos.panic.dump_available
0x0000000000000000
Nov 17 09:21:14.0267 ireport.os.sunos.panic.dump_pending_on_device
0x0000000000000000
nvlist version: 0
version = 0x0
class = list.suspect
uuid = 9d55f532-d39f-4dea-8f57-d3b24c8e9dff
code = SUNOS-8000-KL
diag-time = 1384698140 767808
de = fmd:///module/software-diagnosis
fault-list-sz = 0x1
fault-list = (array of embedded nvlists)
(start fault-list[0])
nvlist version: 0
version = 0x0
class = defect.sunos.kernel.panic
certainty = 0x64
asru =
sw:///:path=/var/crash/unknown/.9d55f532-d39f-4dea-8f57-d3b24c8e9dff
resource =
sw:///:path=/var/crash/unknown/.9d55f532-d39f-4dea-8f57-d3b24c8e9dff
savecore-succcess = 1
dump-dir = /var/crash/unknown
dump-files = vmdump.3
os-instance-uuid = 9d55f532-d39f-4dea-8f57-d3b24c8e9dff
panicstr = BAD TRAP: type=e (#pf Page fault)
rp=ffffff01e97d7a70 addr=1500010 occurred in module "genunix" due to an
illegal access to a user address
panicstack = unix:die+df () | unix:trap+db3 () |
unix:cmntrap+e6 () | genunix:anon_decref+35 () | genunix:anon_free+74 () |
genunix:segvn_free+242 () | genunix:seg_free+30 () |
genunix:segvn_unmap+cde () | genunix:as_free+e7 () | genunix:relvm+220 () |
genunix:proc_exit+454 () | genunix:exit+15 () | genunix:rexit+18 () |
unix:brand_sys_sysenter+1c9 () |
crashtime = 1384592942
panic-time = Sat Nov 16 04:09:02 2013 EST
(end fault-list[0])
fault-status = 0x1
severity = Major
__ttl = 0x1
__tod = 0x5288d11c 0x2fa693f0
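For anyone who grabs the dumps: the usual first step on illumos is to expand the compressed vmdump with savecore and read the panic summary with mdb. A sketch that just writes a small triage script (the file name vmdump.3 and dump directory /var/crash/unknown come from the FMA output above; run the generated script as root on the box itself):

```shell
# Sketch: generate a triage script that expands a compressed crash dump and
# prints the panic summary. Nothing is executed here; run it on the SAN.
cat > panic-triage.sh <<'EOF'
#!/bin/sh
cd /var/crash/unknown || exit 1
savecore -vf vmdump.3        # expand vmdump.3 into unix.3 + vmcore.3
mdb unix.3 vmcore.3 <<'MDB'
::status
::panicinfo
::stack
MDB
EOF
chmod +x panic-triage.sh
```

`::panicinfo` and `::stack` should reproduce the panicstr and panicstack seen in the FMA nvlists above, and from there `::status` gives the basics before digging into the kmem caches.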
------------------------------
------------------------------
End of OmniOS-discuss Digest, Vol 20, Issue 19
**********************************************