From chip at innovates.com Sun May 4 13:39:10 2014 From: chip at innovates.com (Schweiss, Chip) Date: Sun, 4 May 2014 08:39:10 -0500 Subject: [OmniOS-discuss] [OpenIndiana-discuss] [developer] HBA recommended except LSI and ARECA In-Reply-To: References: <63594316-D19C-4292-B406-C8BADE5ED398@gmail.com> <5360c274.RBQHHg38Y+/JCe8C%Joerg.Schilling@fokus.fraunhofer.de> <5A21CFF6-06FA-45E5-8C25-A680B41569EB@gmail.com> <732EA48F-DCCC-4526-8F33-ACE7E8D1C340@gmail.com> <20140430162904.GB422@joyent.com> <20140430172628.GA776@joyent.com> <20140430184712.GA1017@joyent.com> Message-ID: The place that good SATA support is needed is not for spinning disk, but for SSD. The price delta is huge and even consumer grade SSDs are ideal for L2ARC. -Chip On May 4, 2014 4:43 AM, "Fred Liu" wrote: > > [fred]: ok. Let's see how it goes after I get the hba. > > [fred]: I have got 6805H HBA. It can recognize the sata drives in bios but > these drives cannot be detected in the latest illumos(smartos,oi) release. > I haven't got a sas drive to test. But the price delta > between sas and sata drive under the same capacity is not small like USD30 > at all. ?. > > Thanks. > > Fred > _______________________________________________ > OpenIndiana-discuss mailing list > OpenIndiana-discuss at openindiana.org > http://openindiana.org/mailman/listinfo/openindiana-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmabis at vmware.com Sun May 4 23:22:53 2014 From: mmabis at vmware.com (Matthew Mabis) Date: Sun, 4 May 2014 16:22:53 -0700 (PDT) Subject: [OmniOS-discuss] Kernel Panic - Possibly SMB? In-Reply-To: References: <63594316-D19C-4292-B406-C8BADE5ED398@gmail.com> <20140430172628.GA776@joyent.com> <20140430184712.GA1017@joyent.com> Message-ID: <339412184.1866632.1399245773817.JavaMail.root@vmware.com> Hey all wondering if someone could help me figure out what just happened, i had a kernel panic that i dont understand what caused it based on what i see in the stack it might have been SMB, if you need more like the dump file itself let me know! Any help would greatly be appreciated!
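A rough sketch of how the compressed dump itself can be opened up, in case more than the fmdump summary below is wanted; the directory and file names come from the fmdump output that follows, everything else is generic illumos crash-dump handling and not specific to this report:

  cd /var/crash/unknown
  savecore -vf vmdump.0 .    # expands the compressed vmdump.0 into unix.0 and vmcore.0
  mdb unix.0 vmcore.0
  > ::status
  > $C

::status repeats the panic string and dump summary, and $C walks the stack with frame pointers and arguments; an smb_xa_t address picked out of the smbsrv frames could then be examined by hand with ::print smb_xa_t.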
root at destiny:~# fmdump -Vp -u ae3350e8-17d5-4701-ecc6-b46fbaf1d04b |more TIME UUID SUNW-MSG-ID May 04 2014 17:16:03.161236000 ae3350e8-17d5-4701-ecc6-b46fbaf1d04b SUNOS-8000-KL TIME CLASS ENA May 04 17:16:03.1244 ireport.os.sunos.panic.dump_available 0x0000000000000000 May 04 17:15:23.3085 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000 nvlist version: 0 version = 0x0 class = list.suspect uuid = ae3350e8-17d5-4701-ecc6-b46fbaf1d04b code = SUNOS-8000-KL diag-time = 1399245363 130378 de = fmd:///module/software-diagnosis fault-list-sz = 0x1 fault-list = (array of embedded nvlists) (start fault-list[0]) nvlist version: 0 version = 0x0 class = defect.sunos.kernel.panic certainty = 0x64 asru = sw:///:path=/var/crash/unknown/.ae3350e8-17d5-4701-ecc6-b46fbaf1d04b resource = sw:///:path=/var/crash/unknown/.ae3350e8-17d5-4701-ecc6-b46fbaf1d04b savecore-succcess = 1 dump-dir = /var/crash/unknown dump-files = vmdump.0 os-instance-uuid = ae3350e8-17d5-4701-ecc6-b46fbaf1d04b panicstr = BAD TRAP: type=e (#pf Page fault) rp=ffffff001e907450 addr=ffffff05ce4c90d8 panicstack = unix:real_mode_stop_cpu_stage2_end+9de3 () | unix:trap+db3 () | unix:cmntrap+e6 () | smbsrv:smb_fsop_lookup+118 () | smbsrv:smb_common_rename+d9 () | smbsrv:smb_tr ans2_rename+136 () | smbsrv:smb_set_rename_info+b8 () | smbsrv:smb_set_fileinfo+ed () | smbsrv:smb_set_by_fid+b0 () | smbsrv:smb_com_trans2_set_file_information+58 () | smbsrv:smb_trans2_dispa tch+313 () | smbsrv:smb_com_transaction2+1a7 () | smbsrv:smb_dispatch_request+662 () | smbsrv:smb_session_worker+a0 () | genunix:taskq_d_thread+b7 () | unix:thread_start+8 () | crashtime = 1399244984 panic-time = Sun May 4 17:09:44 2014 MDT (end fault-list[0]) severity = Major __ttl = 0x1 __tod = 0x5366ca33 0x99c4420 Matt Mabis -------------- next part -------------- An HTML attachment was scrubbed... URL: From filip.marvan at aira.cz Mon May 5 10:32:50 2014 From: filip.marvan at aira.cz (Filip Marvan) Date: Mon, 5 May 2014 12:32:50 +0200 Subject: [OmniOS-discuss] Strange ARC reads numbers Message-ID: <3BE0DEED8863E5429BAE4CAEDF6245650365044D4776@AIRA-SRV.aira.local> Hello, I have storage server with OmniOS LTS and 64 GB RAM. This server was installed one year ago. There are about 50 ZVOLs on that server and they are shared through Comstar iSCSI to KVM servers, which are using these ZVOLs for virtual servers. After installation, I created a simple Munin script for monitoring ARC and L2ARC, which is using arcstat.pl script from Mike Harsch (http://blog.harschsystems.com/2010/09/08/arcstat-pl-updated-for-l2arc-stati stics/). Last week, there was realy huge drop in ARC read statistics (you can see that in attachment). On that graph you can see Total ARC accesses per second. The only thing, that we done in that time, was deletion of about 5 unused ZVOLs with many snapshots and some clones. There was no change in virtual servers load. Is there anyone who have any idea, why deleting some data (ZVOLs and snapshots) have so dramatic efect on ARC accesses? There was more that 60% of free space on that ZFS pool. Thank you! Filip Marvan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: arc_reads.png Type: image/png Size: 19039 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 6247 bytes Desc: not available URL: From Fred_Liu at issi.com Tue May 6 04:32:19 2014 From: Fred_Liu at issi.com (Fred Liu) Date: Mon, 5 May 2014 21:32:19 -0700 Subject: [OmniOS-discuss] Returned mail: List unknown Message-ID: Is mail address is case-sensitive? > -----Original Message----- > From: Fred Liu > Sent: ???, ?? 06, 2014 12:29 > To: 'OmniOS-discuss at lists.omniti.com' > Subject: RE: Returned mail: List unknown > > > > > -----Original Message----- > > From: Fred Liu > > Sent: ???, ?? 06, 2014 12:27 > > To: 'omniOS-discuss at lists.omniti.com' > > Subject: FW: Returned mail: List unknown > > > > Why do I always get this bouncing mail? > > > > -----Original Message----- > > From: MAILER-DAEMON at lists.omniti.com [mailto:MAILER- > > DAEMON at lists.omniti.com] > > Sent: ???, ?? 06, 2014 12:22 > > To: Fred Liu > > Subject: Returned mail: List unknown > > > > Your mail for OmniOS-discuss at lists.omniti.com could not be sent: > > no list named "OmniOS-discuss" is known by lists.omniti.com > > > > For a list of publicly-advertised mailing lists hosted on this server, > > visit this URL: > > http://lists.omniti.com/ > > > > If this does not resolve your problem, you may write to: > > postmaster at lists.omniti.com > > or > > mailman-owner at lists.omniti.com > > > > > > lists.omniti.com delivers e-mail to registered mailing lists and to > > the administrative addresses defined and required by IETF Request for > > Comments (RFC) 2142 [1]. > > > > Personal e-mail addresses are not offered by this server. > > > > The Internet Engineering Task Force [2] (IETF) oversees the > > development of open standards for the Internet community, including > > the protocols and formats employed by Internet mail systems. > > > > For your convenience, your original mail is attached. > > > > > > [1] Crocker, D. "Mailbox Names for Common Services, Roles and > > Functions". http://www.ietf.org/rfc/rfc2142.txt > > > > [2] http://www.ietf.org/ From paladinemishakal at gmail.com Tue May 6 06:27:59 2014 From: paladinemishakal at gmail.com (Lawrence Giam) Date: Tue, 6 May 2014 14:27:59 +0800 Subject: [OmniOS-discuss] Upgrade procedure for r151008j to r151008t Message-ID: Hi All, I am looking at upgrading or updating r151008j to r151008t. I would like to know how to do it, is there a wiki on upgrading OmniOS? Can someone provide me some info? Thanks & Regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From paladinemishakal at gmail.com Tue May 6 06:51:21 2014 From: paladinemishakal at gmail.com (Lawrence Giam) Date: Tue, 6 May 2014 14:51:21 +0800 Subject: [OmniOS-discuss] Strange issue with OmniOS installation Message-ID: Hi All, I have a SuperMicro SSG-2026T-DE2R24L with X8DTS-F-2U motherboard connected to an external chassis with JBOD and I have the following disks: Main Chassis: 1. HDD Bay 0 - Toshiba 600GB SAS HDD 2. HDD Bay 1 to 9 - Seagate 1TB SAS HDD External Chassis: 1. HDD Bay 0 - Seagate Pulsar.2 100GB SAS SSD 2. HDD Bay 1 - Seagate Pulsar.2 200GB SAS SSD 3. HDD Bay 2 - 12 - Seagate 1TB SAS HDD I booted the server and using IPMI, I was trying to install OmniOS r151008j and I am unable to see the 600GB SAS HDD in the installer GUI at the section where the system search for disks to install the OS. The installer can see all the other disks except the 600GB disk. I dropped to the shell and using format, the system is able to detect and show me the 600GB disk. 
Going back to the installer and tried again, still cannot see. In the end, I have to remove all the disks and left only the 600GB disk in the system, boot to OmniOS installer and it can see the disk and also able to install. I would like to know if anybody experience this? Also another thing I want to brought up is that the installer GUI, if I have several disks of the same kind and size, it is almost impossible to identify which one as the installer cannot display the full disk ID. Examples of what I see in the installer: Disk 1: c4t5000C5 Disk 2: c4t5000C5 Disk 3: c4t5000C5 The actual disk ID when using "format" or "iostat -En": Disk 1: c4t5000C50057E39C5Fd0 Disk 2: c4t5000C50057E65663d0 Disk 3: c4t5000C50057E765A7d0 Is there a way for the installer to display the full disk ID? Thanks & Regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben at fluffy.co.uk Tue May 6 07:18:36 2014 From: ben at fluffy.co.uk (Ben Summers) Date: Tue, 6 May 2014 08:18:36 +0100 Subject: [OmniOS-discuss] Upgrade procedure for r151008j to r151008t In-Reply-To: References: Message-ID: On 6 May 2014, at 07:27, Lawrence Giam wrote: > Hi All, > > I am looking at upgrading or updating r151008j to r151008t. I would like to know how to do it, is there a wiki on upgrading OmniOS? Can someone provide me some info? http://omnios.omniti.com/wiki.php/ReleaseNotes http://omnios.omniti.com/wiki.php/GeneralAdministration#Upgrading http://omnios.omniti.com/wiki.php/GeneralAdministration#UpgradingWithNon-GlobalZones Ben -- http://bens.me.uk From zmalone at omniti.com Tue May 6 14:33:33 2014 From: zmalone at omniti.com (Zach Malone) Date: Tue, 6 May 2014 10:33:33 -0400 Subject: [OmniOS-discuss] Upgrade procedure for r151008j to r151008t In-Reply-To: References: Message-ID: If you have no zones, it should just be: # pkg update --be-name=omnios-r151008t entire at 11,5.11-0.151008 http://omnios.omniti.com/wiki.php/Upgrade_r151006_r151008#PerformtheUpgrade has an example of inter-release updates, but inside a release, you can still just run $ pkg update to get the latest packages. The --be-name option will name the new boot environment, so you can easily roll back, and the "entire at 11,5.11-0.151008" section will keep you from updating any non-release packages. (or you can read the links that Ben Summers posted) --Zach Malone On Tue, May 6, 2014 at 2:27 AM, Lawrence Giam wrote: > Hi All, > > I am looking at upgrading or updating r151008j to r151008t. I would like to > know how to do it, is there a wiki on upgrading OmniOS? Can someone provide > me some info? > > Thanks & Regards. > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > From richard.elling at richardelling.com Wed May 7 01:56:03 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Tue, 6 May 2014 18:56:03 -0700 Subject: [OmniOS-discuss] Strange ARC reads numbers In-Reply-To: <3BE0DEED8863E5429BAE4CAEDF6245650365044D4776@AIRA-SRV.aira.local> References: <3BE0DEED8863E5429BAE4CAEDF6245650365044D4776@AIRA-SRV.aira.local> Message-ID: Hi Filip, On May 5, 2014, at 3:32 AM, Filip Marvan wrote: > Hello, > > I have storage server with OmniOS LTS and 64 GB RAM. This server was installed one year ago. There are about 50 ZVOLs on that server and they are shared through Comstar iSCSI to KVM servers, which are using these ZVOLs for virtual servers. 
> > After installation, I created a simple Munin script for monitoring ARC and L2ARC, which is using arcstat.pl script from Mike Harsch (http://blog.harschsystems.com/2010/09/08/arcstat-pl-updated-for-l2arc-statistics/). > > Last week, there was realy huge drop in ARC read statistics (you can see that in attachment). On that graph you can see Total ARC accesses per second. The only thing, that we done in that time, was deletion of about 5 unused ZVOLs with many snapshots and some clones. There was no change in virtual servers load. > > Is there anyone who have any idea, why deleting some data (ZVOLs and snapshots) have so dramatic efect on ARC accesses? There was more that 60% of free space on that ZFS pool. There are two primary reasons for reduction in the number of ARC reads. 1. the workload isn't reading as much as it used to 2. the latency of reads has increased 3. your measurement is b0rken there are three reasons... The data you shared clearly shows reduction in reads, but doesn't contain the answers to the cause. Usually, if #2 is the case, then the phone will be ringing with angry customers on the other end. If the above 3 are not the case, then perhaps it is something more subtle. The arcstat reads does not record the size of the read. To get the read size for zvols is a little tricky, you can infer it from the pool statistics in iostat. The subtleness here is that if the volblocksize is different between the old and new zvols, then the number of (block) reads will be different for the same workload. -- richard > > Thank you! > Filip Marvan > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss -- Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... URL: From filip.marvan at aira.cz Wed May 7 08:44:08 2014 From: filip.marvan at aira.cz (Filip Marvan) Date: Wed, 7 May 2014 10:44:08 +0200 Subject: [OmniOS-discuss] Strange ARC reads numbers In-Reply-To: References: <3BE0DEED8863E5429BAE4CAEDF6245650365044D4776@AIRA-SRV.aira.local> Message-ID: <3BE0DEED8863E5429BAE4CAEDF6245650365045016A2@AIRA-SRV.aira.local> Hi Richard, thank you for your reply. 1. Workload is still the same or very similar. Zvols, which we deleted from our pool were disconnected from KVM server a few days before, so the only change was, that we deleted that zvols with all snapshots. 2. As you wrote, our customers are fine for now :) We have monitoring of all our virtual servers running from that storage server, and there is no noticeable change in workload or latencies. 3. That could be the reason, of course. But in the graph are only data from arcstat.pl script. We can see, that arcstat is reporting heavy read accesses every 5 seconds (propably some update of ARC after ZFS writes data to disks from ZIL? All of them are marked as "cache hits" by arcstat script) and with only few ARC accesses between that 5 seconds periody. Before we deleted that zvols (about 0.7 TB data from 10 TB pool, which have 5 TB of free space) there were about 40k accesses every 5 seconds, now there are no more than 2k accesses every 5 seconds. Most of our zvols have 8K volblocksize (including deleted zvols), only few have 64K. Unfortunately I have no data about size of the read before that change. But we have two more storage servers, with similary high ARC read accesses every 5 seconds as on the first pool before deletion. 
Maybe I should try to delete some data on that pools and see what happen with more detailed monitoring. Thank you, Filip ________________________________ From: Richard Elling [mailto:richard.elling at richardelling.com] Sent: Wednesday, May 07, 2014 3:56 AM To: Filip Marvan Cc: omnios-discuss at lists.omniti.com Subject: Re: [OmniOS-discuss] Strange ARC reads numbers Hi Filip, There are two primary reasons for reduction in the number of ARC reads. 1. the workload isn't reading as much as it used to 2. the latency of reads has increased 3. your measurement is b0rken there are three reasons... The data you shared clearly shows reduction in reads, but doesn't contain the answers to the cause. Usually, if #2 is the case, then the phone will be ringing with angry customers on the other end. If the above 3 are not the case, then perhaps it is something more subtle. The arcstat reads does not record the size of the read. To get the read size for zvols is a little tricky, you can infer it from the pool statistics in iostat. The subtleness here is that if the volblocksize is different between the old and new zvols, then the number of (block) reads will be different for the same workload. -- richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmabis at vmware.com Wed May 7 14:01:55 2014 From: mmabis at vmware.com (Matthew Mabis) Date: Wed, 7 May 2014 07:01:55 -0700 (PDT) Subject: [OmniOS-discuss] [developer] Kernel Panic - Possibly SMB? In-Reply-To: References: <63594316-D19C-4292-B406-C8BADE5ED398@gmail.com> <20140430184712.GA1017@joyent.com> <339412184.1866632.1399245773817.JavaMail.root@vmware.com> Message-ID: <1087943953.2562992.1399471315033.JavaMail.root@vmware.com> I haven't been able to reproduce this issue, I believe a NC Replication (From Different Pools on the Same Host) was going on while SMB was being actively used. How would i look at the " In particular, look at the smb_xa_t it's working on, and the param+data buffers. Are they all there?" Any help would be greatly appreciated! Matt Mabis ----- Original Message ----- From: "Gordon Ross" To: "Matthew Mabis" Cc: developer at lists.illumos.org, "omnios-discuss" Sent: Tuesday, May 6, 2014 9:07:23 AM Subject: Re: [developer] Kernel Panic - Possibly SMB? Is this reproducible? Can you show any more detail about the data structure being operated on when it panic's? In particular, look at the smb_xa_t it's working on, and the param+data buffers. Are they all there? This is just a guess, but I suspect you may have stumbled upon an instance of the bug fixed here: https://github.com/Nexenta/illumos-nexenta/commit/803bd0af2c4842f440c58a8ab2c7b52f4171145d Gordon On Sun, May 4, 2014 at 7:22 PM, Matthew Mabis < mmabis at vmware.com > wrote: Hey all wondering if someone could help me figure out what just happened, i had a kernel panic that i dont understand what caused it based on what i see in the stack it might have been SMB, if you need more like the dump file itself let me know! Any help would greatly be appreciated! 
root at destiny:~# fmdump -Vp -u ae3350e8-17d5-4701-ecc6-b46fbaf1d04b |more TIME UUID SUNW-MSG-ID May 04 2014 17:16:03.161236000 ae3350e8-17d5-4701-ecc6-b46fbaf1d04b SUNOS-8000-KL TIME CLASS ENA May 04 17:16:03.1244 ireport.os.sunos.panic.dump_available 0x0000000000000000 May 04 17:15:23.3085 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000 nvlist version: 0 version = 0x0 class = list.suspect uuid = ae3350e8-17d5-4701-ecc6-b46fbaf1d04b code = SUNOS-8000-KL diag-time = 1399245363 130378 de = fmd:///module/software-diagnosis fault-list-sz = 0x1 fault-list = (array of embedded nvlists) (start fault-list[0]) nvlist version: 0 version = 0x0 class = defect.sunos.kernel.panic certainty = 0x64 asru = sw:///:path=/var/crash/unknown/.ae3350e8-17d5-4701-ecc6-b46fbaf1d04b resource = sw:///:path=/var/crash/unknown/.ae3350e8-17d5-4701-ecc6-b46fbaf1d04b savecore-succcess = 1 dump-dir = /var/crash/unknown dump-files = vmdump.0 os-instance-uuid = ae3350e8-17d5-4701-ecc6-b46fbaf1d04b panicstr = BAD TRAP: type=e (#pf Page fault) rp=ffffff001e907450 addr=ffffff05ce4c90d8 panicstack = unix:real_mode_stop_cpu_stage2_end+9de3 () | unix:trap+db3 () | unix:cmntrap+e6 () | smbsrv:smb_fsop_lookup+118 () | smbsrv:smb_common_rename+d9 () | smbsrv:smb_tr ans2_rename+136 () | smbsrv:smb_set_rename_info+b8 () | smbsrv:smb_set_fileinfo+ed () | smbsrv:smb_set_by_fid+b0 () | smbsrv:smb_com_trans2_set_file_information+58 () | smbsrv:smb_trans2_dispa tch+313 () | smbsrv:smb_com_transaction2+1a7 () | smbsrv:smb_dispatch_request+662 () | smbsrv:smb_session_worker+a0 () | genunix:taskq_d_thread+b7 () | unix:thread_start+8 () | crashtime = 1399244984 panic-time = Sun May 4 17:09:44 2014 MDT (end fault-list[0]) severity = Major __ttl = 0x1 __tod = 0x5366ca33 0x99c4420 Matt Mabis illumos-developer | Archives | Modify Your Subscription -------------- next part -------------- An HTML attachment was scrubbed... URL: From daleg at omniti.com Wed May 7 17:36:25 2014 From: daleg at omniti.com (Dale Ghent) Date: Wed, 7 May 2014 13:36:25 -0400 Subject: [OmniOS-discuss] ANN: OmniOS r151010 (stable) released Message-ID: OmniOS r151010, the newest Stable release of OmniOS is now available for download and upgrading to. This release offers many enhancements and improvements. It also marks the beginning of this and future major releases of OmniOS having their own IPS repositories. Please read the Release Notes and Upgrade Instructions for full details on these and other changes. * Installation Media: http://omnios.omniti.com/wiki.php/Installation * Release Notes: http://omnios.omniti.com/wiki.php/ReleaseNotes#r151010 * Upgrade Instructions: http://omnios.omniti.com/wiki.php/Upgrade_r151008_r151010 /dale -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 494 bytes Desc: Message signed with OpenPGP using GPGMail URL: From chip at innovates.com Wed May 7 19:17:30 2014 From: chip at innovates.com (Schweiss, Chip) Date: Wed, 7 May 2014 14:17:30 -0500 Subject: [OmniOS-discuss] ANN: OmniOS r151010 (stable) released In-Reply-To: References: Message-ID: I was looking forward to ZFS bookmarks, but it appears they are not working yet. 
I upgraded one of my test VMs and tried them: root at ZFSsendTest1:~# zfs snapshot testpool/zfs_send at snap_for_bookmark_test1 root at ZFSsendTest1:~# zfs bookmark testpool/zfs_send at snap_for_bookmark_testbookmark#1 cannot create bookmark 'bookmark#1': unknown error root at ZFSsendTest1:~# zpool get all testpool NAME PROPERTY VALUE SOURCE testpool size 99.5G - testpool capacity 0% - testpool altroot - default testpool health ONLINE - testpool guid 14859070249843210579 default testpool version - default testpool bootfs - default testpool delegation on default testpool autoreplace off default testpool cachefile - default testpool failmode wait default testpool listsnapshots off default testpool autoexpand off default testpool dedupditto 0 default testpool dedupratio 1.00x - testpool free 99.1G - testpool allocated 390M - testpool readonly off - testpool comment - default testpool expandsize 0 - testpool freeing 0 default testpool feature at async_destroy enabled local testpool feature at empty_bpobj active local testpool feature at lz4_compress active local testpool feature at multi_vdev_crash_dump enabled local testpool feature at spacemap_histogram active local testpool feature at enabled_txg active local testpool feature at hole_birth active local testpool feature at extensible_dataset enabled local testpool feature at bookmarks enabled local On Wed, May 7, 2014 at 12:36 PM, Dale Ghent wrote: > > OmniOS r151010, the newest Stable release of OmniOS is now available for > download and upgrading to. > > This release offers many enhancements and improvements. It also marks the > beginning of this and future major releases of OmniOS having their own IPS > repositories. Please read the Release Notes and Upgrade Instructions for > full details on these and other changes. > > * Installation Media: http://omnios.omniti.com/wiki.php/Installation > * Release Notes: http://omnios.omniti.com/wiki.php/ReleaseNotes#r151010 > * Upgrade Instructions: > http://omnios.omniti.com/wiki.php/Upgrade_r151008_r151010 > > /dale > > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chip at innovates.com Wed May 7 19:24:51 2014 From: chip at innovates.com (Schweiss, Chip) Date: Wed, 7 May 2014 14:24:51 -0500 Subject: [OmniOS-discuss] ANN: OmniOS r151010 (stable) released In-Reply-To: References: Message-ID: On Wed, May 7, 2014 at 2:17 PM, Schweiss, Chip wrote: > I was looking forward to ZFS bookmarks, but it appears they are not > working yet. > > I upgraded one of my test VMs and tried them: > > root at ZFSsendTest1:~# zfs snapshot > testpool/zfs_send at snap_for_bookmark_test1 > root at ZFSsendTest1:~# zfs bookmark testpool/zfs_send at snap_for_bookmark_testbookmark#1 > cannot create bookmark 'bookmark#1': unknown error > Don't mind the syntax error I had here. 
The results are the same when the command is correct: root at ZFSsendTest1:~# zfs bookmark testpool/zfs_send at snap_for_bookmark_test1bookmark#1 cannot create bookmark 'bookmark#1': unknown error > root at ZFSsendTest1:~# zpool get all testpool > NAME PROPERTY VALUE > SOURCE > testpool size 99.5G - > testpool capacity 0% - > testpool altroot - > default > testpool health ONLINE - > testpool guid 14859070249843210579 > default > testpool version - > default > testpool bootfs - > default > testpool delegation on > default > testpool autoreplace off > default > testpool cachefile - > default > testpool failmode wait > default > testpool listsnapshots off > default > testpool autoexpand off > default > testpool dedupditto 0 > default > testpool dedupratio 1.00x - > testpool free 99.1G - > testpool allocated 390M - > testpool readonly off - > testpool comment - > default > testpool expandsize 0 - > testpool freeing 0 > default > testpool feature at async_destroy enabled > local > testpool feature at empty_bpobj active > local > testpool feature at lz4_compress active > local > testpool feature at multi_vdev_crash_dump enabled > local > testpool feature at spacemap_histogram active > local > testpool feature at enabled_txg active > local > testpool feature at hole_birth active > local > testpool feature at extensible_dataset enabled > local > testpool feature at bookmarks enabled > local > > > > On Wed, May 7, 2014 at 12:36 PM, Dale Ghent wrote: > >> >> OmniOS r151010, the newest Stable release of OmniOS is now available for >> download and upgrading to. >> >> This release offers many enhancements and improvements. It also marks the >> beginning of this and future major releases of OmniOS having their own IPS >> repositories. Please read the Release Notes and Upgrade Instructions for >> full details on these and other changes. >> >> * Installation Media: http://omnios.omniti.com/wiki.php/Installation >> * Release Notes: http://omnios.omniti.com/wiki.php/ReleaseNotes#r151010 >> * Upgrade Instructions: >> http://omnios.omniti.com/wiki.php/Upgrade_r151008_r151010 >> >> /dale >> >> >> _______________________________________________ >> OmniOS-discuss mailing list >> OmniOS-discuss at lists.omniti.com >> http://lists.omniti.com/mailman/listinfo/omnios-discuss >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chip at innovates.com Wed May 7 20:57:16 2014 From: chip at innovates.com (Schweiss, Chip) Date: Wed, 7 May 2014 15:57:16 -0500 Subject: [OmniOS-discuss] ANN: OmniOS r151010 (stable) released In-Reply-To: References: Message-ID: Looks like the documentation needs to a bit clearer on the syntax. This worked: root at ZFSsendTest1:~# zfs bookmark testpool/zfs_send at snap_for_bookmark_test1testpool/zfs_send#bookmark1 On Wed, May 7, 2014 at 2:24 PM, Schweiss, Chip wrote: > > > > On Wed, May 7, 2014 at 2:17 PM, Schweiss, Chip wrote: > >> I was looking forward to ZFS bookmarks, but it appears they are not >> working yet. >> >> I upgraded one of my test VMs and tried them: >> >> root at ZFSsendTest1:~# zfs snapshot >> testpool/zfs_send at snap_for_bookmark_test1 >> root at ZFSsendTest1:~# zfs bookmark >> testpool/zfs_send at snap_for_bookmark_test bookmark#1 >> cannot create bookmark 'bookmark#1': unknown error >> > > Don't mind the syntax error I had here. 
The results are the same when > the command is correct: > > root at ZFSsendTest1:~# zfs bookmark > testpool/zfs_send at snap_for_bookmark_test1 bookmark#1 > > cannot create bookmark 'bookmark#1': unknown error > > > >> root at ZFSsendTest1:~# zpool get all testpool >> NAME PROPERTY VALUE >> SOURCE >> testpool size 99.5G - >> testpool capacity 0% - >> testpool altroot - >> default >> testpool health ONLINE - >> testpool guid 14859070249843210579 >> default >> testpool version - >> default >> testpool bootfs - >> default >> testpool delegation on >> default >> testpool autoreplace off >> default >> testpool cachefile - >> default >> testpool failmode wait >> default >> testpool listsnapshots off >> default >> testpool autoexpand off >> default >> testpool dedupditto 0 >> default >> testpool dedupratio 1.00x - >> testpool free 99.1G - >> testpool allocated 390M - >> testpool readonly off - >> testpool comment - >> default >> testpool expandsize 0 - >> testpool freeing 0 >> default >> testpool feature at async_destroy enabled >> local >> testpool feature at empty_bpobj active >> local >> testpool feature at lz4_compress active >> local >> testpool feature at multi_vdev_crash_dump enabled >> local >> testpool feature at spacemap_histogram active >> local >> testpool feature at enabled_txg active >> local >> testpool feature at hole_birth active >> local >> testpool feature at extensible_dataset enabled >> local >> testpool feature at bookmarks enabled >> local >> >> >> >> On Wed, May 7, 2014 at 12:36 PM, Dale Ghent wrote: >> >>> >>> OmniOS r151010, the newest Stable release of OmniOS is now available for >>> download and upgrading to. >>> >>> This release offers many enhancements and improvements. It also marks >>> the beginning of this and future major releases of OmniOS having their own >>> IPS repositories. Please read the Release Notes and Upgrade Instructions >>> for full details on these and other changes. >>> >>> * Installation Media: http://omnios.omniti.com/wiki.php/Installation >>> * Release Notes: http://omnios.omniti.com/wiki.php/ReleaseNotes#r151010 >>> * Upgrade Instructions: >>> http://omnios.omniti.com/wiki.php/Upgrade_r151008_r151010 >>> >>> /dale >>> >>> >>> _______________________________________________ >>> OmniOS-discuss mailing list >>> OmniOS-discuss at lists.omniti.com >>> http://lists.omniti.com/mailman/listinfo/omnios-discuss >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.elling at richardelling.com Wed May 7 22:47:04 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Wed, 7 May 2014 15:47:04 -0700 Subject: [OmniOS-discuss] Strange ARC reads numbers In-Reply-To: <3BE0DEED8863E5429BAE4CAEDF6245650365045016A2@AIRA-SRV.aira.local> References: <3BE0DEED8863E5429BAE4CAEDF6245650365044D4776@AIRA-SRV.aira.local> <3BE0DEED8863E5429BAE4CAEDF6245650365045016A2@AIRA-SRV.aira.local> Message-ID: <58058A78-6619-4E2C-B3FB-38B012EAAD34@RichardElling.com> On May 7, 2014, at 1:44 AM, Filip Marvan wrote: > Hi Richard, > > thank you for your reply. > > 1. Workload is still the same or very similar. Zvols, which we deleted from our pool were disconnected from KVM server a few days before, so the only change was, that we deleted that zvols with all snapshots. > 2. As you wrote, our customers are fine for now :) We have monitoring of all our virtual servers running from that storage server, and there is no noticeable change in workload or latencies. 
good, then there might not be an actual problem, just a puzzle :-) > 3. That could be the reason, of course. But in the graph are only data from arcstat.pl script. We can see, that arcstat is reporting heavy read accesses every 5 seconds (propably some update of ARC after ZFS writes data to disks from ZIL? All of them are marked as "cache hits" by arcstat script) and with only few ARC accesses between that 5 seconds periody. Before we deleted that zvols (about 0.7 TB data from 10 TB pool, which have 5 TB of free space) there were about 40k accesses every 5 seconds, now there are no more than 2k accesses every 5 seconds. This is expected behaviour for older ZFS releases that used a txg_timeout of 5 seconds. You should see a burst of write activity around that timeout and it can include reads for zvols. Unfortunately, the zvol code is not very efficient and you will see a lot more reads than you expect. -- richard > > Most of our zvols have 8K volblocksize (including deleted zvols), only few have 64K. Unfortunately I have no data about size of the read before that change. But we have two more storage servers, with similary high ARC read accesses every 5 seconds as on the first pool before deletion. Maybe I should try to delete some data on that pools and see what happen with more detailed monitoring. > > Thank you, > Filip > > > From: Richard Elling [mailto:richard.elling at richardelling.com] > Sent: Wednesday, May 07, 2014 3:56 AM > To: Filip Marvan > Cc: omnios-discuss at lists.omniti.com > Subject: Re: [OmniOS-discuss] Strange ARC reads numbers > > Hi Filip, > > There are two primary reasons for reduction in the number of ARC reads. > 1. the workload isn't reading as much as it used to > 2. the latency of reads has increased > 3. your measurement is b0rken > there are three reasons... > > The data you shared clearly shows reduction in reads, but doesn't contain the answers > to the cause. Usually, if #2 is the case, then the phone will be ringing with angry customers > on the other end. > > If the above 3 are not the case, then perhaps it is something more subtle. The arcstat reads > does not record the size of the read. To get the read size for zvols is a little tricky, you can > infer it from the pool statistics in iostat. The subtleness here is that if the volblocksize is > different between the old and new zvols, then the number of (block) reads will be different > for the same workload. > -- richard -- Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at will.to Thu May 8 02:12:28 2014 From: doug at will.to (Doug Hughes) Date: Wed, 07 May 2014 22:12:28 -0400 Subject: [OmniOS-discuss] CIFS with OmniOS Message-ID: <536AE80C.7010006@will.to> Any documentation on getting a CIFS share working with OmniOS? All I'm finding on the web is Solaris11, which has some key things that OmniOS doesn't in smbadm, so the instructions fall apart rather early. (Tried this also, but many stale links: http://wiki.illumos.org/display/illumos/Getting+Started+With+the+CIFS+Service) From dswartz at druber.com Thu May 8 02:57:01 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Wed, 7 May 2014 22:57:01 -0400 Subject: [OmniOS-discuss] CIFS with OmniOS In-Reply-To: <536AE80C.7010006@will.to> References: <536AE80C.7010006@will.to> Message-ID: > Any documentation on getting a CIFS share working with OmniOS? 
All I'm > finding on the web is Solaris11, which has some key things that OmniOS > doesn't in smbadm, so the instructions fall apart rather early. OpenIndiana instructions? From doug at will.to Thu May 8 02:58:47 2014 From: doug at will.to (Doug Hughes) Date: Wed, 07 May 2014 22:58:47 -0400 Subject: [OmniOS-discuss] CIFS with OmniOS In-Reply-To: References: <536AE80C.7010006@will.to> Message-ID: <536AF2E7.9050106@will.to> On 5/7/2014 10:57 PM, Dan Swartzendruber wrote: >> Any documentation on getting a CIFS share working with OmniOS? All I'm >> finding on the web is Solaris11, which has some key things that OmniOS >> doesn't in smbadm, so the instructions fall apart rather early. > > OpenIndiana instructions? > None that I've found so far that seem to do the trick. Got a link? From dswartz at druber.com Thu May 8 03:00:57 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Wed, 7 May 2014 23:00:57 -0400 Subject: [OmniOS-discuss] CIFS with OmniOS In-Reply-To: <536AF2E7.9050106@will.to> References: <536AE80C.7010006@will.to> <536AF2E7.9050106@will.to> Message-ID: > On 5/7/2014 10:57 PM, Dan Swartzendruber wrote: >>> Any documentation on getting a CIFS share working with OmniOS? All I'm >>> finding on the web is Solaris11, which has some key things that OmniOS >>> doesn't in smbadm, so the instructions fall apart rather early. >> >> OpenIndiana instructions? >> > > None that I've found so far that seem to do the trick. Got a link? Not specifically, no. I'm not doing anything esoteric though. Just this: tank/windows/dswartz sharesmb name=dswartz local (replace dswartz by various usernames). With the right password, our windows7 boxes can just connect seamlessly. I know there are much more complex things people do, but I don't. What specifically are you trying to do that isn't obvious? From doug at will.to Thu May 8 03:29:19 2014 From: doug at will.to (Doug Hughes) Date: Wed, 07 May 2014 23:29:19 -0400 Subject: [OmniOS-discuss] CIFS with OmniOS In-Reply-To: References: <536AE80C.7010006@will.to> <536AF2E7.9050106@will.to> Message-ID: <536AFA0F.7040801@will.to> On 5/7/2014 11:00 PM, Dan Swartzendruber wrote: >> On 5/7/2014 10:57 PM, Dan Swartzendruber wrote: >>>> Any documentation on getting a CIFS share working with OmniOS? All I'm >>>> finding on the web is Solaris11, which has some key things that OmniOS >>>> doesn't in smbadm, so the instructions fall apart rather early. >>> >>> OpenIndiana instructions? >>> >> >> None that I've found so far that seem to do the trick. Got a link? > > Not specifically, no. I'm not doing anything esoteric though. Just this: > > tank/windows/dswartz sharesmb name=dswartz local > > (replace dswartz by various usernames). With the right password, our > windows7 boxes can just connect seamlessly. I know there are much more > complex things people do, but I don't. What specifically are you trying > to do that isn't obvious? 
> I got it to work following many of the steps here: https://blogs.oracle.com/timthomas/entry/solaris_cifs_in_workgroup_mode (show-vp is one that isn't applicable, but the basic bones are there) From alka at hfg-gmuend.de Thu May 8 08:00:47 2014 From: alka at hfg-gmuend.de (Guenther Alka) Date: Thu, 08 May 2014 10:00:47 +0200 Subject: [OmniOS-discuss] CIFS with OmniOS In-Reply-To: <536AE80C.7010006@will.to> References: <536AE80C.7010006@will.to> Message-ID: <536B39AF.50602@hfg-gmuend.de> You can either - use the Oracle docs from old Solaris Express 11 (fully compatible settings to OmniOS) if you google you may find them like http://archive.today/snZaS - use my napp-it where you can do settings via Web-UI Gea Am 08.05.2014 04:12, schrieb Doug Hughes: > Any documentation on getting a CIFS share working with OmniOS? All I'm > finding on the web is Solaris11, which has some key things that OmniOS > doesn't in smbadm, so the instructions fall apart rather early. > > (Tried this also, but many stale links: > http://wiki.illumos.org/display/illumos/Getting+Started+With+the+CIFS+Service) > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss From johan.kragsterman at capvert.se Thu May 8 08:30:02 2014 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Thu, 8 May 2014 10:30:02 +0200 Subject: [OmniOS-discuss] move dump and swap to other pool Message-ID: Hi! I got a relatively small(20 GB) SLC SSD as rpool, and I would like to move the dump and swap devices to another pool. ATM the pkg update process isn't possible, because I got too little space left on the rpool. Can someone give me some advices here, how I would do this in the best way? Best regards from/Med v?nliga h?lsningar fr?n Johan Kragsterman Capvert From danmcd at omniti.com Thu May 8 14:37:49 2014 From: danmcd at omniti.com (Dan McDonald) Date: Thu, 8 May 2014 10:37:49 -0400 Subject: [OmniOS-discuss] move dump and swap to other pool In-Reply-To: References: Message-ID: <5CADD3E0-CC44-4480-A8CC-F129884A462B@omniti.com> On May 8, 2014, at 4:30 AM, Johan Kragsterman wrote: > > Hi! > > > I got a relatively small(20 GB) SLC SSD as rpool, and I would like to move the dump and swap devices to another pool. ATM the pkg update process isn't possible, because I got too little space left on the rpool. > > Can someone give me some advices here, how I would do this in the best way? About dump: Unless you have r151008 or later, you can only use a zvol on a mirror or single-disk pool. Having said that: zfs create -V newpool/dump dumpadm -d /dev/zvol/newpool/dump And for swap (which should be good regardless what "newpool" is...): zfs create -V newpool/swap swap -a /dev/zvol/newpool/swap swap -d /dev/zvol/oldpool/swap Hope this helps, Dan From johan.kragsterman at capvert.se Thu May 8 14:57:06 2014 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Thu, 8 May 2014 16:57:06 +0200 Subject: [OmniOS-discuss] move dump and swap to other pool In-Reply-To: <5CADD3E0-CC44-4480-A8CC-F129884A462B@omniti.com> References: <5CADD3E0-CC44-4480-A8CC-F129884A462B@omniti.com>, Message-ID: Hi! -----Dan McDonald skrev: ----- Till: Johan Kragsterman Fr?n: Dan McDonald Datum: 2014-05-08 16:37 Kopia: "OmniOS-discuss at lists.omniti.com" ?rende: Re: [OmniOS-discuss] move dump and swap to other pool On May 8, 2014, at 4:30 AM, Johan Kragsterman wrote: > > Hi! 
> > > I got a relatively small(20 GB) SLC SSD as rpool, and I would like to move the dump and swap devices to another pool. ATM the pkg update process isn't possible, because I got too little space left on the rpool. > > Can someone give me some advices here, how I would do this in the best way? About dump: ?Unless you have r151008 or later, you can only use a zvol on a mirror or single-disk pool. ?Having said that: zfs create -V newpool/dump dumpadm -d /dev/zvol/newpool/dump And for swap (which should be good regardless what "newpool" is...): zfs create -V newpool/swap swap -a /dev/zvol/newpool/swap swap -d /dev/zvol/oldpool/swap Hope this helps, Dan Thanks, Dan, that was EXACTLY what I needed!!! And the "newpool", wich is called "mainpool" here, are mirrored disk, yes. Regards Johan From robin at coraid.com Thu May 8 15:04:56 2014 From: robin at coraid.com (Robin P. Blanchard) Date: Thu, 8 May 2014 15:04:56 +0000 Subject: [OmniOS-discuss] failsafe boot? Message-ID: <44C6CC19-47DA-470D-8D68-A9FC18CF8BF2@coraid.com> Hi guys, I managed to destroy my /kernel/drv/scsi_vhci.conf and/or sd.conf and can no longer boot (into any BE) :/ Is there a way (other than live media) to boot into some sort of rescue/failsafe mode? From robin at coraid.com Thu May 8 16:06:44 2014 From: robin at coraid.com (Robin P. Blanchard) Date: Thu, 8 May 2014 16:06:44 +0000 Subject: [OmniOS-discuss] failsafe boot? In-Reply-To: <44C6CC19-47DA-470D-8D68-A9FC18CF8BF2@coraid.com> References: <44C6CC19-47DA-470D-8D68-A9FC18CF8BF2@coraid.com> Message-ID: Replying to myself here... Presumably my other BEs are failing since my rpool is now upgraded. So I've decided to try to boot from latest ISO and attempt to mount the BE and fix it. so what am I missing here: from live media: # mkdir -p /rescue # zpool import -R /rescue 14750227168826216208 # zfs list NAME USED AVAIL REFER MOUNTPOINT rpool 48.2G 865G 40K /rescue/rpool rpool/ROOT 14.7G 865G 31K legacy rpool/ROOT/omnios 7.39M 865G 3.50G /rescue rpool/ROOT/omnios-1 10.3M 865G 3.53G /rescue rpool/ROOT/omnios-2 287M 865G 3.83G /rescue rpool/ROOT/omnios-3 279M 865G 7.15G /rescue rpool/ROOT/omnios-4 282M 865G 7.32G /rescue rpool/ROOT/omnios-4-backup-1 40K 865G 7.00G /rescue rpool/ROOT/omnios-4-backup-2 137K 865G 7.09G /rescue rpool/ROOT/omnios-5 285M 865G 7.47G /rescue rpool/ROOT/omnios-5-backup-1 71K 865G 7.18G /rescue rpool/ROOT/omnios-6 13.6G 865G 7.82G /rescue rpool/ROOT/omnios-backup-1 84K 865G 3.33G /rescue rpool/ROOT/omnios-backup-2 96K 865G 3.50G /rescue rpool/ROOT/omniosvar 31K 865G 31K legacy rpool/dump 28.0G 865G 28.0G - rpool/export 1.38G 865G 32K /rescue/export rpool/export/home 1.38G 865G 1.38G /rescue/export/home rpool/swap 4.13G 869G 5.16M - # beadm list BE Active Mountpoint Space Policy Created omnios - - 7.39M static 2013-11-19 21:11 omnios-1 - - 10.3M static 2013-12-08 00:42 omnios-2 - - 287M static 2013-12-08 01:05 omnios-3 - - 279M static 2013-12-11 18:06 omnios-4 - - 282M static 2014-01-21 18:21 omnios-4-backup-1 - - 40.0K static 2014-01-21 18:28 omnios-4-backup-2 - - 137K static 2014-03-10 20:05 omnios-5 - - 285M static 2014-04-08 03:12 omnios-5-backup-1 - - 71.0K static 2014-04-11 17:09 omnios-6 R - 16.5G static 2014-05-08 12:35 omnios-backup-1 - - 84.0K static 2013-12-07 14:47 omnios-backup-2 - - 96.0K static 2013-12-08 00:42 omniosvar - - 31.0K static 2013-11-19 21:11 # mkdir -p /rescue/be # beadm mount omnios-6 /rescue/be/ Mounted successfully on: '/rescue/be/' # find /rescue/be/ /rescue/be/ nothing here. and same for the other BEs.... 
On May 8, 2014, at 11:04 AM, Robin P. Blanchard wrote: > Hi guys, > > I managed to destroy my /kernel/drv/scsi_vhci.conf and/or sd.conf and can no longer boot (into any BE) :/ > > Is there a way (other than live media) to boot into some sort of rescue/failsafe mode? > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss -- Robin P. Blanchard Technical Solutions Engineer Coraid Global Field Services and Support www.coraid.com +1 650.730.5140 From robin at coraid.com Thu May 8 16:20:14 2014 From: robin at coraid.com (Robin P. Blanchard) Date: Thu, 8 May 2014 16:20:14 +0000 Subject: [OmniOS-discuss] failsafe boot? In-Reply-To: References: <44C6CC19-47DA-470D-8D68-A9FC18CF8BF2@coraid.com> Message-ID: # zfs list -r -t snapshot NAME USED AVAIL REFER MOUNTPOINT rpool/ROOT/omnios at 2013-12-08-00:42:38 0 - 3.66G - rpool/ROOT/omnios-6 at install 292M - 1.38G - rpool/ROOT/omnios-6 at 2013-12-07-14:47:31 134M - 3.33G - rpool/ROOT/omnios-6 at 2013-12-08-00:42:37 291M - 3.66G - rpool/ROOT/omnios-6 at 2013-12-08-01:05:46 578M - 3.80G - rpool/ROOT/omnios-6 at 2013-12-11-18:06:57 314M - 3.66G - rpool/ROOT/omnios-6 at 2014-01-21-18:21:58 300M - 7.00G - rpool/ROOT/omnios-6 at 2014-01-21-18:28:40 22.6M - 7.00G - rpool/ROOT/omnios-6 at 2014-03-10-20:05:42 31.4M - 7.09G - rpool/ROOT/omnios-6 at 2014-04-08-03:12:40 301M - 7.15G - rpool/ROOT/omnios-6 at 2014-04-11-17:09:48 325M - 7.18G - rpool/ROOT/omnios-6 at 2014-05-08-12:35:52 345M - 7.67G - is the beadm mount not enough? do i still need to manually mount its snapshot? On May 8, 2014, at 12:06 PM, Robin P. Blanchard wrote: > Replying to myself here... > Presumably my other BEs are failing since my rpool is now upgraded. > > So I've decided to try to boot from latest ISO and attempt to mount the BE and fix it. 
> > so what am I missing here: > > from live media: > > # mkdir -p /rescue > > # zpool import -R /rescue 14750227168826216208 > > # zfs list > NAME USED AVAIL REFER MOUNTPOINT > rpool 48.2G 865G 40K /rescue/rpool > rpool/ROOT 14.7G 865G 31K legacy > rpool/ROOT/omnios 7.39M 865G 3.50G /rescue > rpool/ROOT/omnios-1 10.3M 865G 3.53G /rescue > rpool/ROOT/omnios-2 287M 865G 3.83G /rescue > rpool/ROOT/omnios-3 279M 865G 7.15G /rescue > rpool/ROOT/omnios-4 282M 865G 7.32G /rescue > rpool/ROOT/omnios-4-backup-1 40K 865G 7.00G /rescue > rpool/ROOT/omnios-4-backup-2 137K 865G 7.09G /rescue > rpool/ROOT/omnios-5 285M 865G 7.47G /rescue > rpool/ROOT/omnios-5-backup-1 71K 865G 7.18G /rescue > rpool/ROOT/omnios-6 13.6G 865G 7.82G /rescue > rpool/ROOT/omnios-backup-1 84K 865G 3.33G /rescue > rpool/ROOT/omnios-backup-2 96K 865G 3.50G /rescue > rpool/ROOT/omniosvar 31K 865G 31K legacy > rpool/dump 28.0G 865G 28.0G - > rpool/export 1.38G 865G 32K /rescue/export > rpool/export/home 1.38G 865G 1.38G /rescue/export/home > rpool/swap 4.13G 869G 5.16M - > > # beadm list > BE Active Mountpoint Space Policy Created > omnios - - 7.39M static 2013-11-19 21:11 > omnios-1 - - 10.3M static 2013-12-08 00:42 > omnios-2 - - 287M static 2013-12-08 01:05 > omnios-3 - - 279M static 2013-12-11 18:06 > omnios-4 - - 282M static 2014-01-21 18:21 > omnios-4-backup-1 - - 40.0K static 2014-01-21 18:28 > omnios-4-backup-2 - - 137K static 2014-03-10 20:05 > omnios-5 - - 285M static 2014-04-08 03:12 > omnios-5-backup-1 - - 71.0K static 2014-04-11 17:09 > omnios-6 R - 16.5G static 2014-05-08 12:35 > omnios-backup-1 - - 84.0K static 2013-12-07 14:47 > omnios-backup-2 - - 96.0K static 2013-12-08 00:42 > omniosvar - - 31.0K static 2013-11-19 21:11 > > # mkdir -p /rescue/be > > # beadm mount omnios-6 /rescue/be/ > Mounted successfully on: '/rescue/be/' > > # find /rescue/be/ > /rescue/be/ > > > nothing here. and same for the other BEs.... > > > On May 8, 2014, at 11:04 AM, Robin P. Blanchard wrote: > >> Hi guys, >> >> I managed to destroy my /kernel/drv/scsi_vhci.conf and/or sd.conf and can no longer boot (into any BE) :/ >> >> Is there a way (other than live media) to boot into some sort of rescue/failsafe mode? >> _______________________________________________ >> OmniOS-discuss mailing list >> OmniOS-discuss at lists.omniti.com >> http://lists.omniti.com/mailman/listinfo/omnios-discuss > > -- > Robin P. Blanchard > Technical Solutions Engineer > Coraid Global Field Services and Support > www.coraid.com > +1 650.730.5140 > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss -- Robin P. Blanchard Technical Solutions Engineer Coraid Global Field Services and Support www.coraid.com +1 650.730.5140 From henson at acm.org Thu May 8 20:34:00 2014 From: henson at acm.org (Paul B. 
Henson) Date: Thu, 8 May 2014 13:34:00 -0700 Subject: [OmniOS-discuss] move dump and swap to other pool In-Reply-To: <5CADD3E0-CC44-4480-A8CC-F129884A462B@omniti.com> References: <5CADD3E0-CC44-4480-A8CC-F129884A462B@omniti.com> Message-ID: <07dc01cf6afc$da0d20b0$8e276210$@acm.org> > From: Dan McDonald > Sent: Thursday, May 08, 2014 7:38 AM > > And for swap (which should be good regardless what "newpool" is...): > > zfs create -V newpool/swap Unless I'm misremembering, the default blocksize for a zvol is 8k, and the recommended blocksize for a swap zvol on x86 is 4k, so I usually explicitly specify the blocksize when creating a swap zvol: zfs create -b 4k -V pool/swap Also, in the past there was a bug that could result in a kernel wedge unless you tweaked the cache settings for a zvol in use for swap: zfs set primarycache=metadata pool/swap zfs set secondarycache=none pool/swap Possibly that has been fixed? But I don't think it hurts either way, so I still do it. From richard.elling at richardelling.com Fri May 9 02:00:01 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Thu, 8 May 2014 19:00:01 -0700 Subject: [OmniOS-discuss] failsafe boot? In-Reply-To: References: <44C6CC19-47DA-470D-8D68-A9FC18CF8BF2@coraid.com> Message-ID: <3E532225-A3CC-4079-BF7A-B1C1A8A47F19@RichardElling.com> On May 8, 2014, at 9:20 AM, Robin P. Blanchard wrote: > # zfs list -r -t snapshot > NAME USED AVAIL REFER MOUNTPOINT > rpool/ROOT/omnios at 2013-12-08-00:42:38 0 - 3.66G - > rpool/ROOT/omnios-6 at install 292M - 1.38G - > rpool/ROOT/omnios-6 at 2013-12-07-14:47:31 134M - 3.33G - > rpool/ROOT/omnios-6 at 2013-12-08-00:42:37 291M - 3.66G - > rpool/ROOT/omnios-6 at 2013-12-08-01:05:46 578M - 3.80G - > rpool/ROOT/omnios-6 at 2013-12-11-18:06:57 314M - 3.66G - > rpool/ROOT/omnios-6 at 2014-01-21-18:21:58 300M - 7.00G - > rpool/ROOT/omnios-6 at 2014-01-21-18:28:40 22.6M - 7.00G - > rpool/ROOT/omnios-6 at 2014-03-10-20:05:42 31.4M - 7.09G - > rpool/ROOT/omnios-6 at 2014-04-08-03:12:40 301M - 7.15G - > rpool/ROOT/omnios-6 at 2014-04-11-17:09:48 325M - 7.18G - > rpool/ROOT/omnios-6 at 2014-05-08-12:35:52 345M - 7.67G - > > > is the beadm mount not enough? do i still need to manually mount its snapshot? You can manually mount the dataset. Something like: mount -F zfs zpool/ROOT/omnios-6 /mnt -- richard > > > > On May 8, 2014, at 12:06 PM, Robin P. Blanchard wrote: > >> Replying to myself here... >> Presumably my other BEs are failing since my rpool is now upgraded. >> >> So I've decided to try to boot from latest ISO and attempt to mount the BE and fix it. 
>> >> so what am I missing here: >> >> from live media: >> >> # mkdir -p /rescue >> >> # zpool import -R /rescue 14750227168826216208 >> >> # zfs list >> NAME USED AVAIL REFER MOUNTPOINT >> rpool 48.2G 865G 40K /rescue/rpool >> rpool/ROOT 14.7G 865G 31K legacy >> rpool/ROOT/omnios 7.39M 865G 3.50G /rescue >> rpool/ROOT/omnios-1 10.3M 865G 3.53G /rescue >> rpool/ROOT/omnios-2 287M 865G 3.83G /rescue >> rpool/ROOT/omnios-3 279M 865G 7.15G /rescue >> rpool/ROOT/omnios-4 282M 865G 7.32G /rescue >> rpool/ROOT/omnios-4-backup-1 40K 865G 7.00G /rescue >> rpool/ROOT/omnios-4-backup-2 137K 865G 7.09G /rescue >> rpool/ROOT/omnios-5 285M 865G 7.47G /rescue >> rpool/ROOT/omnios-5-backup-1 71K 865G 7.18G /rescue >> rpool/ROOT/omnios-6 13.6G 865G 7.82G /rescue >> rpool/ROOT/omnios-backup-1 84K 865G 3.33G /rescue >> rpool/ROOT/omnios-backup-2 96K 865G 3.50G /rescue >> rpool/ROOT/omniosvar 31K 865G 31K legacy >> rpool/dump 28.0G 865G 28.0G - >> rpool/export 1.38G 865G 32K /rescue/export >> rpool/export/home 1.38G 865G 1.38G /rescue/export/home >> rpool/swap 4.13G 869G 5.16M - >> >> # beadm list >> BE Active Mountpoint Space Policy Created >> omnios - - 7.39M static 2013-11-19 21:11 >> omnios-1 - - 10.3M static 2013-12-08 00:42 >> omnios-2 - - 287M static 2013-12-08 01:05 >> omnios-3 - - 279M static 2013-12-11 18:06 >> omnios-4 - - 282M static 2014-01-21 18:21 >> omnios-4-backup-1 - - 40.0K static 2014-01-21 18:28 >> omnios-4-backup-2 - - 137K static 2014-03-10 20:05 >> omnios-5 - - 285M static 2014-04-08 03:12 >> omnios-5-backup-1 - - 71.0K static 2014-04-11 17:09 >> omnios-6 R - 16.5G static 2014-05-08 12:35 >> omnios-backup-1 - - 84.0K static 2013-12-07 14:47 >> omnios-backup-2 - - 96.0K static 2013-12-08 00:42 >> omniosvar - - 31.0K static 2013-11-19 21:11 >> >> # mkdir -p /rescue/be >> >> # beadm mount omnios-6 /rescue/be/ >> Mounted successfully on: '/rescue/be/' >> >> # find /rescue/be/ >> /rescue/be/ >> >> >> nothing here. and same for the other BEs.... >> >> >> On May 8, 2014, at 11:04 AM, Robin P. Blanchard wrote: >> >>> Hi guys, >>> >>> I managed to destroy my /kernel/drv/scsi_vhci.conf and/or sd.conf and can no longer boot (into any BE) :/ >>> >>> Is there a way (other than live media) to boot into some sort of rescue/failsafe mode? >>> _______________________________________________ >>> OmniOS-discuss mailing list >>> OmniOS-discuss at lists.omniti.com >>> http://lists.omniti.com/mailman/listinfo/omnios-discuss >> >> -- >> Robin P. Blanchard >> Technical Solutions Engineer >> Coraid Global Field Services and Support >> www.coraid.com >> +1 650.730.5140 >> >> _______________________________________________ >> OmniOS-discuss mailing list >> OmniOS-discuss at lists.omniti.com >> http://lists.omniti.com/mailman/listinfo/omnios-discuss > > -- > Robin P. Blanchard > Technical Solutions Engineer > Coraid Global Field Services and Support > www.coraid.com > +1 650.730.5140 > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss -- Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... URL: From svavar at januar.is Fri May 9 09:27:49 2014 From: svavar at januar.is (=?UTF-8?Q?Svavar_=C3=96rn_Eysteinsson?=) Date: Fri, 9 May 2014 09:27:49 +0000 Subject: [OmniOS-discuss] Move rpool from IDE mode to AHCI ? Message-ID: Hello List. I recently installed and configured OmniOS on a HP Microserver N40L NAS box. 
In time to time my box hangs/freezes for some seconds, when I try to SSH into the box or execute some SMART-INFO functions. And then it gives me access. In my /var/adm/messages i have these messages : May 8 20:10:41 blackbox scsi: [ID 107833 kern.warning] WARNING: /pci at 0 ,0/pci-ide at 14,1/ide at 0 (ata0): May 8 20:10:41 blackbox timeout: abort request, target=1 lun=0 May 8 20:10:41 blackbox scsi: [ID 107833 kern.warning] WARNING: /pci at 0 ,0/pci-ide at 14,1/ide at 0 (ata0): May 8 20:10:41 blackbox timeout: abort device, target=1 lun=0 May 8 20:10:41 blackbox scsi: [ID 107833 kern.warning] WARNING: /pci at 0 ,0/pci-ide at 14,1/ide at 0 (ata0): May 8 20:10:41 blackbox timeout: reset target, target=1 lun=0 May 8 20:10:41 blackbox scsi: [ID 107833 kern.warning] WARNING: /pci at 0 ,0/pci-ide at 14,1/ide at 0 (ata0): May 8 20:10:41 blackbox timeout: reset bus, target=1 lun=0 May 8 20:10:42 blackbox gda: [ID 107833 kern.warning] WARNING: /pci at 0 ,0/pci-ide at 14,1/ide at 0/cmdk at 1,0 (Disk0): May 8 20:10:42 blackbox Error for command 'read sector' Error Level: Informational May 8 20:10:42 blackbox gda: [ID 107833 kern.notice] Sense Key: aborted command May 8 20:10:42 blackbox gda: [ID 107833 kern.notice] Vendor 'Gen-ATA ' error code: 0x3 My current configuration is the System(OmniOS) is installed onto a SATA disk located at port 5 on the Microserver. (yes i have the BIOS hacked which allows for SATA 3G/AHCI functions on the port other ports) Then I have 4x SATA disks configured in RAIDZ I double checked my BIOS configuration, and it seems that I have misconfigured the OS disk into a IDE mode, but not AHCI. So when I change the mode to AHCI, I get the GRUB loader and the OmniOS logo after that, but suddenly the machines restarts again and never boots into the system all the way. So my question is, is there any way for my to reconfigure my running-system to use AHCI instead of IDE,ATA mode ? Is it enough to boot up a usb live system and issue a "zpool import -f rpool", and then finally "zpool export rpool" with a running BIOS-config as AHCI and then restart the computer and let it boot normally ? I read somewhere that the import/export functions would update the device path, but I have no experience on this issue. Any help would be much appreciated. Thanks allot people. Best regards, Svavar O Reykjavik - Iceland -------------- next part -------------- An HTML attachment was scrubbed... URL: From alka at hfg-gmuend.de Fri May 9 14:05:15 2014 From: alka at hfg-gmuend.de (Guenther Alka) Date: Fri, 09 May 2014 16:05:15 +0200 Subject: [OmniOS-discuss] OmniOS 151010 improvements In-Reply-To: <536B39AF.50602@hfg-gmuend.de> References: <536AE80C.7010006@will.to> <536B39AF.50602@hfg-gmuend.de> Message-ID: <536CE09B.2080507@hfg-gmuend.de> Release 151010is out http://omnios.omniti.com/wiki.php/ReleaseNotes#ReleaseNotes states: * ZFS performance and reliability improvements, especially with hole-y ZFS sends and receives. * ZFS bookmarks for easier sending and receiving of ZFS filesystems. * ZFS CLI improvements * NFS reliability improvements. Can anyone publish some details about these 4 improvements Gea -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.elling at richardelling.com Fri May 9 14:12:13 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Fri, 9 May 2014 07:12:13 -0700 Subject: [OmniOS-discuss] Move rpool from IDE mode to AHCI ? 
In-Reply-To: References: Message-ID: <0A800504-B5FB-465B-B812-49F0067758F4@RichardElling.com> Hi Svavar, > On May 9, 2014, at 2:27 AM, Svavar ?rn Eysteinsson wrote: > > Hello List. > > I recently installed and configured OmniOS on a HP Microserver N40L NAS box. > > In time to time my box hangs/freezes for some seconds, when I try to SSH into the box or execute some SMART-INFO functions. > And then it gives me access. > > In my /var/adm/messages i have these messages : > > May 8 20:10:41 blackbox scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci-ide at 14,1/ide at 0 (ata0): > May 8 20:10:41 blackbox timeout: abort request, target=1 lun=0 > May 8 20:10:41 blackbox scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci-ide at 14,1/ide at 0 (ata0): > May 8 20:10:41 blackbox timeout: abort device, target=1 lun=0 > May 8 20:10:41 blackbox scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci-ide at 14,1/ide at 0 (ata0): > May 8 20:10:41 blackbox timeout: reset target, target=1 lun=0 > May 8 20:10:41 blackbox scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci-ide at 14,1/ide at 0 (ata0): > May 8 20:10:41 blackbox timeout: reset bus, target=1 lun=0 > May 8 20:10:42 blackbox gda: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci-ide at 14,1/ide at 0/cmdk at 1,0 (Disk0): > May 8 20:10:42 blackbox Error for command 'read sector' Error Level: Informational > May 8 20:10:42 blackbox gda: [ID 107833 kern.notice] Sense Key: aborted command > May 8 20:10:42 blackbox gda: [ID 107833 kern.notice] Vendor 'Gen-ATA ' error code: 0x3 > > > My current configuration is the System(OmniOS) is installed onto a SATA disk located at port 5 on the Microserver. > (yes i have the BIOS hacked which allows for SATA 3G/AHCI functions on the port other ports) > Then I have 4x SATA disks configured in RAIDZ > > I double checked my BIOS configuration, and it seems that I have misconfigured the OS disk into a IDE mode, but not AHCI. > So when I change the mode to AHCI, I get the GRUB loader and the OmniOS logo after that, but suddenly the machines restarts again > and never boots into the system all the way. > > So my question is, is there any way for my to reconfigure my running-system to use AHCI instead of IDE,ATA mode ? > > Is it enough to boot up a usb live system and issue a "zpool import -f rpool", and then finally "zpool export rpool" with a running BIOS-config as AHCI > and then restart the computer and let it boot normally ? Yes. > I read somewhere that the import/export functions would update the device path, but I have no experience on this issue. Yes, this is the fix. > > Any help would be much appreciated. > > Thanks allot people. Enjoy! -- richard > > Best regards, > > Svavar O > Reykjavik - Iceland > > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss From lists at mcintyreweb.com Fri May 9 14:28:59 2014 From: lists at mcintyreweb.com (Hugh McIntyre) Date: Fri, 09 May 2014 07:28:59 -0700 Subject: [OmniOS-discuss] Move rpool from IDE mode to AHCI ? In-Reply-To: References: Message-ID: <536CE62B.8010003@mcintyreweb.com> On 5/9/14 2:27 AM, Svavar ?rn Eysteinsson wrote: > I double checked my BIOS configuration, and it seems that I have > misconfigured the OS disk into a IDE mode, but not AHCI. > So when I change the mode to AHCI, I get the GRUB loader and the OmniOS > logo after that, but suddenly the machines restarts again > and never boots into the system all the way. 
> > So my question is, is there any way for my to reconfigure my running-system > to use AHCI instead of IDE,ATA mode ? > > Is it enough to boot up a usb live system and issue a "zpool import -f > rpool", and then finally "zpool export rpool" with a running BIOS-config as > AHCI > and then restart the computer and let it boot normally ? I read somewhere > that the import/export functions would update the device path, but I have > no experience on this issue. I needed to do this last year, and the instructions from Richard Elling were close to what you asked (the same except change the BIOS mode first): 0. change BIOS setting to AHCI 1. boot livecd with shell login 2. import rpool 3. export rpool 4. boot The only extra complication I found is that if you have pools with cache devices then you need to remove/add the cache devices to make things happy, but otherwise this should work. Hugh. From mmabis at vmware.com Fri May 9 14:50:49 2014 From: mmabis at vmware.com (Matthew Mabis) Date: Fri, 9 May 2014 07:50:49 -0700 (PDT) Subject: [OmniOS-discuss] VAAI? In-Reply-To: <536CE62B.8010003@mcintyreweb.com> References: <536CE62B.8010003@mcintyreweb.com> Message-ID: <195708107.3090735.1399647049932.JavaMail.root@vmware.com> Hey All, I was wondering if there is any news on if OmniOS will support VAAI Extensions in the near future at all? Matt Mabis From danmcd at omniti.com Fri May 9 15:05:43 2014 From: danmcd at omniti.com (Dan McDonald) Date: Fri, 9 May 2014 11:05:43 -0400 Subject: [OmniOS-discuss] VAAI? In-Reply-To: <195708107.3090735.1399647049932.JavaMail.root@vmware.com> References: <536CE62B.8010003@mcintyreweb.com> <195708107.3090735.1399647049932.JavaMail.root@vmware.com> Message-ID: <2E9819D1-DD00-455D-BE70-C303BDE60622@omniti.com> On May 9, 2014, at 10:50 AM, Matthew Mabis wrote: > Hey All, > > I was wondering if there is any news on if OmniOS will support VAAI Extensions in the near future at all? Two of them: UNMAP and WRITE_SAME, have been in Illumos (and therefore OmniOS) since 2011. The last two: XCOPY and ATS, were developed by Nexenta, and were open-sourced as part of their NexentaStor 4 release recently. These would need to be upstreamed in to illumos proper at some point. If you're feeling rambunctious, you can see what it takes to extract 'em from https://github.com/Nexenta/illumos-nexenta/ and upstream them into the main illumos-gate. These were tricky, as COMSTAR isn't the most robust code to begin with, but Nexenta's had real-world experience with all of the VAAI primitives. Dan From daleg at omniti.com Fri May 9 22:02:34 2014 From: daleg at omniti.com (Dale Ghent) Date: Fri, 9 May 2014 18:02:34 -0400 Subject: [OmniOS-discuss] OmniOS 151010 improvements In-Reply-To: <536CE09B.2080507@hfg-gmuend.de> References: <536AE80C.7010006@will.to> <536B39AF.50602@hfg-gmuend.de> <536CE09B.2080507@hfg-gmuend.de> Message-ID: <1B442355-C107-45E3-AC37-C2B56B0B02EC@omniti.com> On May 9, 2014, at 10:05 AM, Guenther Alka wrote: > Release 151010 is out > > http://omnios.omniti.com/wiki.php/ReleaseNotes#ReleaseNotes states: > ? ZFS performance and reliability improvements, especially with hole-y ZFS sends and receives. > ? ZFS bookmarks for easier sending and receiving of ZFS filesystems. > ? ZFS CLI improvements > ? NFS reliability improvements. > > Can anyone publish some details about these 4 improvements Sure: > ? ZFS performance and reliability improvements, especially with hole-y ZFS sends and receives. 
It was discovered that the ZFS driver would steadily slow down ZFS sends and receives of images which contained large stretches of no data (ie, "hole-y?) Particular culprits (and the one which brought this to our attention) seemed to be images of zvols which host NTFS file systems. > ? ZFS bookmarks for easier sending and receiving of ZFS filesystems. Detailed here: https://www.illumos.org/issues/4369 and in the zfs(1M) man page. > ? ZFS CLI improvements Details: https://www.illumos.org/issues/3993 https://www.illumos.org/issues/4700 https://www.illumos.org/issues/4573 > ? NFS reliability improvements. Details: https://www.illumos.org/issues/4642 https://www.illumos.org/issues/4628 https://www.illumos.org/issues/4342 Related: https://www.illumos.org/issues/4483 https://www.illumos.org/issues/4575 /dale -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 494 bytes Desc: Message signed with OpenPGP using GPGMail URL: From johan.kragsterman at capvert.se Sat May 10 09:45:02 2014 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Sat, 10 May 2014 11:45:02 +0200 Subject: [OmniOS-discuss] move dump and swap to other pool In-Reply-To: <5CADD3E0-CC44-4480-A8CC-F129884A462B@omniti.com> References: <5CADD3E0-CC44-4480-A8CC-F129884A462B@omniti.com>, Message-ID: Hi! -----Dan McDonald skrev: ----- Till: Johan Kragsterman Fr?n: Dan McDonald Datum: 2014-05-08 16:37 Kopia: "OmniOS-discuss at lists.omniti.com" ?rende: Re: [OmniOS-discuss] move dump and swap to other pool On May 8, 2014, at 4:30 AM, Johan Kragsterman wrote: > > Hi! > > > I got a relatively small(20 GB) SLC SSD as rpool, and I would like to move the dump and swap devices to another pool. ATM the pkg update process isn't possible, because I got too little space left on the rpool. > > Can someone give me some advices here, how I would do this in the best way? "About dump: ?Unless you have r151008 or later, you can only use a zvol on a mirror or single-disk pool. ?Having said that: zfs create -V newpool/dump dumpadm -d /dev/zvol/newpool/dump And for swap (which should be good regardless what "newpool" is...): zfs create -V newpool/swap swap -a /dev/zvol/newpool/swap swap -d /dev/zvol/oldpool/swap" This all worked fine, but "swap -a /dev/zvol/newpool/swap" doesn't persist across reboots...I checked the swap(1M), but can't find anything that suggest how to handle that. Any advices? Regards Johan Hope this helps, Dan From jimklimov at cos.ru Sat May 10 10:04:05 2014 From: jimklimov at cos.ru (Jim Klimov) Date: Sat, 10 May 2014 12:04:05 +0200 Subject: [OmniOS-discuss] move dump and swap to other pool In-Reply-To: References: <5CADD3E0-CC44-4480-A8CC-F129884A462B@omniti.com> Message-ID: <32210936-7dac-41cc-818b-58495132f928@email.android.com> 10 ??? 2014??. 11:45:02 CEST, Johan Kragsterman ?????: > >Hi! > > > >-----Dan McDonald skrev: ----- >Till: Johan Kragsterman >Fr?n: Dan McDonald >Datum: 2014-05-08 16:37 >Kopia: "OmniOS-discuss at lists.omniti.com" > >?rende: Re: [OmniOS-discuss] move dump and swap to other pool > >On May 8, 2014, at 4:30 AM, Johan Kragsterman > wrote: > >> >> Hi! >> >> >> I got a relatively small(20 GB) SLC SSD as rpool, and I would like to >move the dump and swap devices to another pool. ATM the pkg update >process isn't possible, because I got too little space left on the >rpool. >> >> Can someone give me some advices here, how I would do this in the >best way? 
> > > > > > > >"About dump: ?Unless you have r151008 or later, you can only use a zvol >on a mirror or single-disk pool. ?Having said that: > >zfs create -V newpool/dump >dumpadm -d /dev/zvol/newpool/dump > >And for swap (which should be good regardless what "newpool" is...): > >zfs create -V newpool/swap >swap -a /dev/zvol/newpool/swap >swap -d /dev/zvol/oldpool/swap" > > > > > > > > >This all worked fine, but "swap -a /dev/zvol/newpool/swap" doesn't >persist across reboots...I checked the swap(1M), but can't find >anything that suggest how to handle that. > >Any advices? > >Regards Johan > > > > > > > >Hope this helps, >Dan > > > > >_______________________________________________ >OmniOS-discuss mailing list >OmniOS-discuss at lists.omniti.com >http://lists.omniti.com/mailman/listinfo/omnios-discuss Fix up /etc/vfstab -- Typos courtesy of K-9 Mail on my Samsung Android From mmabis at vmware.com Sun May 11 03:11:59 2014 From: mmabis at vmware.com (Matthew Mabis) Date: Sat, 10 May 2014 20:11:59 -0700 (PDT) Subject: [OmniOS-discuss] [developer] Kernel Panic - Possibly SMB? In-Reply-To: <1087943953.2562992.1399471315033.JavaMail.root@vmware.com> References: <63594316-D19C-4292-B406-C8BADE5ED398@gmail.com> <20140430184712.GA1017@joyent.com> <339412184.1866632.1399245773817.JavaMail.root@vmware.com> <1087943953.2562992.1399471315033.JavaMail.root@vmware.com> Message-ID: <626523312.3270212.1399777919653.JavaMail.root@vmware.com> Hey all, Had a repeat of this again, looks like its repeatable only when i pound the SMB Service a lot (had uploads/downloads from multiple boxes at the same time) (WIFI Mac Client downloading 5 different things [Large video files]) Had some uploading going on from two different servers and video streaming to an XBMC box. I am no good at troubleshooting these things, i am hoping someone could help me out i have attached the panic again. I use to do this all the time on the older versions of the software (using latest stable) and never hit this issue. I am wondering if an SMB Code change might be making this occur. If needed i can get you the dump file to help with the troubleshooting as i dont know MDB all that well any help would greatly be appreciated, it does seem whatever is causing this it requires that i pound the SMB Service a lot and then it becomes easily reproducible. 
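(For anyone who wants to dig into the dump themselves: the usual path from the fault report to a panic stack is roughly the sequence below. This is only a sketch; the dump directory and file name are the ones reported in the fault record that follows, /var/crash/unknown and vmdump.0, and the saved instance number may differ on your box.)

# verbose fault record; the UUID comes from a plain 'fmdump' listing
fmdump -Vp -u <uuid>
# expand the compressed dump into unix.N/vmcore.N, then open it and print the panic stack
savecore -f /var/crash/unknown/vmdump.0
mdb -k /var/crash/unknown/unix.0 /var/crash/unknown/vmcore.0
> ::status
> ::stack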
FMDUMP Info nvlist version: 0 version = 0x0 class = list.suspect uuid = 331b57b0-7aa4-e892-b710-ae1c59c6fe6a code = SUNOS-8000-KL diag-time = 1399776957 591111 de = fmd:///module/software-diagnosis fault-list-sz = 0x1 fault-list = (array of embedded nvlists) (start fault-list[0]) nvlist version: 0 version = 0x0 class = defect.sunos.kernel.panic certainty = 0x64 asru = sw:///:path=/var/crash/unknown/.331b57b0-7aa4-e892-b710-ae1c59c6fe6a resource = sw:///:path=/var/crash/unknown/.331b57b0-7aa4-e892-b710-ae1c59c6fe6a savecore-succcess = 1 dump-dir = /var/crash/unknown dump-files = vmdump.0 os-instance-uuid = 331b57b0-7aa4-e892-b710-ae1c59c6fe6a panicstr = BAD TRAP: type=e (#pf Page fault) rp=ffffff0011a03450 addr=ffffff03197b10c8 panicstack = unix:real_mode_stop_cpu_stage2_end+9de3 () | unix:trap+db3 () | unix:cmntrap+e6 () | smbsrv:smb_fsop_lookup+118 () | smbsrv:smb_common_rename+d9 () | smbsrv:smb_trans2_rename+136 () | smbsrv:smb_set_rename_info+b8 () | smbsrv:smb_set_fileinfo+ed () | smbsrv:smb_set_by_fid+b0 () | smbsrv:smb_com_trans2_set_file_information+58 () | smbsrv:smb_trans2_dispatch+313 () | smbsrv:smb_com_transaction2+1a7 () | smbsrv:smb_dispatch_request+662 () | smbsrv:smb_session_worker+a0 () | genunix:taskq_d_thread+b7 () | unix:thread_start+8 () | crashtime = 1399776856 panic-time = Sat May 10 20:54:16 2014 MDT (end fault-list[0]) fault-status = 0x1 severity = Major __ttl = 0x1 __ tod = 0x536ee6bd 0x241b4d28 Here is the ::Stack smb_fsop_lookup+0x118(ffffff03197b0990, ffffff02dbfca018, 0, ffffff03197b0db8, ffffff02da252970, ffffff03197b0ea8) smb_common_rename+0xd9(ffffff03197b0990, ffffff03197b0bc8, ffffff03197b0db8) smb_trans2_rename+0x136(ffffff03197b0990, ffffff02e60c9ca0, ffffff02dc90b628, 1) smb_set_rename_info+0xb8(ffffff03197b0990, ffffff0011a03840) smb_set_fileinfo+0xed(ffffff03197b0990, ffffff0011a03840) smb_set_by_fid+0xb0(ffffff03197b0990, ffffff030f2ff400, 3f2) smb_com_trans2_set_file_information+0x58(ffffff03197b0990, ffffff030f2ff400) smb_trans2_dispatch+0x313(ffffff03197b0990, ffffff030f2ff400) smb_com_transaction2+0x1a7(ffffff03197b0990) smb_dispatch_request+0x662(ffffff03197b0990) smb_session_worker+0xa0(ffffff03197b0990) taskq_d_thread+0xb7(ffffff0331430760) thread_start+8() Matt Mabis ----- Original Message ----- From: "Matthew Mabis" To: "Gordon Ross" Cc: developer at lists.illumos.org, "omnios-discuss" Sent: Wednesday, May 7, 2014 8:01:55 AM Subject: Re: [OmniOS-discuss] [developer] Kernel Panic - Possibly SMB? I haven't been able to reproduce this issue, I believe a NC Replication (From Different Pools on the Same Host) was going on while SMB was being actively used. How would i look at the " In particular, look at the smb_xa_t it's working on, and the param+data buffers. Are they all there?" Any help would be greatly appreciated! Matt Mabis ----- Original Message ----- From: "Gordon Ross" To: "Matthew Mabis" Cc: developer at lists.illumos.org, "omnios-discuss" Sent: Tuesday, May 6, 2014 9:07:23 AM Subject: Re: [developer] Kernel Panic - Possibly SMB? Is this reproducible? Can you show any more detail about the data structure being operated on when it panic's? In particular, look at the smb_xa_t it's working on, and the param+data buffers. Are they all there? 
This is just a guess, but I suspect you may have stumbled upon an instance of the bug fixed here: https://github.com/Nexenta/illumos-nexenta/commit/803bd0af2c4842f440c58a8ab2c7b52f4171145d Gordon On Sun, May 4, 2014 at 7:22 PM, Matthew Mabis < mmabis at vmware.com > wrote: Hey all wondering if someone could help me figure out what just happened, i had a kernel panic that i dont understand what caused it based on what i see in the stack it might have been SMB, if you need more like the dump file itself let me know! Any help would greatly be appreciated! root at destiny:~# fmdump -Vp -u ae3350e8-17d5-4701-ecc6-b46fbaf1d04b |more TIME UUID SUNW-MSG-ID May 04 2014 17:16:03.161236000 ae3350e8-17d5-4701-ecc6-b46fbaf1d04b SUNOS-8000-KL TIME CLASS ENA May 04 17:16:03.1244 ireport.os.sunos.panic.dump_available 0x0000000000000000 May 04 17:15:23.3085 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000 nvlist version: 0 version = 0x0 class = list.suspect uuid = ae3350e8-17d5-4701-ecc6-b46fbaf1d04b code = SUNOS-8000-KL diag-time = 1399245363 130378 de = fmd:///module/software-diagnosis fault-list-sz = 0x1 fault-list = (array of embedded nvlists) (start fault-list[0]) nvlist version: 0 version = 0x0 class = defect.sunos.kernel.panic certainty = 0x64 asru = sw:///:path=/var/crash/unknown/.ae3350e8-17d5-4701-ecc6-b46fbaf1d04b resource = sw:///:path=/var/crash/unknown/.ae3350e8-17d5-4701-ecc6-b46fbaf1d04b savecore-succcess = 1 dump-dir = /var/crash/unknown dump-files = vmdump.0 os-instance-uuid = ae3350e8-17d5-4701-ecc6-b46fbaf1d04b panicstr = BAD TRAP: type=e (#pf Page fault) rp=ffffff001e907450 addr=ffffff05ce4c90d8 panicstack = unix:real_mode_stop_cpu_stage2_end+9de3 () | unix:trap+db3 () | unix:cmntrap+e6 () | smbsrv:smb_fsop_lookup+118 () | smbsrv:smb_common_rename+d9 () | smbsrv:smb_tr ans2_rename+136 () | smbsrv:smb_set_rename_info+b8 () | smbsrv:smb_set_fileinfo+ed () | smbsrv:smb_set_by_fid+b0 () | smbsrv:smb_com_trans2_set_file_information+58 () | smbsrv:smb_trans2_dispa tch+313 () | smbsrv:smb_com_transaction2+1a7 () | smbsrv:smb_dispatch_request+662 () | smbsrv:smb_session_worker+a0 () | genunix:taskq_d_thread+b7 () | unix:thread_start+8 () | crashtime = 1399244984 panic-time = Sun May 4 17:09:44 2014 MDT (end fault-list[0]) severity = Major __ttl = 0x1 __tod = 0x5366ca33 0x99c4420 Matt Mabis illumos-developer | Archives | Modify Your Subscription _______________________________________________ OmniOS-discuss mailing list OmniOS-discuss at lists.omniti.com https://urldefense.proofpoint.com/v1/url?u=http://lists.omniti.com/mailman/listinfo/omnios-discuss&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=yqgQ6LhGnfWMd79QvLrmWsnr%2FlpWj5c0oy4MpT8%2Bgik%3D%0A&m=kP68dpV0mRMFNohpnv8r%2FJF7j4%2FzOwVtv5SSakIqIiQ%3D%0A&s=3be7b320dd64a6f1a310ade277f3028203e77cef9da0e743852773f97be078c2 -------------- next part -------------- An HTML attachment was scrubbed... URL: From natxo.asenjo at gmail.com Sun May 11 19:00:08 2014 From: natxo.asenjo at gmail.com (Natxo Asenjo) Date: Sun, 11 May 2014 21:00:08 +0200 Subject: [OmniOS-discuss] compiling mediatomb #error "non-amd64 code depends on amd64 privileged header!" Message-ID: hi, trying to use my home omnios server for streaming media I get this error when making mediatomb: g++ -DHAVE_CONFIG_H -I. -I.. -I../tombupnp/upnp/inc -I../src -I../tombupnp/ixml/inc -I../tombupnp/threadutil/inc -I../tombupnp/upnp/inc -I.. 
-I/usr/local/include -D_REENTRANT -pthreads -I/usr/include/amd64 -g -O2 -MT libmediatomb_a-action_request.o -MD -MP -MF .deps/libmediatomb_a-action_request.Tpo -c -o libmediatomb_a-action_request.o `test -f '../src/action_request.cc' || echo './'`../src/action_request.cc In file included from /usr/include/sys/regset.h:420:0, from /usr/include/sys/ucontext.h:36, from /usr/include/sys/signal.h:245, from /usr/include/sys/procset.h:42, from /usr/include/sys/wait.h:43, from /usr/include/stdlib.h:40, from ../src/memory.h:35, from ../src/common.h:36, from ../src/action_request.h:36, from ../src/action_request.cc:36: /usr/include/amd64/sys/privregs.h:42:2: error: #error "non-amd64 code depends on amd64 privileged header!" #error "non-amd64 code depends on amd64 privileged header!" ^ make[2]: *** [libmediatomb_a-action_request.o] Error 1 make[2]: Leaving directory `/root/mediatomb-0.12.1/build' make[1]: *** [all-recursive] Error 1 make[1]: Leaving directory `/root/mediatomb-0.12.1' make: *** [all] Error 2 There are quite a few hits on google regarding this #error "non-amd64 code depends on amd64 privileged header!" on omnios, but so far I could not find a solution. I am building this on a zone with gcc-4.8.1. Any help greatly appreciated. -- Groeten, natxo -------------- next part -------------- An HTML attachment was scrubbed... URL: From danmcd at omniti.com Sun May 11 19:35:25 2014 From: danmcd at omniti.com (Dan McDonald) Date: Sun, 11 May 2014 15:35:25 -0400 Subject: [OmniOS-discuss] compiling mediatomb #error "non-amd64 code depends on amd64 privileged header!" In-Reply-To: References: Message-ID: <7CBA1524-82CD-4B2C-929B-E0DFEFF6E15D@omniti.com> You are trying to build a 64-but object , right? Are you adding -m64 to the compiler flags? Dan Sent from my iPhone (typos, autocorrect, and all) > On May 11, 2014, at 3:00 PM, Natxo Asenjo wrote: > > hi, > > trying to use my home omnios server for streaming media I get this error when making mediatomb: > > g++ -DHAVE_CONFIG_H -I. -I.. -I../tombupnp/upnp/inc -I../src -I../tombupnp/ixml/inc -I../tombupnp/threadutil/inc -I../tombupnp/upnp/inc -I.. -I/usr/local/include -D_REENTRANT -pthreads -I/usr/include/amd64 -g -O2 -MT libmediatomb_a-action_request.o -MD -MP -MF .deps/libmediatomb_a-action_request.Tpo -c -o libmediatomb_a-action_request.o `test -f '../src/action_request.cc' || echo './'`../src/action_request.cc > In file included from /usr/include/sys/regset.h:420:0, > from /usr/include/sys/ucontext.h:36, > from /usr/include/sys/signal.h:245, > from /usr/include/sys/procset.h:42, > from /usr/include/sys/wait.h:43, > from /usr/include/stdlib.h:40, > from ../src/memory.h:35, > from ../src/common.h:36, > from ../src/action_request.h:36, > from ../src/action_request.cc:36: > /usr/include/amd64/sys/privregs.h:42:2: error: #error "non-amd64 code depends on amd64 privileged header!" > #error "non-amd64 code depends on amd64 privileged header!" > ^ > make[2]: *** [libmediatomb_a-action_request.o] Error 1 > make[2]: Leaving directory `/root/mediatomb-0.12.1/build' > make[1]: *** [all-recursive] Error 1 > make[1]: Leaving directory `/root/mediatomb-0.12.1' > make: *** [all] Error 2 > > There are quite a few hits on google regarding this #error "non-amd64 code depends on amd64 privileged header!" on omnios, but so far I could not find a solution. > > I am building this on a zone with gcc-4.8.1. > > Any help greatly appreciated. 
> > -- > Groeten, > natxo > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss From natxo.asenjo at gmail.com Sun May 11 19:57:26 2014 From: natxo.asenjo at gmail.com (Natxo Asenjo) Date: Sun, 11 May 2014 21:57:26 +0200 Subject: [OmniOS-discuss] compiling mediatomb #error "non-amd64 code depends on amd64 privileged header!" In-Reply-To: <7CBA1524-82CD-4B2C-929B-E0DFEFF6E15D@omniti.com> References: <7CBA1524-82CD-4B2C-929B-E0DFEFF6E15D@omniti.com> Message-ID: On Sun, May 11, 2014 at 9:35 PM, Dan McDonald wrote: > You are trying to build a 64-but object , right? Are you adding -m64 to > the compiler flags? > > I just tried adding -m64 to the Makefile but the error is the same: # grep m64 Makefile CFLAGS = -g -O2 -m64 Or should I do it differently? I am not really sure ... Thanks, -- regards, natxo -------------- next part -------------- An HTML attachment was scrubbed... URL: From natxo.asenjo at gmail.com Mon May 12 08:50:01 2014 From: natxo.asenjo at gmail.com (Natxo Asenjo) Date: Mon, 12 May 2014 10:50:01 +0200 Subject: [OmniOS-discuss] compiling mediatomb #error "non-amd64 code depends on amd64 privileged header!" In-Reply-To: References: Message-ID: hi, Tonight I'll try that, I suppose i should look in the Makefile? regards, natxo -- Groeten, natxo On Sun, May 11, 2014 at 11:02 PM, Dan McDonald wrote: > Okay. Our headers are 64-bit aware, maybe lose -I/usr/include/amd64 ? > > Dan > > Sent from my iPhone (typos, autocorrect, and all) > > > On May 11, 2014, at 3:00 PM, Natxo Asenjo > wrote: > > > > hi, > > > > trying to use my home omnios server for streaming media I get this error > when making mediatomb: > > > > g++ -DHAVE_CONFIG_H -I. -I.. -I../tombupnp/upnp/inc -I../src > -I../tombupnp/ixml/inc -I../tombupnp/threadutil/inc -I../tombupnp/upnp/inc > -I.. -I/usr/local/include -D_REENTRANT -pthreads > -I/usr/include/amd64 -g -O2 -MT libmediatomb_a-action_request.o -MD -MP > -MF .deps/libmediatomb_a-action_request.Tpo -c -o > libmediatomb_a-action_request.o `test -f '../src/action_request.cc' || echo > './'`../src/action_request.cc > > In file included from /usr/include/sys/regset.h:420:0, > > from /usr/include/sys/ucontext.h:36, > > from /usr/include/sys/signal.h:245, > > from /usr/include/sys/procset.h:42, > > from /usr/include/sys/wait.h:43, > > from /usr/include/stdlib.h:40, > > from ../src/memory.h:35, > > from ../src/common.h:36, > > from ../src/action_request.h:36, > > from ../src/action_request.cc:36: > > /usr/include/amd64/sys/privregs.h:42:2: error: #error "non-amd64 code > depends on amd64 privileged header!" > > #error "non-amd64 code depends on amd64 privileged header!" > > ^ > > make[2]: *** [libmediatomb_a-action_request.o] Error 1 > > make[2]: Leaving directory `/root/mediatomb-0.12.1/build' > > make[1]: *** [all-recursive] Error 1 > > make[1]: Leaving directory `/root/mediatomb-0.12.1' > > make: *** [all] Error 2 > > > > There are quite a few hits on google regarding this #error "non-amd64 > code depends on amd64 privileged header!" on omnios, but so far I could not > find a solution. > > > > I am building this on a zone with gcc-4.8.1. > > > > Any help greatly appreciated. 
> > > > -- > > Groeten, > natxo > > _______________________________________________ > > OmniOS-discuss mailing list > > OmniOS-discuss at lists.omniti.com > > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lotheac at iki.fi Mon May 12 09:37:22 2014 From: lotheac at iki.fi (Lauri Tirkkonen) Date: Mon, 12 May 2014 12:37:22 +0300 Subject: [OmniOS-discuss] compiling mediatomb #error "non-amd64 code depends on amd64 privileged header!" In-Reply-To: References: <7CBA1524-82CD-4B2C-929B-E0DFEFF6E15D@omniti.com> Message-ID: <20140512093722.GA29898@gutsman.lotheac.fi> On Sun, May 11 2014 21:57:26 +0200, Natxo Asenjo wrote: > # grep m64 Makefile > CFLAGS = -g -O2 -m64 > > Or should I do it differently? I am not really sure ... It depends on the build system of the software you're trying to build. Your make output before included g++, so it is likely that you need to add -m64 to CXXFLAGS as well. -- Lauri Tirkkonen | +358 50 5341376 | lotheac @ IRCnet From johan.kragsterman at capvert.se Mon May 12 11:13:41 2014 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Mon, 12 May 2014 13:13:41 +0200 Subject: [OmniOS-discuss] fmdump help? Message-ID: Hi! Got some fmdump issues here that I would appreciate someone helping me diagnose. The system is, as you can see, a Dell T5500 workstation, equipped with dual Xeon L5520 with HT enabled, and 36 GB of RAM. The integrated bge NIC on the mobo is disabled, and I use a quad-port GbE Intel NIC in the PCI-X slot. Got the rpool on an Intel SLC SSD on the motherboard's integrated SATA controller. Got a Dell H200 flashed to IT f/w (LSI2008), which announces itself as a Dell 6 Gb HBA, connected to two Seagate ST4000VN000 drives, and a Samsung 840 EVO SSD as an L2ARC device. root at omni:~# fmdump -p TIME UUID SUNW-MSG-ID EVENT maj 01 13:16:25.9491 bf630a54-1d96-6b2b-e6e9-e3347c1ba7f3 ZFS-8000-D3 Diagnosed maj 10 21:49:13.8088 431d3b05-328c-4ec2-d83a-f58a006ea156 SUNOS-8000-J0 Diagnosed maj 10 21:49:14.0433 f0a4a159-daf5-41c9-b948-d68055fb5a48 SUNOS-8000-J0 Diagnosed maj 10 21:49:14.6796 87a8a141-fa1f-6bed-f25d-b467e130c85d PCIEX-8000-43 Diagnosed Of the fmdumps, the last three from May 10 are the ones I'm interested in, and I chose to display two of them here. One is severity "Major", and the other one is "Critical". I see some "defect.sunos.eft.unexpected_telemetry", "class and path are incompatible" and "fault.io.pci.bus-linkerr". Unfortunately, though, I can't tell what it means.
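(The two records below are per-UUID dumps from fmdump -V. For completeness, the overall fault summary and the raw error telemetry behind such diagnoses can be listed with the same tooling; this is just the generic usage, not output from this box.)

# summary of everything FMA currently considers faulty
fmadm faulty
# raw ereports from the error log, including the detector device paths
fmdump -eV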
root at omni:~# fmdump -V -u 431d3b05-328c-4ec2-d83a-f58a006ea156 TIME UUID SUNW-MSG-ID maj 10 2014 21:49:13.808892000 431d3b05-328c-4ec2-d83a-f58a006ea156 SUNOS-8000-J0 TIME CLASS ENA maj 10 21:47:03.1897 ereport.io.pcix.unex-spl 0x32b407bf59f01001 nvlist version: 0 version = 0x0 class = list.suspect uuid = 431d3b05-328c-4ec2-d83a-f58a006ea156 code = SUNOS-8000-J0 diag-time = 1399751353 665690 de = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = fmd authority = (embedded nvlist) nvlist version: 0 version = 0x0 product-id = Precision-WorkStation-T5500 chassis-id = 17BPY4J server-id = omni (end authority) mod-name = eft mod-version = 1.16 (end de) fault-list-sz = 0x2 fault-list = (array of embedded nvlists) (start fault-list[0]) nvlist version: 0 version = 0x0 class = defect.sunos.eft.unexpected_telemetry certainty = 0x32 resource = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = hc hc-root = authority = (embedded nvlist) nvlist version: 0 product-id = Precision-WorkStation-T5500 server-id = omni chassis-id = 17BPY4J (end authority) hc-list-sz = 0x6 hc-list = (array of embedded nvlists) (start hc-list[0]) nvlist version: 0 hc-name = motherboard hc-id = 0 (end hc-list[0]) (start hc-list[1]) nvlist version: 0 hc-name = hostbridge hc-id = 0 (end hc-list[1]) (start hc-list[2]) nvlist version: 0 hc-name = pciexrc hc-id = 0 (end hc-list[2]) (start hc-list[3]) nvlist version: 0 hc-name = pciexbus hc-id = 1 (end hc-list[3]) (start hc-list[4]) nvlist version: 0 hc-name = pciexdev hc-id = 0 (end hc-list[4]) (start hc-list[5]) nvlist version: 0 hc-name = pciexfn hc-id = 0 (end hc-list[5]) (end resource) reason = ereport.io.pcix.unex-spl at motherboard0/hostbridge0/pciexrc0/pciexbus1/pciexdev0/pciexfn0 class and path are incompatible retire = 0 response = 0 asru = (embedded nvlist) nvlist version: 0 scheme = mod version = 0x0 mod-id = 86 mod-name = pcieb mod-desc = PCIe bridge/switch driver (end asru) (end fault-list[0]) (start fault-list[1]) nvlist version: 0 version = 0x0 class = fault.sunos.eft.unexpected_telemetry certainty = 0x32 resource = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = hc hc-root = authority = (embedded nvlist) nvlist version: 0 product-id = Precision-WorkStation-T5500 server-id = omni chassis-id = 17BPY4J (end authority) hc-list-sz = 0x6 hc-list = (array of embedded nvlists) (start hc-list[0]) nvlist version: 0 hc-name = motherboard hc-id = 0 (end hc-list[0]) (start hc-list[1]) nvlist version: 0 hc-name = hostbridge hc-id = 0 (end hc-list[1]) (start hc-list[2]) nvlist version: 0 hc-name = pciexrc hc-id = 0 (end hc-list[2]) (start hc-list[3]) nvlist version: 0 hc-name = pciexbus hc-id = 1 (end hc-list[3]) (start hc-list[4]) nvlist version: 0 hc-name = pciexdev hc-id = 0 (end hc-list[4]) (start hc-list[5]) nvlist version: 0 hc-name = pciexfn hc-id = 0 (end hc-list[5]) (end resource) reason = ereport.io.pcix.unex-spl at motherboard0/hostbridge0/pciexrc0/pciexbus1/pciexdev0/pciexfn0 class and path are incompatible retire = 0 response = 0 asru = (embedded nvlist) nvlist version: 0 scheme = dev version = 0x0 device-path = /pci at 0,0/pci8086,3408 at 1/pci12d8,e130 at 0 (end asru) fru = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = hc hc-root = authority = (embedded nvlist) nvlist version: 0 product-id = Precision-WorkStation-T5500 server-id = omni chassis-id = 17BPY4J (end authority) hc-list = (array of embedded nvlists) (start hc-list[0]) nvlist version: 0 hc-name = motherboard hc-id = 0 (end hc-list[0]) (end fru) location = MB (end 
fault-list[1]) fault-status = 0x1 0x1 severity = Major __ttl = 0x1 __tod = 0x536e82b9 0x3036b660 root at omni:~# fmdump -V -u 87a8a141-fa1f-6bed-f25d-b467e130c85d TIME UUID SUNW-MSG-ID maj 10 2014 21:49:14.679652000 87a8a141-fa1f-6bed-f25d-b467e130c85d PCIEX-8000-43 TIME CLASS ENA maj 10 21:47:03.1897 ereport.io.pci.dpe 0x32b407c3fd901001 maj 10 21:47:03.1897 ereport.io.pci.sserr 0x32b407c3fd901001 maj 10 21:47:03.1897 ereport.io.pciex.bdg.sec-serr 0x32b407bf59f01001 maj 10 21:47:03.1897 ereport.io.pci.sec-rserr 0x32b407bf59f01001 maj 10 21:47:03.1897 ereport.io.pciex.rc.fe-msg 0x32b407bb51101001 maj 10 21:47:03.1897 ereport.io.pci.sec-rserr 0x32b407bb51101001 nvlist version: 0 version = 0x0 class = list.suspect uuid = 87a8a141-fa1f-6bed-f25d-b467e130c85d code = PCIEX-8000-43 diag-time = 1399751354 555330 de = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = fmd authority = (embedded nvlist) nvlist version: 0 version = 0x0 product-id = Precision-WorkStation-T5500 chassis-id = 17BPY4J server-id = omni (end authority) mod-name = eft mod-version = 1.16 (end de) fault-list-sz = 0x2 fault-list = (array of embedded nvlists) (start fault-list[0]) nvlist version: 0 version = 0x0 class = fault.io.pciex.device-interr certainty = 0x43 resource = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = hc hc-root = authority = (embedded nvlist) nvlist version: 0 product-id = Precision-WorkStation-T5500 server-id = omni chassis-id = 17BPY4J (end authority) hc-list-sz = 0x6 hc-list = (array of embedded nvlists) (start hc-list[0]) nvlist version: 0 hc-name = motherboard hc-id = 0 (end hc-list[0]) (start hc-list[1]) nvlist version: 0 hc-name = hostbridge hc-id = 0 (end hc-list[1]) (start hc-list[2]) nvlist version: 0 hc-name = pciexrc hc-id = 0 (end hc-list[2]) (start hc-list[3]) nvlist version: 0 hc-name = pciexbus hc-id = 1 (end hc-list[3]) (start hc-list[4]) nvlist version: 0 hc-name = pciexdev hc-id = 0 (end hc-list[4]) (start hc-list[5]) nvlist version: 0 hc-name = pciexfn hc-id = 0 (end hc-list[5]) (end resource) asru = (embedded nvlist) nvlist version: 0 scheme = dev version = 0x0 device-path = /pci at 0,0/pci8086,3408 at 1/pci12d8,e130 at 0 (end asru) fru = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = hc hc-root = authority = (embedded nvlist) nvlist version: 0 product-id = Precision-WorkStation-T5500 server-id = omni chassis-id = 17BPY4J (end authority) hc-list = (array of embedded nvlists) (start hc-list[0]) nvlist version: 0 hc-name = motherboard hc-id = 0 (end hc-list[0]) (end fru) location = MB (end fault-list[0]) (start fault-list[1]) nvlist version: 0 version = 0x0 class = fault.io.pci.bus-linkerr certainty = 0x21 resource = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = hc hc-root = authority = (embedded nvlist) nvlist version: 0 product-id = Precision-WorkStation-T5500 server-id = omni chassis-id = 17BPY4J (end authority) hc-list-sz = 0x7 hc-list = (array of embedded nvlists) (start hc-list[0]) nvlist version: 0 hc-name = motherboard hc-id = 0 (end hc-list[0]) (start hc-list[1]) nvlist version: 0 hc-name = hostbridge hc-id = 0 (end hc-list[1]) (start hc-list[2]) nvlist version: 0 hc-name = pciexrc hc-id = 0 (end hc-list[2]) (start hc-list[3]) nvlist version: 0 hc-name = pciexbus hc-id = 1 (end hc-list[3]) (start hc-list[4]) nvlist version: 0 hc-name = pciexdev hc-id = 0 (end hc-list[4]) (start hc-list[5]) nvlist version: 0 hc-name = pciexfn hc-id = 0 (end hc-list[5]) (start hc-list[6]) nvlist version: 0 hc-name = pcibus hc-id = 2 (end hc-list[6]) (end 
resource) asru = (embedded nvlist) nvlist version: 0 scheme = dev version = 0x0 device-path = /pci at 0,0/pci8086,3408 at 1/pci12d8,e130 at 0 (end asru) fru = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = hc hc-root = authority = (embedded nvlist) nvlist version: 0 product-id = Precision-WorkStation-T5500 server-id = omni chassis-id = 17BPY4J (end authority) hc-list = (array of embedded nvlists) (start hc-list[0]) nvlist version: 0 hc-name = motherboard hc-id = 0 (end hc-list[0]) (end fru) location = MB (end fault-list[1]) fault-status = 0x1 0x1 severity = Critical __ttl = 0x1 __tod = 0x536e82ba 0x2882aaa0 root at omni:~# Best regards from/Med v?nliga h?lsningar fr?n Johan Kragsterman Capvert From johan.kragsterman at capvert.se Mon May 12 12:46:04 2014 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Mon, 12 May 2014 14:46:04 +0200 Subject: [OmniOS-discuss] Ang: fmdump help? In-Reply-To: References: Message-ID: Hi again! Got some more info about what I wrote last. Is this a hardware problem? I did some dtrace of the dump, and got this: root at omni:/var/crash/unknown# savecore -f /var/crash/unknown/vmdump.1 savecore: System dump time: Sat May 10 21:47:04 2014 savecore: saving system crash dump in /var/crash/unknown/{unix,vmcore}.1 Constructing namelist /var/crash/unknown/unix.1 Constructing corefile /var/crash/unknown/vmcore.1 0:41 100% done: 607251 of 607251 pages saved root at omni:/var/crash/unknown# mdb -k unix.1 vmcore.1 Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc pcplusmp scsi_vhci zfs sata sd ip hook neti sockfs arp usba uhci stmf stmf_sbd md lofs mpt_sas random idm nfs crypto ptm kvm cpc smbsrv ufs logindmux nsmb ] > ::status debugging crash dump vmcore.1 (64-bit) from omni operating system: 5.11 omnios-8c08411 (i86pc) image uuid: e43a2059-c9b8-e592-b307-f05eafbbe15b panic message: pcieb-0: PCI(-X) Express Fatal Error. 
(0x145) dump content: kernel pages only > ::stack vpanic() pcieb_intr_handler+0x1c9(ffffff0a1da39830, 0) av_dispatch_autovect+0x95(49) dispatch_hardint+0x36(49, 0) switch_sp_and_call+0x13() do_interrupt+0xa8(ffffff0047e9d110, fffffe03e383e000) _interrupt+0xba() htable_lookup+0x73(ffffff0a08ecce78, fffffe03e383e000, 1) htable_getpte+0x58(ffffff0a08ecce78, fffffe03e383e000, ffffff0047e9d2ec, ffffff0047e9d2e0, 1) htable_getpage+0x30(ffffff0a08ecce78, fffffe03e383e000, ffffff0047e9d34c) hat_getpfnum+0x71(ffffff0a08ecce78, fffffe03e383e000) kvm_va2pa+0x1b() mmu_alloc_roots+0xaa() kvm_mmu_load+0x40() kvm_mmu_reload+0x18() vcpu_enter_guest+0x68() __vcpu_run+0x8b() kvm_arch_vcpu_ioctl_run+0x112() kvm_ioctl+0x466() cdev_ioctl+0x39(10800000005, 2000ae80, 0, 202003, ffffff0a2c4995e8, ffffff0047e9dea8) spec_ioctl+0x60(ffffff0a2c875380, 2000ae80, 0, 202003, ffffff0a2c4995e8, ffffff0047e9dea8) fop_ioctl+0x55(ffffff0a2c875380, 2000ae80, 0, 202003, ffffff0a2c4995e8, ffffff0047e9dea8) ioctl+0x9b(d, 2000ae80, 0) sys_syscall+0x17a() > ::msgbuf MESSAGE vcpu 7 received sipi with vector # 10 vcpu 6 received sipi with vector # 10 kvm_lapic_reset: vcpu=ffffff0a38b5a000, id=2, base_msr= fee00800 PRIx64 base_addre ss=fee00000 kvm_lapic_reset: vcpu=ffffff0a38b52000, id=3, base_msr= fee00800 PRIx64 base_addre ss=fee00000 kvm_lapic_reset: vcpu=ffffff0a38b4a000, id=4, base_msr= fee00800 PRIx64 base_addre ss=fee00000 kvm_lapic_reset: vcpu=ffffff0a38ba2000, id=5, base_msr= fee00800 PRIx64 base_addre ss=fee00000 kvm_lapic_reset: vcpu=ffffff0a38b92000, id=7, base_msr= fee00800 PRIx64 base_addre ss=fee00000 kvm_lapic_reset: vcpu=ffffff0a38b9a000, id=6, base_msr= fee00800 PRIx64 base_addre ss=fee00000 unhandled wrmsr: 0x0 data 0 vcpu 1 received sipi with vector # 98 kvm_lapic_reset: vcpu=ffffff0a38b62000, id=1, base_msr= fee00800 PRIx64 base_addre ss=fee00000 vcpu 2 received sipi with vector # 98 kvm_lapic_reset: vcpu=ffffff0a38b5a000, id=2, base_msr= fee00800 PRIx64 base_addre ss=fee00000 vcpu 3 received sipi with vector # 98 kvm_lapic_reset: vcpu=ffffff0a38b52000, id=3, base_msr= fee00800 PRIx64 base_addre ss=fee00000 vcpu 4 received sipi with vector # 98 kvm_lapic_reset: vcpu=ffffff0a38b4a000, id=4, base_msr= fee00800 PRIx64 base_address=f ee00000 vcpu 5 received sipi with vector # 98 kvm_lapic_reset: vcpu=ffffff0a38ba2000, id=5, base_msr= fee00800 PRIx64 base_address=f ee00000 vcpu 6 received sipi with vector # 98 kvm_lapic_reset: vcpu=ffffff0a38b9a000, id=6, base_msr= fee00800 PRIx64 base_address=f ee00000 vcpu 7 received sipi with vector # 98 kvm_lapic_reset: vcpu=ffffff0a38b92000, id=7, base_msr= fee00800 PRIx64 base_address=f ee00000 kvm_lapic_reset: vcpu=ffffff0a38ba2000, id=0, base_msr= fee00100 PRIx64 base_address=f ee00000 vmcs revision_id = e kvm_lapic_reset: vcpu=ffffff0a38b4a000, id=1, base_msr= fee00000 PRIx64 base_address=f ee00000 vmcs revision_id = e unhandled wrmsr: 0x1010101 data fffffd7fffdfe870 unhandled wrmsr: 0x1010101 data fffffd7fffdfe870 unhandled wrmsr: 0xff318d0c data fffffd7fffdfe840 unhandled wrmsr: 0xff318d0c data fffffd7fffdfe840 unhandled wrmsr: 0xffdfef38 data 3000001a4 unhandled wrmsr: 0xffdfef38 data 3000001a4 vcpu 1 received sipi with vector # 10 kvm_lapic_reset: vcpu=ffffff0a38b4a000, id=1, base_msr= fee00800 PRIx64 base_address=f ee00000 unhandled rdmsr: 0x756e6547 unhandled wrmsr: 0x0 data 6c65746e756e6547 vcpu 1 received sipi with vector # 9f kvm_lapic_reset: vcpu=ffffff0a38b4a000, id=1, base_msr= fee00800 PRIx64 base_address=f ee00000 kvm_lapic_reset: vcpu=ffffff0a38b52000, id=0, 
base_msr= fee00100 PRIx64 base_address=f ee00000 vmcs revision_id = e kvm_lapic_reset: vcpu=ffffff0a38b5a000, id=1, base_msr= fee00000 PRIx64 base_address=f ee00000 vmcs revision_id = e kvm_lapic_reset: vcpu=ffffff0a38b62000, id=2, base_msr= fee00000 PRIx64 base_address=f ee00000 vmcs revision_id = e kvm_lapic_reset: vcpu=ffffff0a384e9000, id=3, base_msr= fee00000 PRIx64 base_address=f ee00000 vmcs revision_id = e kvm_lapic_reset: vcpu=ffffff0a4b942000, id=4, base_msr= fee00000 PRIx64 base_address=f ee00000 vmcs revision_id = e kvm_lapic_reset: vcpu=ffffff0a4b93a000, id=5, base_msr= fee00000 PRIx64 base_address=f ee00000 vmcs revision_id = e kvm_lapic_reset: vcpu=ffffff0a71a72000, id=6, base_msr= fee00000 PRIx64 base_address=f ee00000 vmcs revision_id = e kvm_lapic_reset: vcpu=ffffff0a71a6a000, id=7, base_msr= fee00000 PRIx64 base_address=f ee00000 vmcs revision_id = e unhandled wrmsr: 0x1010101 data fffffd7fffdfe9e0 unhandled wrmsr: 0x1010101 data fffffd7fffdfe9e0 unhandled wrmsr: 0x1010101 data fffffd7fffdfe9e0 unhandled wrmsr: 0x1010101 data fffffd7fffdfe9e0 unhandled wrmsr: 0x1010101 data fffffd7fffdfe9e0 unhandled wrmsr: 0x1010101 data fffffd7fffdfe9e0 unhandled wrmsr: 0x1010101 data fffffd7fffdfe9e0 unhandled wrmsr: 0x1010101 data fffffd7fffdfe9e0 unhandled wrmsr: 0xff3cfdac data fffffd7fffdfe9b0 unhandled wrmsr: 0xff3cfdac data fffffd7fffdfe9b0 unhandled wrmsr: 0xff3cfdac data fffffd7fffdfe9b0 unhandled wrmsr: 0xff3cfdac data fffffd7fffdfe9b0 unhandled wrmsr: 0xff3cfdac data fffffd7fffdfe9b0 unhandled wrmsr: 0xff3cfdac data fffffd7fffdfe9b0 unhandled wrmsr: 0xff3cfdac data fffffd7fffdfe9b0 unhandled wrmsr: 0xff3cfdac data fffffd7fffdfe9b0 unhandled wrmsr: 0x0 data 0 unhandled wrmsr: 0x0 data 0 unhandled wrmsr: 0x0 data 0 unhandled wrmsr: 0x0 data 0 unhandled wrmsr: 0x0 data 0 unhandled wrmsr: 0x0 data 0 unhandled wrmsr: 0x0 data 0 unhandled wrmsr: 0x0 data 0 vcpu 1 received sipi with vector # 10 vcpu 2 received sipi with vector # 10 kvm_lapic_reset: vcpu=ffffff0a38b5a000, id=1, base_msr= fee00800 PRIx64 base_address=f ee00000 kvm_lapic_reset: vcpu=ffffff0a38b62000, id=2, base_msr= fee00800 PRIx64 base_address=f ee00000 vcpu 5 received sipi with vector # 10 vcpu 6 received sipi with vector # 10 kvm_lapic_reset: vcpu=ffffff0a4b93a000, id=5, base_msr= fee00800 PRIx64 base_address=f ee00000 kvm_lapic_reset: vcpu=ffffff0a71a72000, id=6, base_msr= fee00800 PRIx64 base_address=f ee00000 vcpu 4 received sipi with vector # 10 vcpu 7 received sipi with vector # 10 kvm_lapic_reset: vcpu=ffffff0a4b942000, id=4, base_msr= fee00800 PRIx64 base_address=f ee00000 vcpu 3 received sipi with vector # 10 kvm_lapic_reset: vcpu=ffffff0a71a6a000, id=7, base_msr= fee00800 PRIx64 base_address=f ee00000 kvm_lapic_reset: vcpu=ffffff0a384e9000, id=3, base_msr= fee00800 PRIx64 base_address=f ee00000 unhandled wrmsr: 0x0 data 0 NOTICE: e1000g3 link down NOTICE: vnic1000 link down NOTICE: e1000g3 link up, 100 Mbps, full duplex NOTICE: vnic1000 link up, 100 Mbps, unknown duplex NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major panic[cpu4]/thread=ffffff004616fc40: pcieb-0: PCI(-X) Express Fatal Error. 
(0x145) ffffff004616fb70 pcieb:pcieb_intr_handler+1c9 () ffffff004616fbe0 unix:av_dispatch_autovect+95 () ffffff004616fc20 unix:dispatch_hardint+36 () ffffff0047e9d0a0 unix:switch_sp_and_call+13 () ffffff0047e9d100 unix:do_interrupt+a8 () ffffff0047e9d110 unix:cmnint+ba () ffffff0047e9d250 unix:htable_lookup+73 () ffffff0047e9d2d0 unix:htable_getpte+58 () ffffff0047e9d320 unix:htable_getpage+30 () ffffff0047e9d380 unix:hat_getpfnum+71 () ffffff0047e9d3a0 kvm:kvm_va2pa+1b () ffffff0047e9d400 kvm:mmu_alloc_roots+aa () ffffff0047e9d420 kvm:kvm_mmu_load+40 () ffffff0047e9d430 kvm:kvm_mmu_reload+18 () ffffff0047e9d460 kvm:vcpu_enter_guest+68 () ffffff0047e9d4a0 kvm:__vcpu_run+8b () ffffff0047e9d4e0 kvm:kvm_arch_vcpu_ioctl_run+112 () ffffff0047e9dcc0 kvm:kvm_ioctl+466 () ffffff0047e9dd00 genunix:cdev_ioctl+39 () ffffff0047e9dd50 specfs:spec_ioctl+60 () ffffff0047e9dde0 genunix:fop_ioctl+55 () ffffff0047e9df00 genunix:ioctl+9b () ffffff0047e9df10 unix:brand_sys_syscall+1f5 () syncing file systems... done dumping to /dev/zvol/dsk/mainpool/dump, offset 65536, content: kernel Best regards from/Med v?nliga h?lsningar fr?n Johan Kragsterman Capvert _______________________________________________ OmniOS-discuss mailing list OmniOS-discuss at lists.omniti.com http://lists.omniti.com/mailman/listinfo/omnios-discuss From danmcd at omniti.com Mon May 12 13:41:36 2014 From: danmcd at omniti.com (Dan McDonald) Date: Mon, 12 May 2014 09:41:36 -0400 Subject: [OmniOS-discuss] Ang: fmdump help? In-Reply-To: References: Message-ID: <21A096E4-8BC6-4485-9CB3-F857E80FFE92@omniti.com> On May 12, 2014, at 8:46 AM, Johan Kragsterman wrote: > panic message: pcieb-0: PCI(-X) Express Fatal Error. (0x145) That's these flags from pcie_impl.h (viewable from the source, it's not an installed system header file): #define PF_ERR_NO_ERROR (1 << 0) /* No error seen */ #define PF_ERR_NO_PANIC (1 << 2) /* Error should not panic sys */ #define PF_ERR_PANIC (1 << 6) /* Error should panic system */ #define PF_ERR_MATCH_DOM (1 << 9) /* Error Handled By IO domain */ That's a lot of flags set, and all of this flag-setting happens during a fault scan of the PCIe bus (see pcie_fault.c, especially starting with pf_scan_fabric() and its descendants). I'd be inclined to say this is a HW error, especially given your e1000g3 device complained, per here: NOTICE: e1000g3 link down NOTICE: vnic1000 link down NOTICE: e1000g3 link up, 100 Mbps, full duplex NOTICE: vnic1000 link up, 100 Mbps, unknown duplex NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major Dan From johan.kragsterman at capvert.se Mon May 12 15:06:36 2014 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Mon, 12 May 2014 17:06:36 +0200 Subject: [OmniOS-discuss] Ang: fmdump help? In-Reply-To: <21A096E4-8BC6-4485-9CB3-F857E80FFE92@omniti.com> References: <21A096E4-8BC6-4485-9CB3-F857E80FFE92@omniti.com>, Message-ID: Thanks again, Dan! Some more questions further down... -----Dan McDonald skrev: ----- Till: Johan Kragsterman Fr?n: Dan McDonald Datum: 2014-05-12 15:46 Kopia: "OmniOS-discuss at lists.omniti.com" ?rende: Re: [OmniOS-discuss] Ang: fmdump help? On May 12, 2014, at 8:46 AM, Johan Kragsterman wrote: > panic message: pcieb-0: PCI(-X) Express Fatal Error. (0x145) Does this mean it is the PCI-X bus? And/or a device on that bus? It makes sense if so, because the e1000g3 is on an Intel quad port PCI-X adapter on the only PCI-X bus on the system. And I had severe issues with a client connected to that port. 
But could a port issue really crash the system? Wouldn't it be more likely that it is the bus? First step will be that I'll change the connections to that port to another port on the same nic, and see if it'll be some changes. If I still got problems, I'll change the nic to a similar, and if that doesn't help, I put another nic on a PCIe-bus instead. That's these flags from pcie_impl.h (viewable from the source, it's not an installed system header file): #define PF_ERR_NO_ERROR ? ? ? ? (1 << 0) /* No error seen */ #define PF_ERR_NO_PANIC ? ? ? ? (1 << 2) /* Error should not panic sys */ #define PF_ERR_PANIC ? ? ? ? ? ?(1 << 6) /* Error should panic system */ #define PF_ERR_MATCH_DOM ? ? ? ?(1 << 9) /* Error Handled By IO domain */ That's a lot of flags set, and all of this flag-setting happens during a fault scan of the PCIe bus (see pcie_fault.c, especially starting with pf_scan_fabric() and its descendants). I'd be inclined to say this is a HW error, especially given your e1000g3 device complained, per here: NOTICE: e1000g3 link down NOTICE: vnic1000 link down NOTICE: e1000g3 link up, 100 Mbps, full duplex NOTICE: vnic1000 link up, 100 Mbps, unknown duplex NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major Dan Rgrds Johan From danmcd at omniti.com Mon May 12 15:10:38 2014 From: danmcd at omniti.com (Dan McDonald) Date: Mon, 12 May 2014 11:10:38 -0400 Subject: [OmniOS-discuss] Ang: fmdump help? In-Reply-To: References: <21A096E4-8BC6-4485-9CB3-F857E80FFE92@omniti.com>, Message-ID: <4BE71163-8256-4D62-BA73-66A2495340F4@omniti.com> On May 12, 2014, at 11:06 AM, Johan Kragsterman wrote: > Thanks again, Dan! > > > Some more questions further down... > > > > > Does this mean it is the PCI-X bus? And/or a device on that bus? It makes sense if so, because the e1000g3 is on an Intel quad port PCI-X adapter on the only PCI-X bus on the system. And I had severe issues with a client connected to that port. But could a port issue really crash the system? Wouldn't it be more likely that it is the bus? The error message originates from the pcieb (PCI-E bus controller): 161 fffffffff8077000 4440 228 1 pcieb (PCIe bridge/switch driver) and yes it's likely the bus, as that message/panic happens after a bus scan. I indicated e1000g3 so you could maybe see if the slot it was in was bad. > First step will be that I'll change the connections to that port to another port on the same nic, and see if it'll be some changes. > > If I still got problems, I'll change the nic to a similar, and if that doesn't help, I put another nic on a PCIe-bus instead. > That's what I'd do. Dan From johan.kragsterman at capvert.se Mon May 12 15:33:53 2014 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Mon, 12 May 2014 17:33:53 +0200 Subject: [OmniOS-discuss] Ang: fmdump help? In-Reply-To: <4BE71163-8256-4D62-BA73-66A2495340F4@omniti.com> References: <4BE71163-8256-4D62-BA73-66A2495340F4@omniti.com>, <21A096E4-8BC6-4485-9CB3-F857E80FFE92@omniti.com>, Message-ID: -----Dan McDonald skrev: ----- Till: Johan Kragsterman Fr?n: Dan McDonald Datum: 2014-05-12 17:15 Kopia: "OmniOS-discuss at lists.omniti.com" ?rende: Re: [OmniOS-discuss] Ang: fmdump help? On May 12, 2014, at 11:06 AM, Johan Kragsterman wrote: > Thanks again, Dan! > > > Some more questions further down... > > > > > Does this mean it is the PCI-X bus? And/or a device on that bus? It makes sense if so, because the e1000g3 is on an Intel quad port PCI-X adapter on the only PCI-X bus on the system. 
And I had severe issues with a client connected to that port. But could a port issue really crash the system? Wouldn't it be more likely that it is the bus? The error message originates from the pcieb (PCI-E bus controller): 161 fffffffff8077000 ? 4440 228 ? 1 ?pcieb (PCIe bridge/switch driver) and yes it's likely the bus, as that message/panic happens after a bus scan. ?I indicated e1000g3 so you could maybe see if the slot it was in was bad. > First step will be that I'll change the connections to that port to another port on the same nic, and see if it'll be some changes. > > If I still got problems, I'll change the nic to a similar, and if that doesn't help, I put another nic on a PCIe-bus instead. > That's what I'd do. Dan The nic is on a PCI-X bus, not a PCIe bus. All nic ports on the system are on that PCI-X nic. No nic on PCIe. Does that mean that the e1000g3 had nothing to do with the problem? And that the problem must be on a PCIe bus/device? If so, I can rule out the nic. And concentrate on other devices/buses. The only adapters that are in PCIe slot/buses are the SAS controller and the graphics adapter. Or perhaps the integrated SATA controller as well is on a PCIe bus... I actually got two more of these T5500, so I could easily switch to another one, if I needed that. From danmcd at omniti.com Mon May 12 15:38:34 2014 From: danmcd at omniti.com (Dan McDonald) Date: Mon, 12 May 2014 11:38:34 -0400 Subject: [OmniOS-discuss] Ang: fmdump help? In-Reply-To: References: <4BE71163-8256-4D62-BA73-66A2495340F4@omniti.com>, <21A096E4-8BC6-4485-9CB3-F857E80FFE92@omniti.com>, Message-ID: <5EEBB8AD-17E2-40D5-8F5C-03C8E31B3710@omniti.com> I'm not sure if that code is common to PCI-X as well. After all, the printf message mentions PCI-X (but maybe as a typo)? And interrupts from PCI-X may still sabotage PCIe. I'd continue to focus on that NIC for starters (and save the dumps if you've the disk space). Dan From johan.kragsterman at capvert.se Mon May 12 16:52:38 2014 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Mon, 12 May 2014 18:52:38 +0200 Subject: [OmniOS-discuss] differencies in booting process for KVM VM's Message-ID: Hi! Just have a question about why it looks so different on the consol when booting different KVM VM's? When I boot a pfsense VM, with single socket single core dual thread 2 G memory 4 vnics, it shows this: root at omni:/# /usr/bin/vmpfsense.sh qemu-system-x86_64: -net vnic,vlan=0,name=net0,ifname=pfwan0,macaddr=2:8:20:62:62:61: vnic dhcp disabled qemu-system-x86_64: -net vnic,vlan=1,name=net1,ifname=pflan0,macaddr=2:8:20:a4:87:27: vnic dhcp disabled qemu-system-x86_64: -net vnic,vlan=2,name=net2,ifname=pftlout0,macaddr=2:8:20:a2:d6:6d: vnic dhcp disabled qemu-system-x86_64: -net vnic,vlan=3,name=net3,ifname=pftlin0,macaddr=2:8:20:3f:d2:7f: vnic dhcp disabled Started VM: PfSense2.13 VNC available at: host IP 127.0.0.1 192.168.255.8 0.0.0.0 0.0.0.0 0.0.0.0 0.0.0.0 0.0.0.0 0.0.0.0 ::1/128 ::/0 ::/0 ::/0 ::/0 ::/0 ::/0 ::/0 port 5902 QEMU Monitor, do: # telnet localhost 7002. Note: use Control ] to exit monitor before quit! root at omni:/# And it stops with the promt. 
But when I boot a ubuntu server with 2 sockets, 2 cores, two threads, 16 GB mem, 3 vnic, it shows this: root at omni:/# /usr/bin/vmedubuntu.sh qemu-system-x86_64: -net vnic,vlan=0,name=net0,ifname=ltsp0,macaddr=2:8:20:15:30:bc: vnic dhcp disabled qemu-system-x86_64: -net vnic,vlan=1,name=net1,ifname=ltsp1,macaddr=2:8:20:83:d2:c3: vnic dhcp disabled qemu-system-x86_64: -net vnic,vlan=2,name=net2,ifname=ltsp2,macaddr=2:8:20:ec:a4:57: vnic dhcp disabled Start bios (version 0.6.1.2-20110201_165504-titi) Ram Size=0xe0000000 (0x0000000320000000 high) CPU Mhz=2261 PCI: pci_bios_init_bus_rec bus = 0x0 PIIX3/PIIX4 init: elcr=00 0c PCI: bus=0 devfn=0x00: vendor_id=0x8086 device_id=0x1237 PCI: bus=0 devfn=0x08: vendor_id=0x8086 device_id=0x7000 PCI: bus=0 devfn=0x09: vendor_id=0x8086 device_id=0x7010 region 4: 0x0000c000 PCI: bus=0 devfn=0x0b: vendor_id=0x8086 device_id=0x7113 PCI: bus=0 devfn=0x10: vendor_id=0x1013 device_id=0x00b8 region 0: 0xf0000000 region 1: 0xf2000000 region 6: 0xf2010000 PCI: bus=0 devfn=0x18: vendor_id=0x1af4 device_id=0x1000 region 0: 0x0000c020 region 1: 0xf2020000 region 6: 0xf2030000 PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1000 region 0: 0x0000c040 region 1: 0xf2040000 region 6: 0xf2050000 PCI: bus=0 devfn=0x28: vendor_id=0x1af4 device_id=0x1000 region 0: 0x0000c060 region 1: 0xf2060000 region 6: 0xf2070000 PCI: bus=0 devfn=0x30: vendor_id=0x1af4 device_id=0x1001 region 0: 0x0000c080 region 1: 0xf2080000 Found 8 cpu(s) max supported 8 cpu(s) MP table addr=0x000fdbd0 MPC table addr=0x000fdbe0 size=260 SMBIOS ptr=0x000fdbb0 table=0xdffffd90 ACPI tables: RSDP=0x000fdb80 RSDT=0xdfffd810 Scan for VGA option rom Running option rom at c000:0003 VGABios $Id$ Turning on vga text mode console SeaBIOS (version 0.6.1.2-20110201_165504-titi) Found 1 lpt ports Found 1 serial ports ATA controller 0 at 1f0/3f4/0 (irq 14 dev 9) ATA controller 1 at 170/374/0 (irq 15 dev 9) found virtio-blk at 0:6 ebda moved from 9fc00 to 9dc00 drive 0x000fdb30: PCHS=16383/16/63 translation=lba LCHS=1024/255/63 s=838860800 ata1-0: QEMU DVD-ROM ATAPI-4 DVD/CD PS2 keyboard initialized All threads complete. Scan for option roms Running option rom at c900:0003 pnp call arg1=60 pmm call arg1=0 pmm call arg1=2 pmm call arg1=0 Running option rom at c980:0003 pnp call arg1=60 pmm call arg1=0 pmm call arg1=2 pmm call arg1=0 pmm call arg1=2 pmm call arg1=0 Running option rom at ca00:0003 pnp call arg1=60 pmm call arg1=0 pmm call arg1=2 pmm call arg1=0 pmm call arg1=2 pmm call arg1=0 Running option rom at ca80:0003 Returned 53248 bytes of ZoneHigh e820 map has 8 items: 0: 0000000000000000 - 000000000009dc00 = 1 1: 000000000009dc00 - 00000000000a0000 = 2 2: 00000000000f0000 - 0000000000100000 = 2 3: 0000000000100000 - 00000000dfffd000 = 1 4: 00000000dfffd000 - 00000000e0000000 = 2 5: 00000000feffc000 - 00000000ff000000 = 2 6: 00000000fffc0000 - 0000000100000000 = 2 7: 0000000100000000 - 0000000420000000 = 1 enter handle_19: NULL Booting from Hard Disk... Booting from 0000:7c00 And it stops without prompt, which means I can do a ctrl-c to stop the process....? Kinda strange differencies, imho... Rgrds Johan From natxo.asenjo at gmail.com Mon May 12 18:27:54 2014 From: natxo.asenjo at gmail.com (Natxo Asenjo) Date: Mon, 12 May 2014 20:27:54 +0200 Subject: [OmniOS-discuss] compiling mediatomb #error "non-amd64 code depends on amd64 privileged header!" 
In-Reply-To: <20140512093722.GA29898@gutsman.lotheac.fi> References: <7CBA1524-82CD-4B2C-929B-E0DFEFF6E15D@omniti.com> <20140512093722.GA29898@gutsman.lotheac.fi> Message-ID: On Mon, May 12, 2014 at 11:37 AM, Lauri Tirkkonen wrote: > On Sun, May 11 2014 21:57:26 +0200, Natxo Asenjo wrote: > > # grep m64 Makefile > > CFLAGS = -g -O2 -m64 > > > > Or should I do it differently? I am not really sure ... > > It depends on the build system of the software you're trying to build. > Your make output before included g++, so it is likely that you need to > add -m64 to CXXFLAGS as well. > > I finally got rid of that error using this ./configure line: ./configure "CFLAGS=-m32" "CXXFLAGS=-m32" "LDFLAGS=-m32" --exclude-youtube The youbute line was adding the amd64 libraries for curl Unfortunately I got some more make errors that like mediatomb related. So I went the easy way and tried serviio (a java based server) and it works out of the box following their instructions http://wiki.serviio.org/doku.php?id=howto:solaris:install) and using ffmpeg from csw (I already had java for crashplan). So far so good. Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From danmcd at omniti.com Mon May 12 18:39:24 2014 From: danmcd at omniti.com (Dan McDonald) Date: Mon, 12 May 2014 14:39:24 -0400 Subject: [OmniOS-discuss] compiling mediatomb #error "non-amd64 code depends on amd64 privileged header!" In-Reply-To: References: <7CBA1524-82CD-4B2C-929B-E0DFEFF6E15D@omniti.com> <20140512093722.GA29898@gutsman.lotheac.fi> Message-ID: On May 12, 2014, at 2:27 PM, Natxo Asenjo wrote: > > On Mon, May 12, 2014 at 11:37 AM, Lauri Tirkkonen wrote: > On Sun, May 11 2014 21:57:26 +0200, Natxo Asenjo wrote: > > # grep m64 Makefile > > CFLAGS = -g -O2 -m64 > > > > Or should I do it differently? I am not really sure ... > > It depends on the build system of the software you're trying to build. > Your make output before included g++, so it is likely that you need to > add -m64 to CXXFLAGS as well. > > > I finally got rid of that error using this ./configure line: > > ./configure "CFLAGS=-m32" "CXXFLAGS=-m32" "LDFLAGS=-m32" --exclude-youtube Hmm. You're building a 32-bit binary now. Do you want that? And if you swap all of those -m32s for -m64, does it still fail? Curious, Dan From natxo.asenjo at gmail.com Mon May 12 19:47:21 2014 From: natxo.asenjo at gmail.com (Natxo Asenjo) Date: Mon, 12 May 2014 21:47:21 +0200 Subject: [OmniOS-discuss] compiling mediatomb #error "non-amd64 code depends on amd64 privileged header!" In-Reply-To: References: <7CBA1524-82CD-4B2C-929B-E0DFEFF6E15D@omniti.com> <20140512093722.GA29898@gutsman.lotheac.fi> Message-ID: -- Groeten, natxo On Mon, May 12, 2014 at 8:39 PM, Dan McDonald wrote: > > On May 12, 2014, at 2:27 PM, Natxo Asenjo wrote: > > > > > On Mon, May 12, 2014 at 11:37 AM, Lauri Tirkkonen > wrote: > > On Sun, May 11 2014 21:57:26 +0200, Natxo Asenjo wrote: > > > # grep m64 Makefile > > > CFLAGS = -g -O2 -m64 > > > > > > Or should I do it differently? I am not really sure ... > > > > It depends on the build system of the software you're trying to build. > > Your make output before included g++, so it is likely that you need to > > add -m64 to CXXFLAGS as well. > > > > > > I finally got rid of that error using this ./configure line: > > > > ./configure "CFLAGS=-m32" "CXXFLAGS=-m32" "LDFLAGS=-m32" > --exclude-youtube > > Hmm. You're building a 32-bit binary now. Do you want that? > > And if you swap all of those -m32s for -m64, does it still fail? 
> > Curious, > Dan > > then I get the 'mediatomb' errors I mentioned earlier. It starts with this one: In file included from ../src/zmm/zmm.h:37:0, from ../src/zmmf/zmmf.h:35, from ../src/autoscan.h:36, from ../src/autoscan.cc:36: ../src/zmm/object.h:51:32: error: declaration of 'operator new' as non-function static void* operator new (size_t size); ^ ../src/zmm/object.h:51:27: error: expected ';' at end of member declaration static void* operator new (size_t size); ^ ../src/zmm/object.h:51:39: error: expected ')' before 'size' static void* operator new (size_t size); ^ make[2]: *** [libmediatomb_a-autoscan.o] Error 1 for which I found this patch: http://sourceforge.net/p/mediatomb/patches/25/ After patching src/zmm/object.h I get this one: In file included from ../src/hash.h:47:0, from ../src/storage.h:40, from ../src/content_manager.h:36, from ../src/content_manager.cc:45: ../src/hash/dbr_hash.h: In instantiation of 'bool DBRHash::remove(KT) [with KT = int]': ../src/content_manager.cc:904:42: required from here ../src/hash/dbr_hash.h:127:32: error: 'search' was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive] if (! search(key, &slot)) ^ ../src/hash/dbr_hash.h:127:32: note: declarations in dependent base 'DHashBase >' are not found by unqualified lookup ../src/hash/dbr_hash.h:127:32: note: use 'this->search' instead ../src/hash/dbr_hash.h:137:51: error: 'search' was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive] if (! search(data_array[array_slot], &slot)) ^ ../src/hash/dbr_hash.h:137:51: note: declarations in dependent base 'DHashBase >' are not found by unqualified lookup ../src/hash/dbr_hash.h:137:51: note: use 'this->search' instead make[2]: *** [libmediatomb_a-content_manager.o] Error 1 And I gave up, serviio has been running for a few hours and it looks great. I might just buy the pro edition. -- groet, natxo -------------- next part -------------- An HTML attachment was scrubbed... URL: From timbrown at muskegonisd.org Mon May 12 20:41:18 2014 From: timbrown at muskegonisd.org (Tim Brown) Date: Mon, 12 May 2014 16:41:18 -0400 Subject: [OmniOS-discuss] ZFS and Usage-creep Message-ID: I have a question about ZFS usage and how to predictability allocate space. I have scoured the web trying to get a good answer, but have yet to find one. I am advertising 2TB datastores to our VMware cluster over fiber channel using comstar. I use this command to create the dataset. zfs create -V 2047g vmpool01/datastores/ds01 It all works great but some of my datasets are using far more that the 2047g(more than double in one case). Here are some examples: zfs list NAME USED AVAIL REFER MOUNTPOINT ... vmpool01 27.4T 20.7T 469K /vmpool01 vmpool01/datastores 27.4T 20.7T 384K /vmpool01/datastores vmpool01/datastores/ds01 3.10T 20.7T 3.10T - vmpool01/datastores/ds02 2.06T 21.4T 1.34T - vmpool01/datastores/ds03 2.69T 20.7T 2.69T - vmpool01/datastores/ds04 2.49T 20.7T 2.49T - vmpool01/datastores/ds05 3.69T 20.7T 3.69T - vmpool01/datastores/ds06 4.67T 20.7T 4.67T - vmpool01/datastores/ds07 2.47T 20.7T 2.47T - vmpool01/datastores/ds08 2.06T 20.8T 1.92T - ... Can someone explain this to me or is there a document somewhere that can tell me how to predict the usage? Thanks. - Tim -- Tim Brown Network System Manager Muskegon Area ISD http://www.muskegonisd.org 231-767-7237 Always be yourself. Unless you can be a pirate. Then always be a pirate. 
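For the usage question above, the usedby* breakdown that explains most of this is discussed later in the thread; as a quick illustration (using the pool and dataset names from the listing, nothing box-specific), the standard zfs commands below show whether the "extra" space is held by snapshots, by the refreservation of a thick-provisioned zvol, or by raidz/volblocksize allocation overhead:

# zfs list -o space -r vmpool01/datastores
# zfs get volblocksize,refreservation,usedbydataset,usedbysnapshots,usedbyrefreservation \
      vmpool01/datastores/ds01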
From turbo124 at gmail.com Mon May 12 22:13:53 2014 From: turbo124 at gmail.com (David Bomba) Date: Tue, 13 May 2014 08:13:53 +1000 Subject: [OmniOS-discuss] Comstar Disconnects under high load. Message-ID: <9F8E3465-C32F-459A-AA49-8FCFA841E078@gmail.com> Hi guys, We have ~ 10 OmniOS powered ZFS storage arrays used to drive Virtual Machines under XenServer + VMWare using Infiniband interconnect. Our usual recipe is to use either LSI HBA or Areca Cards in pass through mode using internal drives SAS drives.. This has worked flawlessly with Omnios 6/8. Recently we deployed a slightly different configuration HP DL380 G6 64GB ram X5650 proc LSI 9208-e card HP MDS 600 / SSA 70 external enclosure 30 TOSHIBA-MK2001TRKB-1001-1.82TB SAS2 drives in mirrored configuration. despite the following message in dmesg the array appeared to be working as expected scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,340f at 8/pci1000,30b0 at 0 (mpt_sas1): May 13 04:01:07 s6 Log info 0x31140000 received for target 11. Despite this message we pushed into production and whilst the performance of the array has been good, as soon as we perform high write IO performance goes from 22k IOPS down to 100IOPS, this causes the target to disconnect from hypervisors and general mayhem ensues for the VMs.\ During this period where performance degrades, there are no other messages coming into dmesg. Where should we begin to debug this? Could this be a symptom of not enough RAM? We have flashed the LSI cards to the latest firmware with no change in performance. Thanks in advance! From danmcd at omniti.com Mon May 12 22:23:55 2014 From: danmcd at omniti.com (Dan McDonald) Date: Mon, 12 May 2014 18:23:55 -0400 Subject: [OmniOS-discuss] Comstar Disconnects under high load. In-Reply-To: <9F8E3465-C32F-459A-AA49-8FCFA841E078@gmail.com> References: <9F8E3465-C32F-459A-AA49-8FCFA841E078@gmail.com> Message-ID: <91233CBD-6429-49D7-BAAE-6AE5C014E42B@omniti.com> On May 12, 2014, at 6:13 PM, David Bomba wrote: > Hi guys, > > We have ~ 10 OmniOS powered ZFS storage arrays used to drive Virtual Machines under XenServer + VMWare using Infiniband interconnect. > > Our usual recipe is to use either LSI HBA or Areca Cards in pass through mode using internal drives SAS drives.. > > This has worked flawlessly with Omnios 6/8. > > Recently we deployed a slightly different configuration > > HP DL380 G6 > 64GB ram > X5650 proc > LSI 9208-e card > HP MDS 600 / SSA 70 external enclosure > 30 TOSHIBA-MK2001TRKB-1001-1.82TB SAS2 drives in mirrored configuration. 1.) What was your previous configuration? And running 006 or 008? 2.) What is running on your new HW? 006 or 008? Dan From turbo124 at gmail.com Mon May 12 22:34:43 2014 From: turbo124 at gmail.com (David Bomba) Date: Tue, 13 May 2014 08:34:43 +1000 Subject: [OmniOS-discuss] Comstar Disconnects under high load. In-Reply-To: <91233CBD-6429-49D7-BAAE-6AE5C014E42B@omniti.com> References: <9F8E3465-C32F-459A-AA49-8FCFA841E078@gmail.com> <91233CBD-6429-49D7-BAAE-6AE5C014E42B@omniti.com> Message-ID: <4293D0D4-F57B-45BD-A5F0-4F44F9B2D12E@gmail.com> Previous configurations were HP DL180 G6 96GB RAM Areca 1882-ix using passthrough 25x600Gb Toshiba MBF2600RC SAS 10k drives in mirrored config L5640 Proc We have 4 of these using a mixture of 006 and 008 New hardware runs 008. 
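As a general starting point for the "where should we begin to debug this" question above (a rough sketch of stock illumos commands, not specific to this setup), capturing per-device latency and FMA error telemetry while the stall is actually happening usually narrows things down quickly:

# iostat -xzn 1            # per-disk service times and queue depths during the stall
# fmdump -eV | tail -60    # recent FMA ereports (transport resets, retries, timeouts)
# echo "::stacks -m mpt_sas" | mdb -k    # kernel threads blocked in the HBA driver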
On 13/05/2014, at 8:23 AM, Dan McDonald wrote: > > On May 12, 2014, at 6:13 PM, David Bomba wrote: > >> Hi guys, >> >> We have ~ 10 OmniOS powered ZFS storage arrays used to drive Virtual Machines under XenServer + VMWare using Infiniband interconnect. >> >> Our usual recipe is to use either LSI HBA or Areca Cards in pass through mode using internal drives SAS drives.. >> >> This has worked flawlessly with Omnios 6/8. >> >> Recently we deployed a slightly different configuration >> >> HP DL380 G6 >> 64GB ram >> X5650 proc >> LSI 9208-e card >> HP MDS 600 / SSA 70 external enclosure >> 30 TOSHIBA-MK2001TRKB-1001-1.82TB SAS2 drives in mirrored configuration. > > 1.) What was your previous configuration? And running 006 or 008? > > 2.) What is running on your new HW? 006 or 008? > > Dan > From danmcd at omniti.com Mon May 12 22:41:27 2014 From: danmcd at omniti.com (Dan McDonald) Date: Mon, 12 May 2014 18:41:27 -0400 Subject: [OmniOS-discuss] Comstar Disconnects under high load. In-Reply-To: <4293D0D4-F57B-45BD-A5F0-4F44F9B2D12E@gmail.com> References: <9F8E3465-C32F-459A-AA49-8FCFA841E078@gmail.com> <91233CBD-6429-49D7-BAAE-6AE5C014E42B@omniti.com> <4293D0D4-F57B-45BD-A5F0-4F44F9B2D12E@gmail.com> Message-ID: <196DE647-59D7-4D1C-A710-DFEAD4B43AFE@omniti.com> On May 12, 2014, at 6:34 PM, David Bomba wrote: > Previous configurations were > > HP DL180 G6 > 96GB RAM > Areca 1882-ix using passthrough > 25x600Gb Toshiba MBF2600RC SAS 10k drives in mirrored config > L5640 Proc > > We have 4 of these using a mixture of 006 and 008 > > New hardware runs 008. Hmmm, and the same kind of load works on your older boxes?!? One other thing --> 010 is out now. You may wish to upgrade your newest HW to our newest stable release. Dan From turbo124 at gmail.com Mon May 12 22:43:41 2014 From: turbo124 at gmail.com (David Bomba) Date: Tue, 13 May 2014 08:43:41 +1000 Subject: [OmniOS-discuss] Comstar Disconnects under high load. In-Reply-To: <196DE647-59D7-4D1C-A710-DFEAD4B43AFE@omniti.com> References: <9F8E3465-C32F-459A-AA49-8FCFA841E078@gmail.com> <91233CBD-6429-49D7-BAAE-6AE5C014E42B@omniti.com> <4293D0D4-F57B-45BD-A5F0-4F44F9B2D12E@gmail.com> <196DE647-59D7-4D1C-A710-DFEAD4B43AFE@omniti.com> Message-ID: Out of ~ 10 storage boxes, this is the first time we have seen this pop up. And yes, workloads are identical pretty much. I'll upgrade to 010 and see if that makes a difference. Has there been any changes to mpt_sas driver in 010? Dave On 13/05/2014, at 8:41 AM, Dan McDonald wrote: > > On May 12, 2014, at 6:34 PM, David Bomba wrote: > >> Previous configurations were >> >> HP DL180 G6 >> 96GB RAM >> Areca 1882-ix using passthrough >> 25x600Gb Toshiba MBF2600RC SAS 10k drives in mirrored config >> L5640 Proc >> >> We have 4 of these using a mixture of 006 and 008 >> >> New hardware runs 008. > > Hmmm, and the same kind of load works on your older boxes?!? > > One other thing --> 010 is out now. You may wish to upgrade your newest HW to our newest stable release. > > Dan > From narayan.desai at gmail.com Mon May 12 23:32:40 2014 From: narayan.desai at gmail.com (Narayan Desai) Date: Mon, 12 May 2014 18:32:40 -0500 Subject: [OmniOS-discuss] Comstar Disconnects under high load. In-Reply-To: <9F8E3465-C32F-459A-AA49-8FCFA841E078@gmail.com> References: <9F8E3465-C32F-459A-AA49-8FCFA841E078@gmail.com> Message-ID: Are you perchance using iscsi/iSER? We've seen similar timeouts that don't seem to correspond to hardware issues. 
From what we can tell, something causes iscsi heartbeats not to be processed, so the client eventually times out the block device and tries to reinitialize it. In our case, we're running VMs using KVM on linux hosts. The guest detects block device death, and won't recover without a reboot. FWIW, switching to iscsi directly over IPoIB works great for identical workloads. We've seen this with 151006 and I think 151008. We've not yet tried it with 151010. This smells like some problem in comstar's iscsi/iser driver. -nld On Mon, May 12, 2014 at 5:13 PM, David Bomba wrote: > Hi guys, > > We have ~ 10 OmniOS powered ZFS storage arrays used to drive Virtual > Machines under XenServer + VMWare using Infiniband interconnect. > > Our usual recipe is to use either LSI HBA or Areca Cards in pass through > mode using internal drives SAS drives.. > > This has worked flawlessly with Omnios 6/8. > > Recently we deployed a slightly different configuration > > HP DL380 G6 > 64GB ram > X5650 proc > LSI 9208-e card > HP MDS 600 / SSA 70 external enclosure > 30 TOSHIBA-MK2001TRKB-1001-1.82TB SAS2 drives in mirrored configuration. > > despite the following message in dmesg the array appeared to be working as > expected > > scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,340f at 8/pci1000,30b0 at 0(mpt_sas1): > May 13 04:01:07 s6 Log info 0x31140000 received for target 11. > > Despite this message we pushed into production and whilst the performance > of the array has been good, as soon as we perform high write IO performance > goes from 22k IOPS down to 100IOPS, this causes the target to disconnect > from hypervisors and general mayhem ensues for the VMs.\ > > During this period where performance degrades, there are no other messages > coming into dmesg. > > Where should we begin to debug this? Could this be a symptom of not enough > RAM? We have flashed the LSI cards to the latest firmware with no change in > performance. > > Thanks in advance! > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From turbo124 at gmail.com Mon May 12 23:41:45 2014 From: turbo124 at gmail.com (David Bomba) Date: Tue, 13 May 2014 09:41:45 +1000 Subject: [OmniOS-discuss] Comstar Disconnects under high load. In-Reply-To: References: <9F8E3465-C32F-459A-AA49-8FCFA841E078@gmail.com> Message-ID: Hi Narayan, We do not use iSER. We use SRP for VMWare, and IPoIB for XenServer. In our case, our VMs operate as expected. However when copying data between Storage Repo's that is when we see the disconnects irrespective of SCSI transport. On 13 May 2014 09:32, Narayan Desai wrote: > Are you perchance using iscsi/iSER? We've seen similar timeouts that don't > seem to correspond to hardware issues. From what we can tell, something > causes iscsi heartbeats not to be processed, so the client eventually times > out the block device and tries to reinitialize it. > > In our case, we're running VMs using KVM on linux hosts. The guest detects > block device death, and won't recover without a reboot. > > FWIW, switching to iscsi directly over IPoIB works great for identical > workloads. We've seen this with 151006 and I think 151008. We've not yet > tried it with 151010. This smells like some problem in comstar's iscsi/iser > driver. 
> -nld > > > On Mon, May 12, 2014 at 5:13 PM, David Bomba wrote: > >> Hi guys, >> >> We have ~ 10 OmniOS powered ZFS storage arrays used to drive Virtual >> Machines under XenServer + VMWare using Infiniband interconnect. >> >> Our usual recipe is to use either LSI HBA or Areca Cards in pass through >> mode using internal drives SAS drives.. >> >> This has worked flawlessly with Omnios 6/8. >> >> Recently we deployed a slightly different configuration >> >> HP DL380 G6 >> 64GB ram >> X5650 proc >> LSI 9208-e card >> HP MDS 600 / SSA 70 external enclosure >> 30 TOSHIBA-MK2001TRKB-1001-1.82TB SAS2 drives in mirrored configuration. >> >> despite the following message in dmesg the array appeared to be working >> as expected >> >> scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,340f at 8/pci1000,30b0 at 0(mpt_sas1): >> May 13 04:01:07 s6 Log info 0x31140000 received for target 11. >> >> Despite this message we pushed into production and whilst the performance >> of the array has been good, as soon as we perform high write IO performance >> goes from 22k IOPS down to 100IOPS, this causes the target to disconnect >> from hypervisors and general mayhem ensues for the VMs.\ >> >> During this period where performance degrades, there are no other >> messages coming into dmesg. >> >> Where should we begin to debug this? Could this be a symptom of not >> enough RAM? We have flashed the LSI cards to the latest firmware with no >> change in performance. >> >> Thanks in advance! >> _______________________________________________ >> OmniOS-discuss mailing list >> OmniOS-discuss at lists.omniti.com >> http://lists.omniti.com/mailman/listinfo/omnios-discuss >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mir at miras.org Tue May 13 00:08:49 2014 From: mir at miras.org (Michael Rasmussen) Date: Tue, 13 May 2014 02:08:49 +0200 Subject: [OmniOS-discuss] Comstar Disconnects under high load. In-Reply-To: References: <9F8E3465-C32F-459A-AA49-8FCFA841E078@gmail.com> <91233CBD-6429-49D7-BAAE-6AE5C014E42B@omniti.com> <4293D0D4-F57B-45BD-A5F0-4F44F9B2D12E@gmail.com> <196DE647-59D7-4D1C-A710-DFEAD4B43AFE@omniti.com> Message-ID: <20140513020849.06c9d5ca@sleipner.datanom.net> On Tue, 13 May 2014 08:43:41 +1000 David Bomba wrote: > Has there been any changes to mpt_sas driver in 010? > Reliability improvements to the mpt_sas(7D) driver for LSI HBAs. (http://omnios.omniti.com/wiki.php/ReleaseNotes) -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: Never invest your money in anything that eats or needs repainting. -- Billy Rose -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From narayan.desai at gmail.com Tue May 13 01:29:34 2014 From: narayan.desai at gmail.com (Narayan Desai) Date: Mon, 12 May 2014 20:29:34 -0500 Subject: [OmniOS-discuss] Comstar Disconnects under high load. In-Reply-To: References: <9F8E3465-C32F-459A-AA49-8FCFA841E078@gmail.com> Message-ID: Hm, how clean is your fabric? Any errors, deadlocks, etc? -nld On Mon, May 12, 2014 at 6:41 PM, David Bomba wrote: > Hi Narayan, > > We do not use iSER. 
> > We use SRP for VMWare, and IPoIB for XenServer. > > In our case, our VMs operate as expected. However when copying data > between Storage Repo's that is when we see the disconnects irrespective of > SCSI transport. > > > On 13 May 2014 09:32, Narayan Desai wrote: > >> Are you perchance using iscsi/iSER? We've seen similar timeouts that >> don't seem to correspond to hardware issues. From what we can tell, >> something causes iscsi heartbeats not to be processed, so the client >> eventually times out the block device and tries to reinitialize it. >> >> In our case, we're running VMs using KVM on linux hosts. The guest >> detects block device death, and won't recover without a reboot. >> >> FWIW, switching to iscsi directly over IPoIB works great for identical >> workloads. We've seen this with 151006 and I think 151008. We've not yet >> tried it with 151010. This smells like some problem in comstar's iscsi/iser >> driver. >> -nld >> >> >> On Mon, May 12, 2014 at 5:13 PM, David Bomba wrote: >> >>> Hi guys, >>> >>> We have ~ 10 OmniOS powered ZFS storage arrays used to drive Virtual >>> Machines under XenServer + VMWare using Infiniband interconnect. >>> >>> Our usual recipe is to use either LSI HBA or Areca Cards in pass through >>> mode using internal drives SAS drives.. >>> >>> This has worked flawlessly with Omnios 6/8. >>> >>> Recently we deployed a slightly different configuration >>> >>> HP DL380 G6 >>> 64GB ram >>> X5650 proc >>> LSI 9208-e card >>> HP MDS 600 / SSA 70 external enclosure >>> 30 TOSHIBA-MK2001TRKB-1001-1.82TB SAS2 drives in mirrored configuration. >>> >>> despite the following message in dmesg the array appeared to be working >>> as expected >>> >>> scsi: [ID 365881 kern.info] /pci at 0,0/pci8086,340f at 8/pci1000,30b0 at 0(mpt_sas1): >>> May 13 04:01:07 s6 Log info 0x31140000 received for target 11. >>> >>> Despite this message we pushed into production and whilst the >>> performance of the array has been good, as soon as we perform high write IO >>> performance goes from 22k IOPS down to 100IOPS, this causes the target to >>> disconnect from hypervisors and general mayhem ensues for the VMs.\ >>> >>> During this period where performance degrades, there are no other >>> messages coming into dmesg. >>> >>> Where should we begin to debug this? Could this be a symptom of not >>> enough RAM? We have flashed the LSI cards to the latest firmware with no >>> change in performance. >>> >>> Thanks in advance! >>> _______________________________________________ >>> OmniOS-discuss mailing list >>> OmniOS-discuss at lists.omniti.com >>> http://lists.omniti.com/mailman/listinfo/omnios-discuss >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From johan.kragsterman at capvert.se Tue May 13 06:20:25 2014 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Tue, 13 May 2014 08:20:25 +0200 Subject: [OmniOS-discuss] Comstar Disconnects under high load. In-Reply-To: References: , <9F8E3465-C32F-459A-AA49-8FCFA841E078@gmail.com> <91233CBD-6429-49D7-BAAE-6AE5C014E42B@omniti.com> <4293D0D4-F57B-45BD-A5F0-4F44F9B2D12E@gmail.com> <196DE647-59D7-4D1C-A710-DFEAD4B43AFE@omniti.com> Message-ID: -----"OmniOS-discuss" skrev: ----- Till: Dan McDonald Fr?n: David Bomba S?nt av: "OmniOS-discuss" Datum: 2014-05-13 00:44 Kopia: "OmniOS-discuss at lists.omniti.com" ?rende: Re: [OmniOS-discuss] Comstar Disconnects under high load. Out of ~ 10 storage boxes, this is the first time we have seen this pop up. 
And yes, workloads are identical pretty much. I'll upgrade to 010 and see if that makes a difference. Has there been any changes to mpt_sas driver in 010? Dave Hi, David! Is this setup still in production? If so, do you have the possibility to take it out of production? For me it sounds like a H/W problem, which you'll need to track down with changing/switching things around, like controllers/ports in the jbod, controllers/ports in the server, cables, perhaps backplanes in the jbod, doing zpool export/import with different configs, etc... If you got a spare SAS controller with external ports, or/and a spare server DL180/380 G6(or similar), you can move the jbod quite easily, and in that way track down the problems. Regards Johan On 13/05/2014, at 8:41 AM, Dan McDonald wrote: > > On May 12, 2014, at 6:34 PM, David Bomba wrote: > >> Previous configurations were >> >> HP DL180 G6 >> 96GB RAM >> Areca 1882-ix using passthrough >> 25x600Gb Toshiba MBF2600RC SAS 10k drives in mirrored config >> L5640 Proc >> >> We have 4 of these using a mixture of 006 and 008 >> >> New hardware runs 008. > > Hmmm, and the same kind of load works on your older boxes?!? > > One other thing --> 010 is out now. ?You may wish to upgrade your newest HW to our newest stable release. > > Dan > _______________________________________________ OmniOS-discuss mailing list OmniOS-discuss at lists.omniti.com http://lists.omniti.com/mailman/listinfo/omnios-discuss From johan.kragsterman at capvert.se Tue May 13 08:50:23 2014 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Tue, 13 May 2014 10:50:23 +0200 Subject: [OmniOS-discuss] dladm situation Message-ID: Hi! Got myself in a situation here... I wanted to switch a nic, because I suspected a problem with it. Since I know the dladm phys/link use to remember the old ones, I used dladm delete-phys to remove the old ones before I put in the new one... The nic was an intel quad port, and I put in a similar in the same slot. No phys... No link... Switched to another intel nic in another slot... No phys... No link... And dladm doesn't have any feature for adding a phys... Someone got any advice for me here...except for not doing this again...? Best regards from/Med v?nliga h?lsningar fr?n Johan Kragsterman Capvert From jdg117 at elvis.arl.psu.edu Tue May 13 09:26:56 2014 From: jdg117 at elvis.arl.psu.edu (John D Groenveld) Date: Tue, 13 May 2014 05:26:56 -0400 Subject: [OmniOS-discuss] dladm situation In-Reply-To: Your message of "Tue, 13 May 2014 10:50:23 +0200." References: Message-ID: <201405130926.s4D9QuYp019921@elvis.arl.psu.edu> In message , Johan Kragsterman writes: >The nic was an intel quad port, and I put in a similar in the same slot. > >No phys... >No link... > >Switched to another intel nic in another slot... > >No phys... >No link... Do the new NICs show up in prtconf(1M) -pv? Did you run devfsadm(1M)? John groenveld at acm.org From richard.elling at richardelling.com Tue May 13 12:48:34 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Tue, 13 May 2014 05:48:34 -0700 Subject: [OmniOS-discuss] ZFS and Usage-creep In-Reply-To: References: Message-ID: <0841C295-470E-455A-9AE0-910783BEEDDB@richardelling.com> Hi Tim, On May 12, 2014, at 1:41 PM, Tim Brown wrote: > I have a question about ZFS usage and how to predictability allocate > space. I have scoured the web trying to get a good answer, but have > yet to find one. > > I am advertising 2TB datastores to our VMware cluster over fiber > channel using comstar. 
I use this command to create the dataset. > > zfs create -V 2047g vmpool01/datastores/ds01 [sidebar discussion] By default, the volblocksize is 8k. Depending on the zpool configuration and physical sector size, the allocated space for the data can be larger than you expect. In general, for pools, we talk about size, free, and allocated properties. It is often not easy to correlate logical size of file systems to allocated space in the pool. > > It all works great but some of my datasets are using far more that the > 2047g(more than double in one case). Here are some examples: > > zfs list > NAME USED AVAIL REFER MOUNTPOINT > ... > vmpool01 27.4T 20.7T 469K /vmpool01 > vmpool01/datastores 27.4T 20.7T 384K /vmpool01/datastores > vmpool01/datastores/ds01 3.10T 20.7T 3.10T - > vmpool01/datastores/ds02 2.06T 21.4T 1.34T - > vmpool01/datastores/ds03 2.69T 20.7T 2.69T - > vmpool01/datastores/ds04 2.49T 20.7T 2.49T - > vmpool01/datastores/ds05 3.69T 20.7T 3.69T - > vmpool01/datastores/ds06 4.67T 20.7T 4.67T - > vmpool01/datastores/ds07 2.47T 20.7T 2.47T - > vmpool01/datastores/ds08 2.06T 20.8T 1.92T - > ... > > Can someone explain this to me or is there a document somewhere that > can tell me how to predict the usage? Thanks. Cleverly hidden in the zfs man page :-). See the discussion on used and usedby*. Also, a handy option to zfs list is: zfs list -o space The -o space option breaks out the usedby* to give better visibility. -- richard > > - Tim > -- > > > Tim Brown > Network System Manager > Muskegon Area ISD > http://www.muskegonisd.org > 231-767-7237 > > Always be yourself. > Unless you can be a pirate. > Then always be a pirate. > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss From johan.kragsterman at capvert.se Tue May 13 17:52:41 2014 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Tue, 13 May 2014 19:52:41 +0200 Subject: [OmniOS-discuss] Ang: fmdump help? In-Reply-To: <5EEBB8AD-17E2-40D5-8F5C-03C8E31B3710@omniti.com> References: <5EEBB8AD-17E2-40D5-8F5C-03C8E31B3710@omniti.com>, <4BE71163-8256-4D62-BA73-66A2495340F4@omniti.com>, <21A096E4-8BC6-4485-9CB3-F857E80FFE92@omniti.com>, Message-ID: An HTML attachment was scrubbed... URL: From danmcd at omniti.com Thu May 15 01:04:01 2014 From: danmcd at omniti.com (Dan McDonald) Date: Wed, 14 May 2014 21:04:01 -0400 Subject: [OmniOS-discuss] The OmniOS "bloody" repo - now updated! Message-ID: <1276BF18-A86C-4275-897A-9C3F6BA4C18E@omniti.com> I've uploaded new bits (r151011) to the "bloody" repo, and I've made available ISO and USB images. I've installed with the ISO successfully on a VM. You can visit the "Get OmniOS" page: http://omnios.omniti.com/wiki.php/Installation and see the links. Remember the difference between bloody and stable: http://omnios.omniti.com/wiki.php/StableVsBloody If you've still a bloody machine somewhere, know now that the http://pkg.omniti.com/omnios/bloody/ URI now serves r151011 packages. This is the first time bloody has been active in a while, from what I can tell (it wasn't active when I started here). If you like bleeding edge (and officially unsupported) bits, this is your place. I may be upgrading all packages wholesale (which may make upgrading take longer), but I may only do them as they change. I don't think I'll spin release media nearly as often as I will update the IPS server either, but again, this IS bloody, so that's the chance you and I take. 
:) There's not a WHOLE lot of difference between r151010 and bloody right now (just a few commits on illumos-omnios mostly), but that'll evolve over time. Thank you folks, and please engage here with your feedback on the bloody repo. Dan McDonald -- OmniTI illumos engineer From rt at steait.net Thu May 15 09:07:04 2014 From: rt at steait.net (Rune Tipsmark) Date: Thu, 15 May 2014 09:07:04 +0000 Subject: [OmniOS-discuss] OmniOS Panic on high ZFS Write Load Message-ID: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> My server panics on high write load using VMware to provision thick disk to the LU over infiniband. I get this error here http://i.imgur.com/fxk79zJ.png every time I put over 1.5GB/sec load on my ZFS box. Tried various disks, controllers, omnios distributions, OI distributions etc. Always the same, easy to reproduce. Googled for ever to find anything, but nothing. Does anyone have any idea? I don't really want to abandon ZFS just yet. Venlig hilsen / Best regards, Rune Tipsmark -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobi at oetiker.ch Thu May 15 12:05:58 2014 From: tobi at oetiker.ch (Tobias Oetiker) Date: Thu, 15 May 2014 14:05:58 +0200 (CEST) Subject: [OmniOS-discuss] zfs diskusage Message-ID: Today we were out of diskspace on one of our pools ... a few removed snapshots later all is fine, except that I find that I don't realy understand the numbers ... can anyone elighten me? # zpool list fast NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT fast 4.34T 1.74T 2.61T - 39% 1.22x ONLINE - # zfs list fast NAME USED AVAIL REFER MOUNTPOINT fast 2.59T 716G 78.5K /fast Why does the 'zpool list' claim that 2.61T is free (61%) while 'zfs list' sees 716G free (27%) I know there is raidz2 and compression so the numbers don't match up, but I don't understand why the ratio is so different between the two. I checked on other filesystems and there the view from zpool and zfs look much more similar. cheers tobi ps. the dedup ratio is a leftover from a time when I tried dedup. 
$ zpool get all fast NAME PROPERTY VALUE SOURCE fast size 4.34T - fast capacity 39% - fast altroot - default fast health ONLINE - fast guid 16524146496274345089 default fast version - default fast bootfs - default fast delegation on default fast autoreplace off default fast cachefile - default fast failmode wait default fast listsnapshots off default fast autoexpand off default fast dedupditto 0 default fast dedupratio 1.22x - fast free 2.61T - fast allocated 1.74T - fast readonly off - fast comment - default fast expandsize 0 - fast freeing 0 default fast feature at async_destroy enabled local fast feature at empty_bpobj active local fast feature at lz4_compress active local fast feature at multi_vdev_crash_dump enabled local fast feature at spacemap_histogram active local fast feature at extensible_dataset enabled local $ zfs get all fast NAME PROPERTY VALUE SOURCE fast type filesystem - fast creation Fri Jan 4 17:19 2013 - fast used 2.59T - fast available 716G - fast referenced 78.5K - fast compressratio 1.81x - fast mounted yes - fast quota none default fast reservation none default fast recordsize 128K default fast mountpoint /fast default fast sharenfs off default fast checksum on default fast compression lz4 local fast atime on default fast devices on default fast exec on default fast setuid on default fast readonly off default fast zoned off default fast snapdir hidden default fast aclmode discard default fast aclinherit restricted default fast canmount on default fast xattr on default fast copies 1 default fast version 5 - fast utf8only off - fast normalization none - fast casesensitivity sensitive - fast vscan off default fast nbmand off default fast sharesmb off default fast refquota none default fast refreservation none default fast primarycache all default fast secondarycache all default fast usedbysnapshots 0 - fast usedbydataset 78.5K - fast usedbychildren 2.59T - fast usedbyrefreservation 0 - fast logbias latency default fast dedup off local fast mlslabel none default fast sync standard default fast refcompressratio 1.00x - fast written 78.5K - fast logicalused 2.22T - fast logicalreferenced 19.5K - -- Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland www.oetiker.ch tobi at oetiker.ch +41 62 775 9902 From zmalone at omniti.com Thu May 15 14:12:41 2014 From: zmalone at omniti.com (Zach Malone) Date: Thu, 15 May 2014 10:12:41 -0400 Subject: [OmniOS-discuss] OmniOS Panic on high ZFS Write Load In-Reply-To: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> References: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> Message-ID: That's odd. Any chance you have the crash dump somewhere that we could download it? If not, do you mind capturing a crash dump the next time you get the system to fail this way? I'm guessing that it is driver related, because I think someone else would have run into the same if it was universal. -Zach On Thu, May 15, 2014 at 5:07 AM, Rune Tipsmark wrote: > My server panics on high write load using VMware to provision thick disk to > the LU over infiniband. > > > > I get this error here http://i.imgur.com/fxk79zJ.png every time I put over > 1.5GB/sec load on my ZFS box. > > > > Tried various disks, controllers, omnios distributions, OI distributions > etc. > > > > Always the same, easy to reproduce. > > > > Googled for ever to find anything, but nothing. > > > > Does anyone have any idea? I don?t really want to abandon ZFS just yet. 
> > > > Venlig hilsen / Best regards, > Rune Tipsmark > > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > From slefevre at indy.rr.com Thu May 15 14:16:43 2014 From: slefevre at indy.rr.com (Scott LeFevre) Date: Thu, 15 May 2014 10:16:43 -0400 Subject: [OmniOS-discuss] ZFS and Usage-creep In-Reply-To: References: Message-ID: <1400163403.1326.2.camel@exilis.si-consulting.us> Tim, I'm interested if you sorted out the answer to your question from Richard's response? I've seen this situation as well and I'm curious for clear answer. Cheers, -- Scott LeFevre On Mon, 2014-05-12 at 16:41 -0400, Tim Brown wrote: > I have a question about ZFS usage and how to predictability allocate > space. I have scoured the web trying to get a good answer, but have > yet to find one. > > I am advertising 2TB datastores to our VMware cluster over fiber > channel using comstar. I use this command to create the dataset. > > zfs create -V 2047g vmpool01/datastores/ds01 > > It all works great but some of my datasets are using far more that the > 2047g(more than double in one case). Here are some examples: > > zfs list > NAME USED AVAIL REFER MOUNTPOINT > ... > vmpool01 27.4T 20.7T 469K /vmpool01 > vmpool01/datastores 27.4T 20.7T 384K /vmpool01/datastores > vmpool01/datastores/ds01 3.10T 20.7T 3.10T - > vmpool01/datastores/ds02 2.06T 21.4T 1.34T - > vmpool01/datastores/ds03 2.69T 20.7T 2.69T - > vmpool01/datastores/ds04 2.49T 20.7T 2.49T - > vmpool01/datastores/ds05 3.69T 20.7T 3.69T - > vmpool01/datastores/ds06 4.67T 20.7T 4.67T - > vmpool01/datastores/ds07 2.47T 20.7T 2.47T - > vmpool01/datastores/ds08 2.06T 20.8T 1.92T - > ... > > Can someone explain this to me or is there a document somewhere that > can tell me how to predict the usage? Thanks. > > - Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: From narayan.desai at gmail.com Thu May 15 14:18:48 2014 From: narayan.desai at gmail.com (Narayan Desai) Date: Thu, 15 May 2014 09:18:48 -0500 Subject: [OmniOS-discuss] OmniOS Panic on high ZFS Write Load In-Reply-To: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> References: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> Message-ID: Are you using iSER, or iSCSI over IPoIB? Have you checked your fabrics for errors (not that those should cause the system to panic). Also, what kind of infiniband gear are you using? -nld On Thu, May 15, 2014 at 4:07 AM, Rune Tipsmark wrote: > My server panics on high write load using VMware to provision thick disk > to the LU over infiniband. > > > > I get this error here http://i.imgur.com/fxk79zJ.png every time I put > over 1.5GB/sec load on my ZFS box. > > > > Tried various disks, controllers, omnios distributions, OI distributions > etc. > > > > Always the same, easy to reproduce. > > > > Googled for ever to find anything, but nothing. > > > > Does anyone have any idea? I don?t really want to abandon ZFS just yet. > > > > Venlig hilsen / Best regards, > Rune Tipsmark > > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From danmcd at omniti.com Thu May 15 14:41:09 2014 From: danmcd at omniti.com (Dan McDonald) Date: Thu, 15 May 2014 10:41:09 -0400 Subject: [OmniOS-discuss] OmniOS Panic on high ZFS Write Load In-Reply-To: References: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> Message-ID: On May 15, 2014, at 10:12 AM, Zach Malone wrote: > That's odd. Any chance you have the crash dump somewhere that we > could download it? If not, do you mind capturing a crash dump the > next time you get the system to fail this way? I'm guessing that it > is driver related, because I think someone else would have run into > the same if it was universal. The stacktrace he showed in the picture is a VM subsystem crash. There are OLD illumos bugs showing this, in particular: https://www.illumos.org/issues/1618 What OmniOS version are you running? Also, how much memory do you have on this system, and have you done any crazy tunings to increase kernel memory usage? Everything Zach said about getting a system dump applies here doubly, as well. Thanks, Dan From danmcd at omniti.com Thu May 15 14:44:13 2014 From: danmcd at omniti.com (Dan McDonald) Date: Thu, 15 May 2014 10:44:13 -0400 Subject: [OmniOS-discuss] OmniOS Panic on high ZFS Write Load In-Reply-To: References: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> Message-ID: On May 15, 2014, at 10:41 AM, Dan McDonald wrote: > > What OmniOS version are you running? Also, how much memory do you have on this system, and have you done any crazy tunings to increase kernel memory usage? Sorry, you said you tried this on many versions. If you can, stick with r151010 (our latest stable) and get a system dump from this box. It's possible too, as Narayan points out, checking for HW errors is helpful. Also, I may ask you to reproduce this bug with kernel memory debugging enabled. If something is using freed memory, that'd be nice to know. And finally, are you using 3rd-party binary drivers? Or the native ones in your distro? Dan From danmcd at omniti.com Thu May 15 15:08:59 2014 From: danmcd at omniti.com (Dan McDonald) Date: Thu, 15 May 2014 11:08:59 -0400 Subject: [OmniOS-discuss] zfs diskusage In-Reply-To: References: Message-ID: On May 15, 2014, at 8:05 AM, Tobias Oetiker wrote: > Today we were out of diskspace on one of our pools ... a few removed > snapshots later all is fine, except that I find that I don't realy > understand the numbers ... can anyone elighten me? > > # zpool list fast > NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT > fast 4.34T 1.74T 2.61T - 39% 1.22x ONLINE - > > # zfs list fast > NAME USED AVAIL REFER MOUNTPOINT > fast 2.59T 716G 78.5K /fast > > Why does the 'zpool list' claim that 2.61T is free (61%) > while 'zfs list' sees 716G free (27%) > > I know there is raidz2 and compression so the numbers don't match > up, but I don't understand why the ratio is so different between > the two. Richard Elling addressed something similar on a different thread: http://lists.omniti.com/pipermail/omnios-discuss/2014-May/002609.html You're running raidz2, and that's likely why you're seeing the discrepency between zpool and zfs. Try Richard's advice of running "zfs list -o space" for the breakdown. 
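For illustration, using the pool name "fast" from the original report (these are standard zfs invocations, nothing specific to this machine) — the space-oriented listing plus the reservation properties make any thick-provisioned zvols visible:

# zfs list -o space -r fast
# zfs get -r refreservation,usedbyrefreservation fast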
Dan From skiselkov.ml at gmail.com Thu May 15 15:33:12 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Thu, 15 May 2014 17:33:12 +0200 Subject: [OmniOS-discuss] OmniOS Panic on high ZFS Write Load In-Reply-To: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> References: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> Message-ID: <5374DE38.6000503@gmail.com> On 5/15/14, 11:07 AM, Rune Tipsmark wrote: > My server panics on high write load using VMware to provision thick disk > to the LU over infiniband. > > I get this error here http://i.imgur.com/fxk79zJ.png every time I put > over 1.5GB/sec load on my ZFS box. > > Tried various disks, controllers, omnios distributions, OI distributions > etc. > > Always the same, easy to reproduce. > > Googled for ever to find anything, but nothing. > > Does anyone have any idea? I don?t really want to abandon ZFS just yet. Your system stored a crash dump on your dump device. Have a look here on what to do with it and how to extract some meaningful info from it so that developers can help you: http://wiki.illumos.org/display/illumos/How+To+Report+Problems Cheers, -- Saso From tobi at oetiker.ch Fri May 16 09:07:52 2014 From: tobi at oetiker.ch (Tobias Oetiker) Date: Fri, 16 May 2014 11:07:52 +0200 (CEST) Subject: [OmniOS-discuss] zfs diskusage In-Reply-To: References: Message-ID: Hi Dan, Yesterday Dan McDonald wrote: > On May 15, 2014, at 8:05 AM, Tobias Oetiker wrote: > > > Today we were out of diskspace on one of our pools ... a few removed > > snapshots later all is fine, except that I find that I don't realy > > understand the numbers ... can anyone elighten me? > > > > # zpool list fast > > NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT > > fast 4.34T 1.74T 2.61T - 39% 1.22x ONLINE - > > > > # zfs list fast > > NAME USED AVAIL REFER MOUNTPOINT > > fast 2.59T 716G 78.5K /fast > > > > Why does the 'zpool list' claim that 2.61T is free (61%) > > while 'zfs list' sees 716G free (27%) > > > > I know there is raidz2 and compression so the numbers don't match > > up, but I don't understand why the ratio is so different between > > the two. > > Richard Elling addressed something similar on a different thread: > > http://lists.omniti.com/pipermail/omnios-discuss/2014-May/002609.html > > You're running raidz2, and that's likely why you're seeing the discrepency between zpool and zfs. > > Try Richard's advice of running "zfs list -o space" for the breakdown. well that looks nicer, but the numbers don't change ... the way it is, it seems very difficult to judge how much space is still available ... 61% free vs 27% free seems to be quite a big difference in my eyes. cheers tobi > Dan > > -- Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland www.oetiker.ch tobi at oetiker.ch +41 62 775 9902 From rafibeyli at gmail.com Fri May 16 13:36:39 2014 From: rafibeyli at gmail.com (Hafiz Rafibeyli) Date: Fri, 16 May 2014 16:36:39 +0300 (EEST) Subject: [OmniOS-discuss] comstar sas hba target mode In-Reply-To: <1596865209.517751.1400247222357.JavaMail.zimbra@cantekstil.com.tr> Message-ID: <2139419469.517943.1400247399824.JavaMail.zimbra@cantekstil.com.tr> Hello, anyway to use SAS HBA(lsi IT firmware)as a comstar target? hafiz -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From tobi at oetiker.ch Fri May 16 13:38:58 2014 From: tobi at oetiker.ch (Tobias Oetiker) Date: Fri, 16 May 2014 15:38:58 +0200 (CEST) Subject: [OmniOS-discuss] zfs diskusage (solved) In-Reply-To: References: Message-ID: Yesterday Tobias Oetiker wrote: > Today we were out of diskspace on one of our pools ... a few removed > snapshots later all is fine, except that I find that I don't realy > understand the numbers ... can anyone elighten me? > > # zpool list fast > NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT > fast 4.34T 1.74T 2.61T - 39% 1.22x ONLINE - > > # zfs list fast > NAME USED AVAIL REFER MOUNTPOINT > fast 2.59T 716G 78.5K /fast > > Why does the 'zpool list' claim that 2.61T is free (61%) > while 'zfs list' sees 716G free (27%) > > I know there is raidz2 and compression so the numbers don't match > up, but I don't understand why the ratio is so different between > the two. > > I checked on other filesystems and there the view from zpool and > zfs look much more similar. answering my own question with some help from dan and irc: a) zpool shows the actual free space on the disks ... blocks not allocated. Since it is a raidz2 pool, we loose 2 disks for redundancy. b) zfs shows the space realy used ... though this does not realy add up yet. c) The missing piece was the zvols ... zfs by default does thick provisioning why you create a volume ... so creating an 200G zvol reduces the available space in zfs by 200G (and then some) without actually allocating any space ... so the free space in zpool does not change ... d) (Not sure this is true, but I guess) In connection with compression a volume will in all likelyhood never occupie the space allocated. what fell out of this for me, is that I switched the less important volumes to thin provisioning ... (it could be done with the -s switch at creation time): # zfs set refreservation=0 pool-2/randomstuff/unimportant-volume cheers tobi -- Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland www.oetiker.ch tobi at oetiker.ch +41 62 775 9902 From matthew.lagoe at subrigo.net Fri May 16 14:03:29 2014 From: matthew.lagoe at subrigo.net (Matthew Lagoe) Date: Fri, 16 May 2014 07:03:29 -0700 Subject: [OmniOS-discuss] Dtrace not working In-Reply-To: <013001cf710e$92ffc430$b8ff4c90$@subrigo.net> References: <013001cf710e$92ffc430$b8ff4c90$@subrigo.net> Message-ID: <013b01cf710f$a06e8740$e14b95c0$@subrigo.net> I am looking at moving from openindiana to omnios and I have a dtrace script (below) that works on openindiana however I have tried running it on the latest omnios r151010 and it doesn't run with the error ": No such file or directory" I am running it with ./wfileio.d, the other DTT scripts seem to work fine When I run it with dtrace -s wfileio.d I also get an error "dtrace: failed to compile script wfileio.d: line 39: failed to set option 'quiet': Invalid option name #!/usr/sbin/dtrace -s /* * wfileio.d - write file I/O stats, with cache miss rate. * Written using DTrace (Solaris 10 3/05) * * This script provides statistics on the number of reads and the bytes * read from filesystems (logical), and the number of bytes read from * disk (physical). A summary is printed every five seconds by file. * * A total miss-rate is also provided for the file system cache. * * $Id: wfileio.d 3 2007-08-01 10:50:08Z brendan $ * * USAGE: wfileio.d * * IDEA: Richard McDougall, Solaris Internals 2nd Ed, FS Chapter. * * COPYRIGHT: Copyright (c) 2006 Brendan Gregg. 
* * CDDL HEADER START * * The contents of this file are subject to the terms of the * Common Development and Distribution License, Version 1.0 only * (the "License"). You may not use this file except in compliance * with the License. * * You can obtain a copy of the license at Docs/cddl1.txt * or http://www.opensolaris.org/os/licensing. * See the License for the specific language governing permissions * and limitations under the License. * * CDDL HEADER END * * 19-Mar-2006 Brendan Gregg Created this. * 23-Apr-2006 " " Last update. * 20-Mar-2014 Linda Kateley Modified into writes */ #pragma D option quiet self int trace; uint64_t lbytes; uint64_t pbytes; dtrace:::BEGIN { trace("Tracing...\n"); } fbt::fop_write:entry /self->trace == 0 && args[0]->v_path/ { self->pathname = cleanpath(args[0]->v_path); @wio[self->pathname, "logical"] = count(); lbytes += args[1]->uio_resid; self->size = args[1]->uio_resid; self->uiop = args[1]; } fbt::fop_write:return /self->size/ { @wbytes[self->pathname, "logical"] = sum(self->size - self->uiop->uio_resid); self->size = 0; self->uiop = 0; self->pathname = 0; } io::bdev_strategy:start /self->size && args[0]->b_flags & B_READ/ { @wio[self->pathname, "physical"] = count(); @wbytes[self->pathname, "physical"] = sum(args[0]->b_bcount); pbytes += args[0]->b_bcount; } profile:::tick-5s { trunc(@wio, 20); trunc(@wbytes, 20); printf("\033[H\033[2J"); printf("\nWrit IOPS, top 20 (count)\n"); printa("%-54s %10s %10 at d\n", @wio); printf("\nWrite Bandwidth, top 20 (bytes)\n"); printa("%-54s %10s %10 at d\n", @wbytes); trunc(@wio); } -------------- next part -------------- An HTML attachment was scrubbed... URL: From danmcd at omniti.com Fri May 16 14:18:23 2014 From: danmcd at omniti.com (Dan McDonald) Date: Fri, 16 May 2014 10:18:23 -0400 Subject: [OmniOS-discuss] Dtrace not working In-Reply-To: <013b01cf710f$a06e8740$e14b95c0$@subrigo.net> References: <013001cf710e$92ffc430$b8ff4c90$@subrigo.net> <013b01cf710f$a06e8740$e14b95c0$@subrigo.net> Message-ID: On May 16, 2014, at 10:03 AM, Matthew Lagoe wrote: > I am looking at moving from openindiana to omnios and I have a dtrace script (below) that works on openindiana however I have tried running it on the latest omnios r151010 and it doesn?t run with the error ": No such file or directory" > > I am running it with ./wfileio.d, the other DTT scripts seem to work fine > > When I run it with dtrace ?s wfileio.d I also get an error ?dtrace: failed to compile script wfileio.d: line 39: failed to set option ?quiet?: Invalid option name > Odd. I just copied your script out of this mail and it runs fine on my r151010 machine, both as executable and with "dtrace -s". Dan From matthew.lagoe at subrigo.net Fri May 16 14:22:08 2014 From: matthew.lagoe at subrigo.net (Matthew Lagoe) Date: Fri, 16 May 2014 07:22:08 -0700 Subject: [OmniOS-discuss] Dtrace not working In-Reply-To: References: <013001cf710e$92ffc430$b8ff4c90$@subrigo.net> <013b01cf710f$a06e8740$e14b95c0$@subrigo.net> Message-ID: <015401cf7112$3b6f8cb0$b24ea610$@subrigo.net> That's even more strange... any ideas on what could be causing this or how I can track down the problem? 
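One possibility worth ruling out first, since the same script runs verbatim for others: carriage-return (CRLF) line endings picked up when the file was copied over can produce both symptoms reported here — a bare ": No such file or directory" when executing the script directly (the interpreter path on the #! line ends in a stray \r) and dtrace rejecting an option that is spelled correctly. These checks are generic, not from the thread:

$ head -1 wfileio.d | od -c     # a \r just before \n means the shebang line is corrupted
$ tr -d '\r' < wfileio.d > wfileio.clean.d && chmod +x wfileio.clean.d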
I did do a pkg update on 151010 if that matters and it's a clean install with no special package managers etc only "non standard" package is DTT -----Original Message----- From: Dan McDonald [mailto:danmcd at omniti.com] Sent: Friday, May 16, 2014 07:18 AM To: Matthew Lagoe Cc: omnios-discuss at lists.omniti.com Subject: Re: [OmniOS-discuss] Dtrace not working On May 16, 2014, at 10:03 AM, Matthew Lagoe wrote: > I am looking at moving from openindiana to omnios and I have a dtrace script (below) that works on openindiana however I have tried running it on the latest omnios r151010 and it doesn't run with the error ": No such file or directory" > > I am running it with ./wfileio.d, the other DTT scripts seem to work fine > > When I run it with dtrace -s wfileio.d I also get an error "dtrace: failed to compile script wfileio.d: line 39: failed to set option 'quiet': Invalid option name > Odd. I just copied your script out of this mail and it runs fine on my r151010 machine, both as executable and with "dtrace -s". Dan From mir at miras.org Fri May 16 14:36:36 2014 From: mir at miras.org (Michael Rasmussen) Date: Fri, 16 May 2014 16:36:36 +0200 Subject: [OmniOS-discuss] Dtrace not working In-Reply-To: References: <013001cf710e$92ffc430$b8ff4c90$@subrigo.net> <013b01cf710f$a06e8740$e14b95c0$@subrigo.net> Message-ID: <20140516163636.006f9725@sleipner.datanom.net> On Fri, 16 May 2014 10:18:23 -0400 Dan McDonald wrote: > > Odd. I just copied your script out of this mail and it runs fine on my r151010 machine, both as executable and with "dtrace -s". > I can confirm that it also runs flawlessly here. -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: Ships are safe in harbor, but they were never meant to stay there. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From rt at steait.net Fri May 16 16:57:47 2014 From: rt at steait.net (Rune Tipsmark) Date: Fri, 16 May 2014 16:57:47 +0000 Subject: [OmniOS-discuss] OmniOS Panic on high ZFS Write Load In-Reply-To: References: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> Message-ID: <91c4be9960c14f8caa3d720e7cdaf42a@EX1301.steait.net> Hi guys, After having tried various distros as mentioned and after having tried SLC and MLC PCI-E devices as well as SSD disks I think I actually found the issue. Previously I had a bunch of SATA disks connected to my SAS controller as well as a bunch of SAS disks... now that I removed the SATA disks and only have SAS disks left I have not been able to reproduce the issue (regardless the fact I didn't even use the SAS controller for some tests that crashes). Very weird and what a waste of a few hundred hours of reinstalling/testing, swapping cables, switches, memory, messing with bios settings and what have we. I now have two stable pools which each write a reasonable ~430 MB/sec with sync=always on without crashing. Lesson - stay far away from SATA disks on LSI 9207-4i4e Thanks for all the feedback. 
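For anyone wanting to confirm which drives behind an LSI HBA are actually SATA before hitting this, two generic checks (hedged — the sas2ircu utility is a separate LSI download and may not be present on a given box):

# iostat -En | egrep 'c[0-9]|Vendor'    # SATA disks behind a SAS HBA typically report Vendor: ATA
# ./sas2ircu 0 display                  # if installed, lists the drive type per enclosure slot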
Br, Rune -----Original Message----- From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Dan McDonald Sent: Thursday, May 15, 2014 7:44 AM To: omnios-discuss at lists.omniti.com Subject: Re: [OmniOS-discuss] OmniOS Panic on high ZFS Write Load On May 15, 2014, at 10:41 AM, Dan McDonald wrote: > > What OmniOS version are you running? Also, how much memory do you have on this system, and have you done any crazy tunings to increase kernel memory usage? Sorry, you said you tried this on many versions. If you can, stick with r151010 (our latest stable) and get a system dump from this box. It's possible too, as Narayan points out, checking for HW errors is helpful. Also, I may ask you to reproduce this bug with kernel memory debugging enabled. If something is using freed memory, that'd be nice to know. And finally, are you using 3rd-party binary drivers? Or the native ones in your distro? Dan _______________________________________________ OmniOS-discuss mailing list OmniOS-discuss at lists.omniti.com http://lists.omniti.com/mailman/listinfo/omnios-discuss From danmcd at omniti.com Fri May 16 17:41:08 2014 From: danmcd at omniti.com (Dan McDonald) Date: Fri, 16 May 2014 13:41:08 -0400 Subject: [OmniOS-discuss] OmniOS Panic on high ZFS Write Load In-Reply-To: <91c4be9960c14f8caa3d720e7cdaf42a@EX1301.steait.net> References: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> <91c4be9960c14f8caa3d720e7cdaf42a@EX1301.steait.net> Message-ID: <51CF5048-67F9-49D7-A78B-81926F9F18FC@omniti.com> On May 16, 2014, at 12:57 PM, Rune Tipsmark wrote: > Hi guys, > > After having tried various distros as mentioned and after having tried SLC and MLC PCI-E devices as well as SSD disks I think I actually found the issue. > > Previously I had a bunch of SATA disks connected to my SAS controller as well as a bunch of SAS disks... now that I removed the SATA disks and only have SAS disks left I have not been able to reproduce the issue (regardless the fact I didn't even use the SAS controller for some tests that crashes). Very weird and what a waste of a few hundred hours of reinstalling/testing, swapping cables, switches, memory, messing with bios settings and what have we. > > I now have two stable pools which each write a reasonable ~430 MB/sec with sync=always on without crashing. > > Lesson - stay far away from SATA disks on LSI 9207-4i4e Were you using a JBOD or other expander? I've *heard* you can direclty attach SATA disks to an mpt_sas board if you're careful. But generally speaking, it's operationally foolish to attach SATA drives anywhere other than to dedicated SATA ports. Thanks, Dan From dswartz at druber.com Fri May 16 17:51:57 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Fri, 16 May 2014 13:51:57 -0400 Subject: [OmniOS-discuss] OmniOS Panic on high ZFS Write Load In-Reply-To: <51CF5048-67F9-49D7-A78B-81926F9F18FC@omniti.com> References: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> <91c4be9960c14f8caa3d720e7cdaf42a@EX1301.steait.net> <51CF5048-67F9-49D7-A78B-81926F9F18FC@omniti.com> Message-ID: <720c39ad1b2c24118e53dbf8119484eb.squirrel@webmail.druber.com> > > > Were you using a JBOD or other expander? I've *heard* you can direclty > attach SATA disks to an mpt_sas board if you're careful. > > But generally speaking, it's operationally foolish to attach SATA drives > anywhere other than to dedicated SATA ports. I have two samsung 840 pro ssds as l2arc connected directly to an LSI HBA and have never had any issues. 
They are connected with a forward breakout cable, NOT on the jbod though :)

From jimklimov at cos.ru Fri May 16 12:08:47 2014 From: jimklimov at cos.ru (Jim Klimov) Date: Fri, 16 May 2014 14:08:47 +0200 Subject: Re: [OmniOS-discuss] zfs diskusage In-Reply-To: References: Message-ID: <5d7f6d05-f130-427f-8cc0-e0dce5daa085@email.android.com>

On 16 May 2014 at 11:07:52 CEST, Tobias Oetiker wrote: >Hi Dan, > >Yesterday Dan McDonald wrote: > >> On May 15, 2014, at 8:05 AM, Tobias Oetiker wrote: >> >> > Today we were out of diskspace on one of our pools ... a few >removed >> > snapshots later all is fine, except that I find that I don't really >> > understand the numbers ... can anyone enlighten me? >> > >> > # zpool list fast >> > NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT >> > fast 4.34T 1.74T 2.61T - 39% 1.22x ONLINE - >> > >> > # zfs list fast >> > NAME USED AVAIL REFER MOUNTPOINT >> > fast 2.59T 716G 78.5K /fast >> > >> > Why does the 'zpool list' claim that 2.61T is free (61%) >> > while 'zfs list' sees 716G free (27%) >> > >> > I know there is raidz2 and compression so the numbers don't match >> > up, but I don't understand why the ratio is so different between >> > the two. >> >> Richard Elling addressed something similar on a different thread: >> >> http://lists.omniti.com/pipermail/omnios-discuss/2014-May/002609.html >> >> You're running raidz2, and that's likely why you're seeing the >discrepancy between zpool and zfs. >> >> Try Richard's advice of running "zfs list -o space" for the >breakdown. > >well that looks nicer, but the numbers don't change ... the way it >is, it seems very difficult to judge how much space is still >available ... 61% free vs 27% free seems to be quite a big >difference in my eyes. > >cheers >tobi > > >> Dan >> >>

Do you have volumes or other sparse reservations? If there are no allocated bytes, that's in zpool free space (times the overhead factor of raidz parities), but if space is reserved for these datasets - that is not in free space (not available for writing) of the pool's root filesystem dataset. With this consideration in mind, do the numbers fit?

Alternatively, I have a similar questionably sized pool - a raidz1 over 4*4tb disks, with about 1tb free in zfs list and 2.7tb unallocated in zpool list, and almost no volumes (none big anyway). I did not yet look deeper (i.e. into quotas and reservations), but now that I remembered it - something does not add up there either ;)

//jim -- Typos courtesy of K-9 Mail on my Samsung Android

From richard.elling at richardelling.com Fri May 16 19:57:12 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Fri, 16 May 2014 12:57:12 -0700 Subject: [OmniOS-discuss] Dtrace not working In-Reply-To: <013b01cf710f$a06e8740$e14b95c0$@subrigo.net> References: <013001cf710e$92ffc430$b8ff4c90$@subrigo.net> <013b01cf710f$a06e8740$e14b95c0$@subrigo.net> Message-ID: <597B7451-D179-44D8-B7DA-CD9D90C220C5@RichardElling.com>

Try the changes embedded below...
On May 16, 2014, at 7:03 AM, Matthew Lagoe wrote: > I am looking at moving from openindiana to omnios and I have a dtrace script (below) that works on openindiana however I have tried running it on the latest omnios r151010 and it doesn?t run with the error ": No such file or directory" > > I am running it with ./wfileio.d, the other DTT scripts seem to work fine > > When I run it with dtrace ?s wfileio.d I also get an error ?dtrace: failed to compile script wfileio.d: line 39: failed to set option ?quiet?: Invalid option name > > #!/usr/sbin/dtrace -s #!/usr/sbin/dtrace -qs > /* > * wfileio.d - write file I/O stats, with cache miss rate. > * Written using DTrace (Solaris 10 3/05) > * > * This script provides statistics on the number of reads and the bytes > * read from filesystems (logical), and the number of bytes read from > * disk (physical). A summary is printed every five seconds by file. > * > * A total miss-rate is also provided for the file system cache. > * > * $Id: wfileio.d 3 2007-08-01 10:50:08Z brendan $ > * > * USAGE: wfileio.d > * > * IDEA: Richard McDougall, Solaris Internals 2nd Ed, FS Chapter. > * > * COPYRIGHT: Copyright (c) 2006 Brendan Gregg. > * > * CDDL HEADER START > * > * The contents of this file are subject to the terms of the > * Common Development and Distribution License, Version 1.0 only > * (the "License"). You may not use this file except in compliance > * with the License. > * > * You can obtain a copy of the license at Docs/cddl1.txt > * or http://www.opensolaris.org/os/licensing. > * See the License for the specific language governing permissions > * and limitations under the License. > * > * CDDL HEADER END > * > * 19-Mar-2006 Brendan Gregg Created this. > * 23-Apr-2006 " " Last update. > * 20-Mar-2014 Linda Kateley Modified into writes > */ > > #pragma D option quiet remove this #pragma -- richard > > self int trace; > uint64_t lbytes; > uint64_t pbytes; > > dtrace:::BEGIN > { > trace("Tracing...\n"); > } > > fbt::fop_write:entry > /self->trace == 0 && args[0]->v_path/ > { > self->pathname = cleanpath(args[0]->v_path); > @wio[self->pathname, "logical"] = count(); > lbytes += args[1]->uio_resid; > self->size = args[1]->uio_resid; > self->uiop = args[1]; > } > > fbt::fop_write:return > /self->size/ > { > @wbytes[self->pathname, "logical"] = > sum(self->size - self->uiop->uio_resid); > self->size = 0; > self->uiop = 0; > self->pathname = 0; > } > > io::bdev_strategy:start > /self->size && args[0]->b_flags & B_READ/ > { > @wio[self->pathname, "physical"] = count(); > @wbytes[self->pathname, "physical"] = sum(args[0]->b_bcount); > pbytes += args[0]->b_bcount; > } > > profile:::tick-5s > { > trunc(@wio, 20); > trunc(@wbytes, 20); > printf("\033[H\033[2J"); > printf("\nWrit IOPS, top 20 (count)\n"); > printa("%-54s %10s %10 at d\n", @wio); > printf("\nWrite Bandwidth, top 20 (bytes)\n"); > printa("%-54s %10s %10 at d\n", @wbytes); > trunc(@wio); > } > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss -- Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.lagoe at subrigo.net Sat May 17 04:14:48 2014 From: matthew.lagoe at subrigo.net (Matthew Lagoe) Date: Fri, 16 May 2014 21:14:48 -0700 Subject: [OmniOS-discuss] Slow write performance Message-ID: <021401cf7186$8de44f20$a9aced60$@subrigo.net> I have a system that I am building, 2 x E5-2620 with 3 x LSI 9207-8e one drive from each mirror plugged into one HBA, configuration is below On reads I get about 653 MB/s (which is great) and writes 263 MB/s A single drive gets around 120MB/s read and 90MB/s write and is a ST3000NM0023 sas Seagate I would assume write performance should go up with number of vdev's however it seems to only increase by ~60%. Is this expected? NAME STATE READ WRITE CKSUM tank ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 c4t5000C50057B946C7d0 ONLINE 0 0 0 c4t5000C50057B9792Bd0 ONLINE 0 0 0 c4t5000C50057BA72B3d0 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 c4t5000C50057BA7F0Bd0 ONLINE 0 0 0 c4t5000C50057BFA69Bd0 ONLINE 0 0 0 c4t5000C50057C1A177d0 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 c4t5000C50057C3CDF3d0 ONLINE 0 0 0 c4t5000C5005815632Fd0 ONLINE 0 0 0 c4t5000C5005815650Fd0 ONLINE 0 0 0 mirror-3 ONLINE 0 0 0 c4t5000C5005817ECF7d0 ONLINE 0 0 0 c4t5000C50058185583d0 ONLINE 0 0 0 c4t5000C500581C8397d0 ONLINE 0 0 0 mirror-4 ONLINE 0 0 0 c4t5000C500581CB967d0 ONLINE 0 0 0 c4t5000C500581CD21Bd0 ONLINE 0 0 0 c4t5000C500581F147Fd0 ONLINE 0 0 0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From rt at steait.net Sat May 17 04:23:02 2014 From: rt at steait.net (Rune Tipsmark) Date: Sat, 17 May 2014 04:23:02 +0000 Subject: [OmniOS-discuss] OmniOS Panic on high ZFS Write Load In-Reply-To: <51CF5048-67F9-49D7-A78B-81926F9F18FC@omniti.com> References: <11151a8c4fe1409aae19a8862c8144ee@EX1301.steait.net> <91c4be9960c14f8caa3d720e7cdaf42a@EX1301.steait.net> <51CF5048-67F9-49D7-A78B-81926F9F18FC@omniti.com> Message-ID: <2ddd6d5ef8334f0f98cc573d5f2fea2f@EX1301.steait.net> SAS expander and 9 western digital WD4003FZEX Now with 10 Seagate ST4000NM0023 instead things seem to work much better. /Rune -----Original Message----- From: Dan McDonald [mailto:danmcd at omniti.com] Sent: Friday, May 16, 2014 10:41 AM To: Rune Tipsmark Cc: omnios-discuss at lists.omniti.com Subject: Re: [OmniOS-discuss] OmniOS Panic on high ZFS Write Load On May 16, 2014, at 12:57 PM, Rune Tipsmark wrote: > Hi guys, > > After having tried various distros as mentioned and after having tried SLC and MLC PCI-E devices as well as SSD disks I think I actually found the issue. > > Previously I had a bunch of SATA disks connected to my SAS controller as well as a bunch of SAS disks... now that I removed the SATA disks and only have SAS disks left I have not been able to reproduce the issue (regardless the fact I didn't even use the SAS controller for some tests that crashes). Very weird and what a waste of a few hundred hours of reinstalling/testing, swapping cables, switches, memory, messing with bios settings and what have we. > > I now have two stable pools which each write a reasonable ~430 MB/sec with sync=always on without crashing. > > Lesson - stay far away from SATA disks on LSI 9207-4i4e Were you using a JBOD or other expander? I've *heard* you can direclty attach SATA disks to an mpt_sas board if you're careful. But generally speaking, it's operationally foolish to attach SATA drives anywhere other than to dedicated SATA ports. 
Thanks, Dan From rt at steait.net Sat May 17 04:37:21 2014 From: rt at steait.net (Rune Tipsmark) Date: Sat, 17 May 2014 04:37:21 +0000 Subject: [OmniOS-discuss] Slow write performance In-Reply-To: <021401cf7186$8de44f20$a9aced60$@subrigo.net> References: <021401cf7186$8de44f20$a9aced60$@subrigo.net> Message-ID: Not sure if it's expected but for reference here are some numbers from my two pools, the drives are Seagate ST4000NM0023, controllers is LSI 9207-4i4e NAME STATE READ WRITE CKSUM pool01 ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 c1t5000C50055FC9533d0 ONLINE 0 0 0 c1t5000C50055FE6A63d0 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 c1t5000C5005708296Fd0 ONLINE 0 0 0 c1t5000C5005708351Bd0 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 c1t5000C500570858EFd0 ONLINE 0 0 0 c1t5000C50057085A6Bd0 ONLINE 0 0 0 logs c9d0 ONLINE 0 0 0 cache c7d0 ONLINE 0 0 0 c11d0 ONLINE 0 0 0 root at zfs10:~# time dd if=/dev/zero of=/pool02/dd.tst bs=1024000 count=20000 20000+0 records in 20000+0 records out 20480000000 bytes (20 GB) copied, 50.6698 s, 404 MB/s And the other pool NAME STATE READ WRITE CKSUM pool02 ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 c1t5000C50057086307d0 ONLINE 0 0 0 c1t5000C50057086B67d0 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 c1t5000C500570870D3d0 ONLINE 0 0 0 c1t5000C50057089753d0 ONLINE 0 0 0 logs c10d0 ONLINE 0 0 0 cache c12d0 ONLINE 0 0 0 c8d0 ONLINE 0 0 0 root at zfs10:~# time dd if=/dev/zero of=/pool01/dd.tst bs=1024000 count=20000 20000+0 records in 20000+0 records out 20480000000 bytes (20 GB) copied, 50.2413 s, 408 MB/s Maybe try creating a pool from disks on one of the controllers and test. Venlig hilsen / Best regards, Rune Tipsmark From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Matthew Lagoe Sent: Friday, May 16, 2014 9:15 PM To: omnios-discuss at lists.omniti.com Subject: [OmniOS-discuss] Slow write performance I have a system that I am building, 2 x E5-2620 with 3 x LSI 9207-8e one drive from each mirror plugged into one HBA, configuration is below On reads I get about 653 MB/s (which is great) and writes 263 MB/s A single drive gets around 120MB/s read and 90MB/s write and is a ST3000NM0023 sas Seagate I would assume write performance should go up with number of vdev's however it seems to only increase by ~60%. Is this expected? NAME STATE READ WRITE CKSUM tank ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 c4t5000C50057B946C7d0 ONLINE 0 0 0 c4t5000C50057B9792Bd0 ONLINE 0 0 0 c4t5000C50057BA72B3d0 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 c4t5000C50057BA7F0Bd0 ONLINE 0 0 0 c4t5000C50057BFA69Bd0 ONLINE 0 0 0 c4t5000C50057C1A177d0 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 c4t5000C50057C3CDF3d0 ONLINE 0 0 0 c4t5000C5005815632Fd0 ONLINE 0 0 0 c4t5000C5005815650Fd0 ONLINE 0 0 0 mirror-3 ONLINE 0 0 0 c4t5000C5005817ECF7d0 ONLINE 0 0 0 c4t5000C50058185583d0 ONLINE 0 0 0 c4t5000C500581C8397d0 ONLINE 0 0 0 mirror-4 ONLINE 0 0 0 c4t5000C500581CB967d0 ONLINE 0 0 0 c4t5000C500581CD21Bd0 ONLINE 0 0 0 c4t5000C500581F147Fd0 ONLINE 0 0 0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobi at oetiker.ch Sat May 17 05:22:20 2014 From: tobi at oetiker.ch (Tobias Oetiker) Date: Sat, 17 May 2014 07:22:20 +0200 (CEST) Subject: [OmniOS-discuss] zfs diskusage In-Reply-To: <5d7f6d05-f130-427f-8cc0-e0dce5daa085@email.android.com> References: <5d7f6d05-f130-427f-8cc0-e0dce5daa085@email.android.com> Message-ID: Hi Jim, Yesterday Jim Klimov wrote: > Do you have volumes or other sparrse reservations? 
If there are > no allocated bytes, that's in zpool free space (times the > overhead factor of raidz parities), but if space is reserved for > these datasets - that is not in free space (not available for > writing) of the pool's root filesystem dataset. With this > consideration in mind, do the numbers fit? they do indeed ... see my other post :-) thanks tobi -- Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland www.oetiker.ch tobi at oetiker.ch +41 62 775 9902 From piers at mm.st Tue May 20 01:59:02 2014 From: piers at mm.st (Piers Dawson-Damer) Date: Tue, 20 May 2014 11:59:02 +1000 Subject: [OmniOS-discuss] where to look for VirtIO drivers? Keen to test OmniOS on OpenStack KVM Message-ID: Is anyone aware of Illumos/OmniOS VirtIO disk & nic drivers? I?m keen to test OmniOS on OpenStack KVM TIA Piers From danmcd at omniti.com Tue May 20 02:06:36 2014 From: danmcd at omniti.com (Dan McDonald) Date: Mon, 19 May 2014 22:06:36 -0400 Subject: [OmniOS-discuss] where to look for VirtIO drivers? Keen to test OmniOS on OpenStack KVM In-Reply-To: References: Message-ID: <87803E31-DFBD-482A-9269-9319D9141EFB@omniti.com> On May 19, 2014, at 9:59 PM, Piers Dawson-Damer wrote: > Is anyone aware of Illumos/OmniOS VirtIO disk & nic drivers? > I?m keen to test OmniOS on OpenStack KVM There are none in OmniOS or upstream in illumos-gate, but I believe Nexenta's distro has them, and if someone with time & patience could upstream them to illumos, we'd all benefit. Start by looking here: https://github.com/Nexenta/illumos-nexenta/tree/master/usr/src/uts/common/io/ Sorry I can't be of more immediate assistance, Dan From derek at umiacs.umd.edu Tue May 20 02:28:54 2014 From: derek at umiacs.umd.edu (Derek Yarnell) Date: Mon, 19 May 2014 22:28:54 -0400 Subject: [OmniOS-discuss] where to look for VirtIO drivers? Keen to test OmniOS on OpenStack KVM In-Reply-To: <87803E31-DFBD-482A-9269-9319D9141EFB@omniti.com> References: <87803E31-DFBD-482A-9269-9319D9141EFB@omniti.com> Message-ID: <537ABDE6.60207@umiacs.umd.edu> On 5/19/14, 10:06 PM, Dan McDonald wrote: > > On May 19, 2014, at 9:59 PM, Piers Dawson-Damer wrote: > >> Is anyone aware of Illumos/OmniOS VirtIO disk & nic drivers? >> I?m keen to test OmniOS on OpenStack KVM > > There are none in OmniOS or upstream in illumos-gate, but I believe Nexenta's distro has them, and if someone with time & patience could upstream them to illumos, we'd all benefit. Start by looking here: > > https://github.com/Nexenta/illumos-nexenta/tree/master/usr/src/uts/common/io/ > Hi, This I don't think is totally true, VirtIO NIC driver support isn't in Illumos-gate yet but disk drivers have been in for a bit. https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/io/vioblk/vioblk.c I have been running r151005 which had them included for awhile now with no issues to report. Though they seem to not have been included in any recent stable OmniOS release (up to r151008, haven't upgraded to 10) so I don't know why that is. I haven't needed the network performance as much as the realtek support seems good enough for this small scale ZFS NFS fileserver we run using vioblk drivers. Thanks, derek -- Derek T. 
Yarnell University of Maryland Institute for Advanced Computer Studies From rafibeyli at gmail.com Wed May 21 06:50:44 2014 From: rafibeyli at gmail.com (Hafiz Rafibeyli) Date: Wed, 21 May 2014 09:50:44 +0300 (EEST) Subject: [OmniOS-discuss] sas hba comstar target mode In-Reply-To: <2139419469.517943.1400247399824.JavaMail.zimbra@cantekstil.com.tr> References: <2139419469.517943.1400247399824.JavaMail.zimbra@cantekstil.com.tr> Message-ID: <615018844.581155.1400655044073.JavaMail.zimbra@cantekstil.com.tr> Hello, anyway to use SAS HBA(lsi IT firmware)as a comstar target? hafiz -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From svavar at januar.is Wed May 21 09:53:39 2014 From: svavar at januar.is (=?UTF-8?Q?Svavar_=C3=96rn_Eysteinsson?=) Date: Wed, 21 May 2014 09:53:39 +0000 Subject: [OmniOS-discuss] Kernel Panic - ZFS on iSCSI target and transferring data. Message-ID: Hi. I recently got a kernel panic on my OmniOS ZFS storage server. I have a 500GB iSCSI target from my ISP that I have created a zpool on, and one ZFS dataset. My OmniOS machines uses this zpool to archive some data not used in production. I mainly use rsync to move files between. This morning, which the iSCSI connection had been up since yesterday I was going to sync about 20GB of files to the iSCSI target. My server got into panic mode. This is what was in my messages.log file : May 21 09:13:17 media savecore: [ID 570001 auth.error] reboot after panic: BAD TRAP: type=8 (#df Double fault) rp=ffffff04e3069f10 addr=0 May 21 09:13:17 media savecore: [ID 365739 auth.error] Saving compressed system crash dump in /var/crash/unknown/vmdump.0 May 21 09:14:24 media savecore: [ID 849871 auth.error] Decompress the crash dump with May 21 09:14:24 media 'savecore -vf /var/crash/unknown/vmdump.0' May 21 09:14:24 media fmd: [ID 377184 daemon.error] SUNW-MSG-ID: SUNOS-8000-KL, TYPE: Defect, VER: 1, SEVERITY: Major May 21 09:14:24 media EVENT-TIME: Wed May 21 09:14:24 GMT 2014 May 21 09:14:24 media PLATFORM: X10SAE, CSN: 0123456789, HOSTNAME: media May 21 09:14:24 media SOURCE: software-diagnosis, REV: 0.1 May 21 09:14:24 media EVENT-ID: f8b8e00b-a409-6f50-aaa6-bc6c3ebadad0 May 21 09:14:24 media DESC: The system has rebooted after a kernel panic. Refer to http://illumos.org/msg/SUNOS-8000-KL for more information. May 21 09:14:24 media AUTO-RESPONSE: The failed system image was dumped to the dump device. If savecore is enabled (see dumpadm(1M)) a copy of the dump will be written to the savecore directory /var/crash/unknown. May 21 09:14:24 media IMPACT: There may be some performance impact while the panic is copied to the savecore directory. Disk space usage by panics can be substantial. May 21 09:14:24 media REC-ACTION: If savecore is not enabled then please take steps to preserve the crash image. May 21 09:14:24 media Use 'fmdump -Vp -u f8b8e00b-a409-6f50-aaa6-bc6c3ebadad0' to view more panic detail. Please refer to the knowledge article for additional information. By issuing fmdump -Vp -u f8b8e00b-a409-6f50-aaa6-bc6c3ebadad0 I have this data : TIME UUID SUNW-MSG-ID ma? 21 2014 09:14:24.861678000 f8b8e00b-a409-6f50-aaa6-bc6c3ebadad0 SUNOS-8000-KL TIME CLASS ENA ma? 21 09:14:24.7657 ireport.os.sunos.panic.dump_available 0x0000000000000000 ma? 
21 09:13:17.7666 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000 nvlist version: 0 version = 0x0 class = list.suspect uuid = f8b8e00b-a409-6f50-aaa6-bc6c3ebadad0 code = SUNOS-8000-KL diag-time = 1400663664 781451 de = fmd:///module/software-diagnosis fault-list-sz = 0x1 fault-list = (array of embedded nvlists) (start fault-list[0]) nvlist version: 0 version = 0x0 class = defect.sunos.kernel.panic certainty = 0x64 asru = sw:///:path=/var/crash/unknown/.f8b8e00b-a409-6f50-aaa6-bc6c3ebadad0 resource = sw:///:path=/var/crash/unknown/.f8b8e00b-a409-6f50-aaa6-bc6c3ebadad0 savecore-succcess = 1 dump-dir = /var/crash/unknown dump-files = vmdump.0 os-instance-uuid = f8b8e00b-a409-6f50-aaa6-bc6c3ebadad0 panicstr = BAD TRAP: type=8 (#df Double fault) rp=ffffff04e3069f10 addr=0 panicstack = unix:real_mode_stop_cpu_stage2_end+9de3 () | unix:trap+ca5 () | unix:_patch_xrstorq_rbx+196 () | zfs:zio_vdev_delegated_io+86 () | zfs:vdev_queue_aggregate+298 () | zfs:vdev_queue_io_to_issue+5e () | zfs:vdev_queue_io_done+88 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | crashtime = 1400662215 panic-time = Wed May 21 08:50:15 2014 GMT (end fault-list[0]) fault-status = 0x1 severity = Major __ttl = 0x1 __tod = 0x537c6e70 0x335c29b0 Does anyone see anything ? I have no clue or knowledge/experience in debugging kernel based crashes. The only feature that I have enabled on this zPool and or ZFS dataset is a Lz4 compression on the zfs dataset. Is there any zfs, iSCSI improvements in the latest OmniOS release ? Any help, and or information would be much appreciated. Thanks allot people. Best regards, Svavar Orn -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From danmcd at omniti.com Wed May 21 15:20:10 2014 From: danmcd at omniti.com (Dan McDonald) Date: Wed, 21 May 2014 11:20:10 -0400 Subject: [OmniOS-discuss] Kernel Panic - ZFS on iSCSI target and transferring data. In-Reply-To: References: Message-ID: <2E6C034B-4C42-4CBA-809F-47E3EA884FA9@omniti.com> On May 21, 2014, at 5:53 AM, Svavar ?rn Eysteinsson wrote: > panicstr = BAD TRAP: type=8 (#df Double fault) rp=ffffff04e3069f10 addr=0 > panicstack = unix:real_mode_stop_cpu_stage2_end+9de3 () | unix:trap+ca5 () | unix:_patch_xrstorq_rbx+196 () | zfs:zio_vdev_delegated_io+86 () | zfs:vdev_queue_aggregate+298 () | zfs:vdev_queue_io_to_issue+5e () | zfs:vdev_queue_io_done+88 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | Oh how cute, the kernel somehow managed to infinitely recurse, causing a stack overflow! > The only feature that I have enabled on this zPool and or ZFS dataset is a Lz4 compression on the zfs dataset. Hmmm, lz4 compression and stack overflow? I see one illumos bug related to that: https://www.illumos.org/issues/3705 But that bug's fix has been in OmniOS since r151006. You're not running something older, are you? There ARE new ZFS fixes in r151010, and if you can, I'd highly recommend the upgrade. Dan From svavar at januar.is Wed May 21 15:44:45 2014 From: svavar at januar.is (=?UTF-8?Q?Svavar_=C3=96rn_Eysteinsson?=) Date: Wed, 21 May 2014 15:44:45 +0000 Subject: [OmniOS-discuss] Kernel Panic - ZFS on iSCSI target and transferring data. In-Reply-To: <2E6C034B-4C42-4CBA-809F-47E3EA884FA9@omniti.com> References: <2E6C034B-4C42-4CBA-809F-47E3EA884FA9@omniti.com> Message-ID: Well, I will update to r151010 as soon as possible. 
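If the panic recurs before the upgrade, the saved dump can also be inspected directly. A sketch, using the paths from the savecore messages quoted earlier (the dcmds are standard mdb; the exact dump numbers may differ on your box):

# savecore -vf /var/crash/unknown/vmdump.0    # expands the compressed dump into unix.0 and vmcore.0
# cd /var/crash/unknown
# mdb unix.0 vmcore.0
> ::status     # panic string and dump summary
> ::stack      # stack of the panicking thread
> ::msgbuf     # console messages leading up to the panic
> $q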
And yes, i'm running the OmniOS v11 r151008 ( OmniOS 5.11 omnios-6de5e81 ) On 21 May 2014 15:20, Dan McDonald wrote: > > On May 21, 2014, at 5:53 AM, Svavar ?rn Eysteinsson > wrote: > > > > > panicstr = BAD TRAP: type=8 (#df Double fault) > rp=ffffff04e3069f10 addr=0 > > panicstack = unix:real_mode_stop_cpu_stage2_end+9de3 () > | unix:trap+ca5 () | unix:_patch_xrstorq_rbx+196 () | > zfs:zio_vdev_delegated_io+86 () | zfs:vdev_queue_aggregate+298 () | > zfs:vdev_queue_io_to_issue+5e () | zfs:vdev_queue_io_done+88 () | > zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | > zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | > zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | > zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | > zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | > zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | > zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | > zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | > zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | > zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | > zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | > zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | > zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | > zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | > zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | > zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | > zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | > zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | > zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | > zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | > zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | > zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | > zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | > zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | > zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | > zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | > zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | > zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | > zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | > zfs:zio_execute+88 () | zfs:vdev_queue_io_done+78 () | > zfs:zio_vdev_io_done+80 () | zfs:zio_execute+88 () | > zfs:vdev_queue_io_done+78 () | zfs:zio_vdev_io_done+80 () | > > Oh how cute, the kernel somehow managed to infinitely recurse, causing a > stack overflow! > > > The only feature that I have enabled on this zPool and or ZFS dataset is > a Lz4 compression on the zfs dataset. > > Hmmm, lz4 compression and stack overflow? I see one illumos bug related > to that: > > https://www.illumos.org/issues/3705 > > But that bug's fix has been in OmniOS since r151006. You're not running > something older, are you? > > There ARE new ZFS fixes in r151010, and if you can, I'd highly recommend > the upgrade. > > Dan > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danmcd at omniti.com Fri May 23 01:34:42 2014 From: danmcd at omniti.com (Dan McDonald) Date: Thu, 22 May 2014 21:34:42 -0400 Subject: [OmniOS-discuss] DHCP server of choice for OmniOS? Message-ID: I just migrated my home server from OI on old hardware, to OmniOS on nice new hardware. Modulo me forgetting to migrate things until I discovered I needed them, it's been a painless process. The ONLY thing I have left to migrate is my DHCP service. 
It's not big - I expect to manage 8 or 16 addresses out of a /24 subnet that's mostly statics, and I'm not serving Sun Rays or anything that ties me to the old Solaris DHCP server. On the other hand, the Solaris DHCP server is in the illumos-omnios gate, and apparently the ISC DHCP server isn't in either master or omniti-ms branches of omnios-build. I'd like to tap the collective wisdom of the community for your suggestions. What DHCP server do you use on your OmniOS box (if any). Thanks, Dan From hakansom at ohsu.edu Fri May 23 02:09:47 2014 From: hakansom at ohsu.edu (Marion Hakanson) Date: Thu, 22 May 2014 19:09:47 -0700 Subject: [OmniOS-discuss] DHCP server of choice for OmniOS? In-Reply-To: Message from Dan McDonald of "Thu, 22 May 2014 21:34:42 EDT." Message-ID: <201405230209.s4N29lWv020562@kyklops.ohsu.edu> I'm using ISC DHCP from pkgsrc. Jonathan Perkin's builds work so well, it's worth putting up with the SmartOS-centric "/opt/local" install path. At least until work/life allows me some time to build pkgsrc myself (:-). http://pkgsrc.joyent.com/installing.html Regards, Marion ================================================================ Subject: [OmniOS-discuss] DHCP server of choice for OmniOS? From: Dan McDonald Date: Thu, 22 May 2014 21:34:42 -0400 (18:34 PDT) To: omnios-discuss I just migrated my home server from OI on old hardware, to OmniOS on nice new hardware. Modulo me forgetting to migrate things until I discovered I needed them, it's been a painless process. The ONLY thing I have left to migrate is my DHCP service. It's not big - I expect to manage 8 or 16 addresses out of a /24 subnet that's mostly statics, and I'm not serving Sun Rays or anything that ties me to the old Solaris DHCP server. On the other hand, the Solaris DHCP server is in the illumos-omnios gate, and apparently the ISC DHCP server isn't in either master or omniti-ms branches of omnios-build. I'd like to tap the collective wisdom of the community for your suggestions. What DHCP server do you use on your OmniOS box (if any). Thanks, Dan From sk at kram.io Fri May 23 07:23:50 2014 From: sk at kram.io (Steffen Kram) Date: Fri, 23 May 2014 09:23:50 +0200 Subject: [OmniOS-discuss] DHCP server of choice for OmniOS? In-Reply-To: References: Message-ID: <44D809AA-9756-4474-8685-1C234625FBC7@kram.io> Hi Dan, I?m using ISC DHCP. It?s not a big deal to build it for Omnios. You can as well use my version or my build scripts from http://scott.mathematik.uni-ulm.de. Cheers, Steffen Am 23.05.2014 um 03:34 schrieb Dan McDonald : > I just migrated my home server from OI on old hardware, to OmniOS on nice new hardware. Modulo me forgetting to migrate things until I discovered I needed them, it's been a painless process. > > The ONLY thing I have left to migrate is my DHCP service. It's not big - I expect to manage 8 or 16 addresses out of a /24 subnet that's mostly statics, and I'm not serving Sun Rays or anything that ties me to the old Solaris DHCP server. On the other hand, the Solaris DHCP server is in the illumos-omnios gate, and apparently the ISC DHCP server isn't in either master or omniti-ms branches of omnios-build. > > I'd like to tap the collective wisdom of the community for your suggestions. What DHCP server do you use on your OmniOS box (if any). 
> > Thanks, > Dan > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP using GPGMail URL: From cperez at cmpcs.com Sat May 24 17:44:06 2014 From: cperez at cmpcs.com (Carlos M. Perez) Date: Sat, 24 May 2014 17:44:06 +0000 Subject: [OmniOS-discuss] Problem upgrading r151006 to r151008 Message-ID: <23eba2114a65439d8558439c490ceb89@BLUPR06MB673.namprd06.prod.outlook.com> Hi, I followed the directions at http://omnios.omniti.com/wiki.php/Upgrade_r151006_r151008, by updating the required packages. I took it one step further and froze the system at r151006 to make sure we didn't accidentally get any 151008 components during the pkg update. After a reboot, the system shows no updates needed, and we unfroze the 151006 stuff. However, we are unable to perform the upgrade, as there seems to be some dependency on curl, which is currently at root at rundc1-zfs02:~# pkg info /web/curl Name: web/curl Summary: curl - command line tool for transferring data with URL syntax State: Installed Publisher: omnios Version: 7.36.0 Build Release: 5.11 Branch: 0.151006 Packaging Date: Mon Apr 14 21:40:24 2014 Size: 3.35 MB FMRI: pkg://omnios/web/curl at 7.36.0,5.11-0.151006:20140414T214024Z Here is the output from pkg update: (Hopefully it doesn't get too mangled) root at rundc1-zfs02:~# pkg update Creating Plan \ pkg update: No solution was found to satisfy constraints Plan Creation: Package solver has not found a solution to update to latest available versions. This may indicate an overly constrained set of packages are installed. 
latest incorporations: pkg://omnios/consolidation/osnet/osnet-incorporation at 0.5.11,5.11-0.151008:20131204T022427Z pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151008:20131206T160517Z pkg://omnios/entire at 11,5.11-0.151008:20131205T195242Z pkg://omnios/incorporation/jeos/illumos-gate at 11,5.11-0.151008:20131204T024149Z The following indicates why the system cannot update to the latest version: No suitable version of required package pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151008:20131205T191259Z found: Reject: pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151008:20131205T191259Z Reason: A version for 'incorporate' dependency on pkg:/web/curl at 7.32.0,5.11-0.151008 cannot be found No suitable version of required package pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151008:20131205T223747Z found: Reject: pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151008:20131205T223747Z Reason: A version for 'incorporate' dependency on pkg:/web/curl at 7.33.0,5.11-0.151008 cannot be found No suitable version of required package pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151008:20131205T195253Z found: Reject: pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151008:20131205T195253Z Reason: A version for 'incorporate' dependency on pkg:/web/curl at 7.33.0,5.11-0.151008 cannot be found No suitable version of required package pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151008:20131204T231613Z found: Reject: pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151008:20131204T231613Z Reason: A version for 'incorporate' dependency on pkg:/web/curl at 7.32.0,5.11-0.151008 cannot be found No suitable version of required package pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151008:20131206T160517Z found: Reject: pkg://omnios/incorporation/jeos/omnios-userland at 11,5.11-0.151008:20131206T160517Z Reason: A version for 'incorporate' dependency on pkg:/web/curl at 7.33.0,5.11-0.151008 cannot be found Thanks in advance Carlos M. Perez CMP Consulting Services 305-669-1515 -------------- next part -------------- An HTML attachment was scrubbed... URL: From filip.marvan at aira.cz Mon May 26 07:36:02 2014 From: filip.marvan at aira.cz (Filip Marvan) Date: Mon, 26 May 2014 09:36:02 +0200 Subject: [OmniOS-discuss] Strange ARC reads numbers In-Reply-To: <58058A78-6619-4E2C-B3FB-38B012EAAD34@RichardElling.com> References: <3BE0DEED8863E5429BAE4CAEDF6245650365044D4776@AIRA-SRV.aira.local> <3BE0DEED8863E5429BAE4CAEDF6245650365045016A2@AIRA-SRV.aira.local> <58058A78-6619-4E2C-B3FB-38B012EAAD34@RichardElling.com> Message-ID: <3BE0DEED8863E5429BAE4CAEDF6245650365046810FA@AIRA-SRV.aira.local> Hello, just for information, after two weeks, numbers of ARC assesses came back to high numbers as before deletion of data (you can see that in screenshot). And I try to delete the same amount of data on different storage server, and the accesses to ARC droped in the same way as on first pool Interesting. Filip Marvan From: Richard Elling [mailto:richard.elling at richardelling.com] Sent: Thursday, May 08, 2014 12:47 AM To: Filip Marvan Cc: omnios-discuss at lists.omniti.com Subject: Re: [OmniOS-discuss] Strange ARC reads numbers On May 7, 2014, at 1:44 AM, Filip Marvan wrote: Hi Richard, thank you for your reply. 1. Workload is still the same or very similar. 
Zvols, which we deleted from our pool were disconnected from KVM server a few days before, so the only change was, that we deleted that zvols with all snapshots. 2. As you wrote, our customers are fine for now :) We have monitoring of all our virtual servers running from that storage server, and there is no noticeable change in workload or latencies. good, then there might not be an actual problem, just a puzzle :-) 3. That could be the reason, of course. But in the graph are only data from arcstat.pl script. We can see, that arcstat is reporting heavy read accesses every 5 seconds (propably some update of ARC after ZFS writes data to disks from ZIL? All of them are marked as "cache hits" by arcstat script) and with only few ARC accesses between that 5 seconds periody. Before we deleted that zvols (about 0.7 TB data from 10 TB pool, which have 5 TB of free space) there were about 40k accesses every 5 seconds, now there are no more than 2k accesses every 5 seconds. This is expected behaviour for older ZFS releases that used a txg_timeout of 5 seconds. You should see a burst of write activity around that timeout and it can include reads for zvols. Unfortunately, the zvol code is not very efficient and you will see a lot more reads than you expect. -- richard Most of our zvols have 8K volblocksize (including deleted zvols), only few have 64K. Unfortunately I have no data about size of the read before that change. But we have two more storage servers, with similary high ARC read accesses every 5 seconds as on the first pool before deletion. Maybe I should try to delete some data on that pools and see what happen with more detailed monitoring. Thank you, Filip _____ From: Richard Elling [mailto:richard.elling at richardelling.com] Sent: Wednesday, May 07, 2014 3:56 AM To: Filip Marvan Cc: omnios-discuss at lists.omniti.com Subject: Re: [OmniOS-discuss] Strange ARC reads numbers Hi Filip, There are two primary reasons for reduction in the number of ARC reads. 1. the workload isn't reading as much as it used to 2. the latency of reads has increased 3. your measurement is b0rken there are three reasons... The data you shared clearly shows reduction in reads, but doesn't contain the answers to the cause. Usually, if #2 is the case, then the phone will be ringing with angry customers on the other end. If the above 3 are not the case, then perhaps it is something more subtle. The arcstat reads does not record the size of the read. To get the read size for zvols is a little tricky, you can infer it from the pool statistics in iostat. The subtleness here is that if the volblocksize is different between the old and new zvols, then the number of (block) reads will be different for the same workload. -- richard -- Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: arcread_back_dikobraz2.png Type: image/png Size: 18616 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: arcread_dikobraz1.png Type: image/png Size: 18991 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 6247 bytes Desc: not available URL: From richard.elling at richardelling.com Mon May 26 19:57:45 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Mon, 26 May 2014 12:57:45 -0700 Subject: [OmniOS-discuss] Strange ARC reads numbers In-Reply-To: <3BE0DEED8863E5429BAE4CAEDF6245650365046810FA@AIRA-SRV.aira.local> References: <3BE0DEED8863E5429BAE4CAEDF6245650365044D4776@AIRA-SRV.aira.local> <3BE0DEED8863E5429BAE4CAEDF6245650365045016A2@AIRA-SRV.aira.local> <58058A78-6619-4E2C-B3FB-38B012EAAD34@RichardElling.com> <3BE0DEED8863E5429BAE4CAEDF6245650365046810FA@AIRA-SRV.aira.local> Message-ID: Hi Filip, You can get more insight to the "read" count by looking at the breakdown of the various demand/prefetch and metadata/data counters. -- richard On May 26, 2014, at 12:36 AM, Filip Marvan wrote: > Hello, > > just for information, after two weeks, numbers of ARC assesses came back to high numbers as before deletion of data (you can see that in screenshot). > And I try to delete the same amount of data on different storage server, and the accesses to ARC droped in the same way as on first pool > > Interesting. > > Filip Marvan > > > > > From: Richard Elling [mailto:richard.elling at richardelling.com] > Sent: Thursday, May 08, 2014 12:47 AM > To: Filip Marvan > Cc: omnios-discuss at lists.omniti.com > Subject: Re: [OmniOS-discuss] Strange ARC reads numbers > > On May 7, 2014, at 1:44 AM, Filip Marvan wrote: > > > Hi Richard, > > thank you for your reply. > > 1. Workload is still the same or very similar. Zvols, which we deleted from our pool were disconnected from KVM server a few days before, so the only change was, that we deleted that zvols with all snapshots. > 2. As you wrote, our customers are fine for now :) We have monitoring of all our virtual servers running from that storage server, and there is no noticeable change in workload or latencies. > > good, then there might not be an actual problem, just a puzzle :-) > > > 3. That could be the reason, of course. But in the graph are only data from arcstat.pl script. We can see, that arcstat is reporting heavy read accesses every 5 seconds (propably some update of ARC after ZFS writes data to disks from ZIL? All of them are marked as "cache hits" by arcstat script) and with only few ARC accesses between that 5 seconds periody. Before we deleted that zvols (about 0.7 TB data from 10 TB pool, which have 5 TB of free space) there were about 40k accesses every 5 seconds, now there are no more than 2k accesses every 5 seconds. > > This is expected behaviour for older ZFS releases that used a txg_timeout of 5 seconds. You should > see a burst of write activity around that timeout and it can include reads for zvols. Unfortunately, the > zvol code is not very efficient and you will see a lot more reads than you expect. > -- richard > > > > > Most of our zvols have 8K volblocksize (including deleted zvols), only few have 64K. Unfortunately I have no data about size of the read before that change. But we have two more storage servers, with similary high ARC read accesses every 5 seconds as on the first pool before deletion. Maybe I should try to delete some data on that pools and see what happen with more detailed monitoring. 
> > Thank you, > Filip > > > From: Richard Elling [mailto:richard.elling at richardelling.com] > Sent: Wednesday, May 07, 2014 3:56 AM > To: Filip Marvan > Cc: omnios-discuss at lists.omniti.com > Subject: Re: [OmniOS-discuss] Strange ARC reads numbers > > Hi Filip, > > There are two primary reasons for reduction in the number of ARC reads. > 1. the workload isn't reading as much as it used to > 2. the latency of reads has increased > 3. your measurement is b0rken > there are three reasons... > > The data you shared clearly shows reduction in reads, but doesn't contain the answers > to the cause. Usually, if #2 is the case, then the phone will be ringing with angry customers > on the other end. > > If the above 3 are not the case, then perhaps it is something more subtle. The arcstat reads > does not record the size of the read. To get the read size for zvols is a little tricky, you can > infer it from the pool statistics in iostat. The subtleness here is that if the volblocksize is > different between the old and new zvols, then the number of (block) reads will be different > for the same workload. > -- richard > > -- > > Richard.Elling at RichardElling.com > +1-760-896-4422 > > > > -- Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... URL: From danmcd at omniti.com Tue May 27 01:44:54 2014 From: danmcd at omniti.com (Dan McDonald) Date: Mon, 26 May 2014 21:44:54 -0400 Subject: [OmniOS-discuss] DHCP server of choice for OmniOS? In-Reply-To: <44D809AA-9756-4474-8685-1C234625FBC7@kram.io> References: <44D809AA-9756-4474-8685-1C234625FBC7@kram.io> Message-ID: On May 23, 2014, at 3:23 AM, Steffen Kram wrote: > Hi Dan, > > I?m using ISC DHCP. It?s not a big deal to build it for Omnios. You can as well use my version or my build scripts from http://scott.mathematik.uni-ulm.de. I ended up downloading, compiling, and smoke-testing the latest ISC DHCP without any build scripts (sorry Steffen). Oddly enough, thanks to some work from Oracle for S11, ISC DHCP is far more suitable for Illumos and OmniOS than even I'd imagined in the first place. It seemed to work okay for my simple smoke test, and if I don't so something more substantial in the next week, I'll give it a more thorough smoke test when we have company over to my house soon. Would people be interested in seeing OmniOS ship with ISC DHCP? Would people wish to be rid of the old Sun DHCP server? (Or are there Sun Ray owners out there who use OmniOS to serve them? IIRC, Sun Rays need the old Sun DHCP server.) Thanks, Dan From tobi at oetiker.ch Tue May 27 04:14:32 2014 From: tobi at oetiker.ch (Tobias Oetiker) Date: Tue, 27 May 2014 06:14:32 +0200 (CEST) Subject: [OmniOS-discuss] DHCP server of choice for OmniOS? In-Reply-To: References: <44D809AA-9756-4474-8685-1C234625FBC7@kram.io> Message-ID: Dan, Yesterday Dan McDonald wrote: > > Would people be interested in seeing OmniOS ship with ISC DHCP? > Would people wish to be rid of the old Sun DHCP server? (Or are > there Sun Ray owners out there who use OmniOS to serve them? > IIRC, Sun Rays need the old Sun DHCP server.) back in the day when we were still running sunray, we did so with an isc dhcp running on linux and it worked fine. 
cheers tobi -- Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland www.oetiker.ch tobi at oetiker.ch +41 62 775 9902 From jimklimov at cos.ru Tue May 27 05:18:38 2014 From: jimklimov at cos.ru (Jim Klimov) Date: Tue, 27 May 2014 07:18:38 +0200 Subject: [OmniOS-discuss] DHCP server of choice for OmniOS? In-Reply-To: References: <44D809AA-9756-4474-8685-1C234625FBC7@kram.io> Message-ID: <3ab04551-80b9-489c-9e29-d0635a04cd92@email.android.com> 27 ??? 2014??. 3:44:54 CEST, Dan McDonald ?????: > >On May 23, 2014, at 3:23 AM, Steffen Kram wrote: > >> Hi Dan, >> >> I?m using ISC DHCP. It?s not a big deal to build it for Omnios. You >can as well use my version or my build scripts from >http://scott.mathematik.uni-ulm.de. > >I ended up downloading, compiling, and smoke-testing the latest ISC >DHCP without any build scripts (sorry Steffen). Oddly enough, thanks >to some work from Oracle for S11, ISC DHCP is far more suitable for >Illumos and OmniOS than even I'd imagined in the first place. It >seemed to work okay for my simple smoke test, and if I don't so >something more substantial in the next week, I'll give it a more >thorough smoke test when we have company over to my house soon. > >Would people be interested in seeing OmniOS ship with ISC DHCP? Would >people wish to be rid of the old Sun DHCP server? (Or are there Sun >Ray owners out there who use OmniOS to serve them? IIRC, Sun Rays need >the old Sun DHCP server.) > >Thanks, >Dan > >_______________________________________________ >OmniOS-discuss mailing list >OmniOS-discuss at lists.omniti.com >http://lists.omniti.com/mailman/listinfo/omnios-discuss Actually, sun rays don't *require* a sun dhcp server. The installer can set up and preconfigure one, and you can configure the server via command line instead of changing some config files and restarting. While sun rays historically have used some 'vendor macros' in dhcp (which isc dhcp can also probably deliver), later versions supported dns as the means of fetching the main information (the config server and the session rendering server), and any complex info could be downloaded from the config/firmware server via tftp, and so only needed the usual ip addressing and dns setup info from dhcp. My access point could serve that, a windows dhcp could... so certainly no hard requirement of sun dhcp for production use. I guess the command-line configurability was key for smooth integration. Without it, an admin would have to know what/how to set up in dns and dhcp, and how to apply this to their server software of choice - which is not arcane magic and was covered in some blogs and mailing list archives, and even official docs. So, unless sun dhcp can be coerced into working again with recent illumos-gates, at least the sunray case is not a roadblock to dropping it... workarounds exist, even if a bit inconvenient maybe. Or even not so much - if standard dhcp plus a couple of names in default dns zone suffice, and tftp'ed config files cover the rest adequately. Hth, Jim -- Typos courtesy of K-9 Mail on my Samsung Android From e.savi at almapro.it Tue May 27 13:29:37 2014 From: e.savi at almapro.it (Emiliano Savi) Date: Tue, 27 May 2014 13:29:37 +0000 Subject: [OmniOS-discuss] php-54 from omniti Message-ID: <9914F656E174644F925C575B02520AA063DD95C7@ALMAPRO-EX.almapro.local> Hi, how did you get php-54 working ? I installed the latest versions of OmniOS, installed and started apache22, downloaded php-54 and then ? 
I come from Windows so don't spit on me :p -------------- next part -------------- An HTML attachment was scrubbed... URL: From dswartz at druber.com Wed May 28 01:11:07 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Tue, 27 May 2014 21:11:07 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com> References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com> Message-ID: So I've been running with sync=disabled on my vsphere NFS datastore. I've been willing to do so because I have a big-ass UPS, and do hourly backups. But, I'm thinking of going to an active/passive connection to my JBOD, using Saso's blog post on zfs zfs-create.blogspot.com. Here's why I think I can't keep using sync=disabled (I would love to have my logic sanity checked.) If you switch manually from host A to B, all is well, since before host A exports the pool, any pending writes will be completed (so even though we lied to vsphere, it's okay.) On the other hand, if host A crashes/hangs and host B takes over, forcibly importing the pool, you could end up with the following scenario: vsphere issues writes for blocks A, B, C, D and E. A and B have been written. C and D were sent to host A, and ACKed, so vsphere thinks all is well. Host A has not yet committed blocks C and D to disk. Host B imports the pool, assumes the virtual IP for the NFS share and vsphere reconnects to the datastore. Since it thinks it has written blocks A-D, it then issues a write for block E. Host B commits that to disk. vsphere thinks blocks A-E were written to disk, when in fact, blocks C and D were not. Silent data corruption, and as far as I can tell, no way to know this happened, so if I ever did have a forced failover, I would have to rollback every single VM to the last known, good snapshot. Anyway, I decided to see what would happen write-wise with an SLOG SSD. I took a samsung 840PRO used for l2arc and made that a log device. I ran crystaldiskmark before and after. Prior to the SLOG, I was getting about 90MB/sec (gigabit enet), which is pretty good. Afterward, it went down to 8MB/sec! I pulled the SSD and plugged it into my windows 7 workstation, formatted it and deleted the partition, which should have TRIM'ed it. I reinserted it as SLOG and re-ran the test. 50MB/sec. Still not great, but this is after all an MLC device, not SLC, and that's probably 'good enough'. Looking at open-zfs.org, it looks like out of illumos, freebsd and ZoL, only freebsd has TRIM now. I don't want to have to re-TRIM the thing every few weeks (or however long it takes). Does over-provisioning help? From danmcd at omniti.com Wed May 28 01:14:41 2014 From: danmcd at omniti.com (Dan McDonald) Date: Tue, 27 May 2014 21:14:41 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com> Message-ID: <964CDA82-39F2-4612-9FFD-DA8CE1408BB1@omniti.com> On May 27, 2014, at 9:11 PM, Dan Swartzendruber wrote: > Does over-provisioning help? > It might. I'm no ZFS performance expert. You're better off asking that question on the Illumos ZFS list. I've mentioned before here that Nexenta had a prototype of TRIM/UNMAP use by ZFS, but I do not know what its status is. 
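On the over-provisioning question above: one common approach, sketched here on the assumption that the SSD has just been secure-erased or fully TRIMmed (pool and device names below are placeholders), is to hand ZFS only a slice of the device so the controller keeps a reserve of never-written flash:

# format -e c9d0              # label the SSD and create a slice 0 covering ~80% of it
# zpool remove tank c9d0      # if the whole disk is currently the log device
# zpool add tank log c9d0s0   # re-add only the slice as the SLOG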
Again, that's a better question for the Illumos ZFS list than here. I'm sorry I can't be of more immediate assistance, Dan From dswartz at druber.com Wed May 28 01:28:05 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Tue, 27 May 2014 21:28:05 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <964CDA82-39F2-4612-9FFD-DA8CE1408BB1@omniti.com> References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com> <964CDA82-39F2-4612-9FFD-DA8CE1408BB1@omniti.com> Message-ID: <272779f2e3d5e9ba49e21383cefa0d60.squirrel@webmail.druber.com> > > On May 27, 2014, at 9:11 PM, Dan Swartzendruber > wrote: > >> Does over-provisioning help? >> > > It might. > > I'm no ZFS performance expert. You're better off asking that question on > the Illumos ZFS list. > > I've mentioned before here that Nexenta had a prototype of TRIM/UNMAP use > by ZFS, but I do not know what its status is. Again, that's a better > question for the Illumos ZFS list than here. Fair enough. Since I'm using omnios, I thought I'd asked on that list :) > I'm sorry I can't be of more immediate assistance, NP. From bfriesen at simple.dallas.tx.us Wed May 28 02:22:00 2014 From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn) Date: Tue, 27 May 2014 21:22:00 -0500 (CDT) Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com> Message-ID: On Tue, 27 May 2014, Dan Swartzendruber wrote: > > So I've been running with sync=disabled on my vsphere NFS datastore. I've > been willing to do so because I have a big-ass UPS, and do hourly backups. > But, I'm thinking of going to an active/passive connection to my JBOD, > using Saso's blog post on zfs zfs-create.blogspot.com. Here's why I think > I can't keep using sync=disabled (I would love to have my logic sanity > checked.) If you switch manually from host A to B, all is well, since Zfs does not depend on sync writes ('zil') for pool integrity. It does depend on cache flush across all disks for pool integrity. The harm from sync=disabled is that when the system comes back up, the data may not be coherent from the application's perspective (lost transactions, part of the written file incorrect/missing). Your 'vsphere' fits in the realm of applications. If zfs imports the pool, it will choose the latest transaction group. The zfs cache flush is done on all the disks before it writes the new transaction group id (in a separate transaction). If there is somehow still something wrong, it is possible to import using an older transaction group (losing more recent data). Note that the latest transaction group might be from five minutes ago. The big-ass UPS helps, but is not a fool-proof solution since something else (hardware, OS, or power) might fail. It looks to me like Sa?o's design is active/standby failover. Zpool import on the standby should obtain a clean transaction group as long as the originally active system is still not using the pool. The result would be similar to the power fail situation. 
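To put concrete commands behind the discussion above, here is a minimal sketch of the knobs involved: checking and setting the sync property, attaching and removing a dedicated log device, and the recovery-mode import Bob alludes to when a pool has to fall back to an earlier transaction group. Pool and device names are placeholders.

    # Placeholder pool/device names; adjust for the real system.
    zfs get sync tank                  # current setting: standard | always | disabled
    zfs set sync=standard tank         # honor sync writes (they land on the slog if present)

    zpool add tank log c5t2d0s0        # attach a dedicated slog device
    zpool remove tank c5t2d0s0         # log vdevs can be removed again

    # If an imported pool turns out to be damaged, -F asks ZFS to discard the most
    # recent transactions and roll back to an earlier, consistent txg:
    zpool import -F tank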
Bob -- Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From dswartz at druber.com Wed May 28 02:57:10 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Tue, 27 May 2014 22:57:10 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com> Message-ID: <24441dee7f2995812bc171752256e680.squirrel@webmail.druber.com> cked.) If you switch manually from host A to B, all is well, since > > Zfs does not depend on sync writes ('zil') for pool integrity. It > does depend on cache flush across all disks for pool integrity. The > harm from sync=disabled is that when the system comes back up, the > data may not be coherent from the application's perspective (lost > transactions, part of the written file incorrect/missing). Your > 'vsphere' fits in the realm of applications. Sorry if I was unclear. Yes, I understand the above point. In the scenario I referenced, data could fail to be written to a file (or directory even worse), with no indication whatsoever, since if the failover happens quickly enough vsphere will consider the delay to be ok. > If zfs imports the pool, it will choose the latest transaction group. > The zfs cache flush is done on all the disks before it writes the new > transaction group id (in a separate transaction). If there is somehow > still something wrong, it is possible to import using an older > transaction group (losing more recent data). > > Note that the latest transaction group might be from five minutes ago. > > The big-ass UPS helps, but is not a fool-proof solution since > something else (hardware, OS, or power) might fail. > > It looks to me like Sa?o's design is active/standby failover. Zpool > import on the standby should obtain a clean transaction group as long > as the originally active system is still not using the pool. The > result would be similar to the power fail situation. That was my understanding, yes. From jimklimov at cos.ru Wed May 28 07:09:50 2014 From: jimklimov at cos.ru (Jim Klimov) Date: Wed, 28 May 2014 09:09:50 +0200 Subject: [OmniOS-discuss] Strange ARC reads numbers In-Reply-To: <3BE0DEED8863E5429BAE4CAEDF6245650365046810FA@AIRA-SRV.aira.local> References: <3BE0DEED8863E5429BAE4CAEDF6245650365044D4776@AIRA-SRV.aira.local> <3BE0DEED8863E5429BAE4CAEDF6245650365045016A2@AIRA-SRV.aira.local> <58058A78-6619-4E2C-B3FB-38B012EAAD34@RichardElling.com> <3BE0DEED8863E5429BAE4CAEDF6245650365046810FA@AIRA-SRV.aira.local> Message-ID: <619a23b7-a785-488f-b581-8959c92aa746@email.android.com> 26 ??? 2014??. 9:36:02 CEST, Filip Marvan ?????: >Hello, > > > >just for information, after two weeks, numbers of ARC assesses came >back to >high numbers as before deletion of data (you can see that in >screenshot). > >And I try to delete the same amount of data on different storage >server, and >the accesses to ARC droped in the same way as on first pool > > > >Interesting. > > > >Filip Marvan > > > > > > > > > >From: Richard Elling [mailto:richard.elling at richardelling.com] >Sent: Thursday, May 08, 2014 12:47 AM >To: Filip Marvan >Cc: omnios-discuss at lists.omniti.com >Subject: Re: [OmniOS-discuss] Strange ARC reads numbers > > > >On May 7, 2014, at 1:44 AM, Filip Marvan wrote: > > > > > >Hi Richard, > > > >thank you for your reply. > > > >1. 
Workload is still the same or very similar. Zvols, which we deleted >from >our pool were disconnected from KVM server a few days before, so the >only >change was, that we deleted that zvols with all snapshots. > >2. As you wrote, our customers are fine for now :) We have monitoring >of all >our virtual servers running from that storage server, and there is no >noticeable change in workload or latencies. > > > >good, then there might not be an actual problem, just a puzzle :-) > > > > > >3. That could be the reason, of course. But in the graph are only data >from >arcstat.pl script. We can see, that arcstat is reporting heavy read >accesses >every 5 seconds (propably some update of ARC after ZFS writes data to >disks >from ZIL? All of them are marked as "cache hits" by arcstat script) and >with >only few ARC accesses between that 5 seconds periody. Before we deleted >that >zvols (about 0.7 TB data from 10 TB pool, which have 5 TB of free >space) >there were about 40k accesses every 5 seconds, now there are no more >than 2k >accesses every 5 seconds. > > > >This is expected behaviour for older ZFS releases that used a >txg_timeout of >5 seconds. You should > >see a burst of write activity around that timeout and it can include >reads >for zvols. Unfortunately, the > >zvol code is not very efficient and you will see a lot more reads than >you >expect. > > -- richard > > > > > > > > > >Most of our zvols have 8K volblocksize (including deleted zvols), only >few >have 64K. Unfortunately I have no data about size of the read before >that >change. But we have two more storage servers, with similary high ARC >read >accesses every 5 seconds as on the first pool before deletion. Maybe I >should try to delete some data on that pools and see what happen with >more >detailed monitoring. > > > >Thank you, > >Filip > > > > > > _____ > >From: Richard Elling [mailto:richard.elling at richardelling.com] >Sent: Wednesday, May 07, 2014 3:56 AM >To: Filip Marvan >Cc: omnios-discuss at lists.omniti.com >Subject: Re: [OmniOS-discuss] Strange ARC reads numbers > > > >Hi Filip, > > > >There are two primary reasons for reduction in the number of ARC reads. > > 1. the workload isn't reading as much as it used to > > 2. the latency of reads has increased > > 3. your measurement is b0rken > >there are three reasons... > > > >The data you shared clearly shows reduction in reads, but doesn't >contain >the answers > >to the cause. Usually, if #2 is the case, then the phone will be >ringing >with angry customers > >on the other end. > > > >If the above 3 are not the case, then perhaps it is something more >subtle. >The arcstat reads > >does not record the size of the read. To get the read size for zvols is >a >little tricky, you can > >infer it from the pool statistics in iostat. The subtleness here is >that if >the volblocksize is > >different between the old and new zvols, then the number of (block) >reads >will be different > >for the same workload. > > -- richard > > > >-- > > > >Richard.Elling at RichardElling.com >+1-760-896-4422 > > > > > > > >------------------------------------------------------------------------ > >_______________________________________________ >OmniOS-discuss mailing list >OmniOS-discuss at lists.omniti.com >http://lists.omniti.com/mailman/listinfo/omnios-discuss Reads from L2ARC suggest that this data is being read, but it is not hot enough to stick in the main RAM ARC. 
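As an aside, the split being described here, reads served from the RAM ARC versus the L2ARC devices, can be sampled directly from the kstats that arcstat.pl itself reads; two snapshots taken a few seconds apart show which counters actually moved. A minimal sketch:

    # Raw counters behind arcstat.pl; take two samples and diff them.
    kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses
    kstat -p zfs:0:arcstats:l2_hits zfs:0:arcstats:l2_misses
    kstat -p zfs:0:arcstats:size zfs:0:arcstats:l2_size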
Deleting the datasets seemingly caused this data to no longer be read, perhaps because those blocks are no longer referenced by the pool. IIRC in your first post you wrote this happens to your older snapshots, and now it seems that the situation repeats as your system accumulated new snapshots after that mass deletion a few weeks back. To me it sums up to: "somebody mass-reads your available snapshots". Do you have 'zfssnap=visible' so that $dataset/.zfs directories are always visible (not only upon direct request) and do you have daemons or cronjobs or something of that kind (possibly a slocate/mlocate updatedb job, or an rsync backup) that reads your posix filesystem structure? Since it does not seem that your whole datasets are being re-read (exact guess depends on amount of unique data related to l2arc size, of course - and on measurable presence of reads from the main pool devices), regular accesses to just the FS metadata might explain the symptoms. Though backups that do read the file data (perhaps "rsync -c", or tar, or zfs send of any dataset type redone over and over for some reason) and sufficiently small unique data in the snapshots might also fit this explanation. HTH, //Jim Klimov -- Typos courtesy of K-9 Mail on my Samsung Android From jimklimov at cos.ru Wed May 28 06:37:45 2014 From: jimklimov at cos.ru (Jim Klimov) Date: Wed, 28 May 2014 08:37:45 +0200 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com> Message-ID: 28 ??? 2014??. 3:11:07 CEST, Dan Swartzendruber ?????: > >So I've been running with sync=disabled on my vsphere NFS datastore. >I've >been willing to do so because I have a big-ass UPS, and do hourly >backups. > But, I'm thinking of going to an active/passive connection to my JBOD, >using Saso's blog post on zfs zfs-create.blogspot.com. Here's why I >think >I can't keep using sync=disabled (I would love to have my logic sanity >checked.) If you switch manually from host A to B, all is well, since >before host A exports the pool, any pending writes will be completed >(so >even though we lied to vsphere, it's okay.) On the other hand, if host >A >crashes/hangs and host B takes over, forcibly importing the pool, you >could end up with the following scenario: vsphere issues writes for >blocks >A, B, C, D and E. A and B have been written. C and D were sent to >host >A, and ACKed, so vsphere thinks all is well. Host A has not yet >committed >blocks C and D to disk. Host B imports the pool, assumes the virtual >IP >for the NFS share and vsphere reconnects to the datastore. Since it >thinks it has written blocks A-D, it then issues a write for block E. >Host B commits that to disk. vsphere thinks blocks A-E were written to >disk, when in fact, blocks C and D were not. Silent data corruption, >and >as far as I can tell, no way to know this happened, so if I ever did >have >a forced failover, I would have to rollback every single VM to the last >known, good snapshot. Anyway, I decided to see what would happen >write-wise with an SLOG SSD. I took a samsung 840PRO used for l2arc >and >made that a log device. I ran crystaldiskmark before and after. Prior >to >the SLOG, I was getting about 90MB/sec (gigabit enet), which is pretty >good. Afterward, it went down to 8MB/sec! 
I pulled the SSD and >plugged >it into my windows 7 workstation, formatted it and deleted the >partition, >which should have TRIM'ed it. I reinserted it as SLOG and re-ran the >test. 50MB/sec. Still not great, but this is after all an MLC device, >not SLC, and that's probably 'good enough'. Looking at open-zfs.org, >it >looks like out of illumos, freebsd and ZoL, only freebsd has TRIM now. >I >don't want to have to re-TRIM the thing every few weeks (or however >long >it takes). Does over-provisioning help? > >_______________________________________________ >OmniOS-discuss mailing list >OmniOS-discuss at lists.omniti.com >http://lists.omniti.com/mailman/listinfo/omnios-discuss My couple of cents: 1) the l2arc and zil usecases are somewhat special since they write data as a ring buffer. Your logical lba's with neighboring addresses are unusually (for ssd) likely to land into same hardware pages during initial creation and during rewrites. So there would be relatively little fragmentation (and little if any cow data relocation by firmware to free up pages for reprogramming). This way overprovisioning can help since there are available pages, and those no longer actively used would now be reserved by the firmware for logical sectors known to be zero or unused, and it should be quick about erasing them. On another hand, the l2arc is likely to use all of whatever range of storage you give it, and use it actively. Unlike an overestimated zil, there is nothing you could trim/unmap in advance from zfs. Though it might maybe help to mark the data ranges with trim before overwriting them, just so the ssd knows it can and should recycle the pages involved. Writes to both are sequential (and maybe in large portions), while reads of l2arc are randomly sized and located and reads of zil are sequential and rare enough to not consider as a performance factor =) 2) did/can you rerun your tests with a manually overprovisioned ssd (with an empty space reserved by partitioning / slicing) and see if the results vary? Probably, the question is more about a change in tendencies rather than absolute numbers, if the latter degrade even after a hardware trim. 3) for failovers like those, did you consider a mirror over iscsi devices (possibly zvols, or even raw disks to avoid some zfs-in-zfs lockups recently discussed) exported from both boxes? This way writes into the application pool that stores your vm's would land onto both boxes, distributed onto the neighbor by the current head node and probably at a hit to latency - though maybe using dedicated direct networking for lower impact to performance. Failover would however rely on a really up-to-date version of the pool, possibly including a mirrored zil with pieces from both boxes. I wonder if you might (or should performance-wise) share and re-attach upon failover the l2arc's like that as well? I think this was discussed while Saso was developing and publishing his solution, and maybe discarded for some reason, so search the zfs-discuss (probably) archives of 1-2 years back for more hints. Or perhaps he has some new insights and opinions developed during this time ;) HTH, //Jim Klimov -- Typos courtesy of K-9 Mail on my Samsung Android From skiselkov.ml at gmail.com Wed May 28 09:36:23 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Wed, 28 May 2014 11:36:23 +0200 Subject: [OmniOS-discuss] Status of TRIM support? 
In-Reply-To: References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com> Message-ID: <5385AE17.6060007@gmail.com> On 5/28/14, 3:11 AM, Dan Swartzendruber wrote: > > So I've been running with sync=disabled on my vsphere NFS datastore. I've > been willing to do so because I have a big-ass UPS, and do hourly backups. > But, I'm thinking of going to an active/passive connection to my JBOD, > using Saso's blog post on zfs zfs-create.blogspot.com. Here's why I think > I can't keep using sync=disabled (I would love to have my logic sanity > checked.) If you switch manually from host A to B, all is well, since > before host A exports the pool, any pending writes will be completed (so > even though we lied to vsphere, it's okay.) On the other hand, if host A > crashes/hangs and host B takes over, forcibly importing the pool, you > could end up with the following scenario: vsphere issues writes for blocks > A, B, C, D and E. A and B have been written. C and D were sent to host > A, and ACKed, so vsphere thinks all is well. Host A has not yet committed > blocks C and D to disk. Host B imports the pool, assumes the virtual IP > for the NFS share and vsphere reconnects to the datastore. Since it > thinks it has written blocks A-D, it then issues a write for block E. > Host B commits that to disk. vsphere thinks blocks A-E were written to > disk, when in fact, blocks C and D were not. Silent data corruption, and > as far as I can tell, no way to know this happened, so if I ever did have > a forced failover, I would have to rollback every single VM to the last > known, good snapshot. Anyway, I decided to see what would happen > write-wise with an SLOG SSD. I took a samsung 840PRO used for l2arc and > made that a log device. I ran crystaldiskmark before and after. Prior to > the SLOG, I was getting about 90MB/sec (gigabit enet), which is pretty > good. Afterward, it went down to 8MB/sec! I pulled the SSD and plugged > it into my windows 7 workstation, formatted it and deleted the partition, > which should have TRIM'ed it. I reinserted it as SLOG and re-ran the > test. 50MB/sec. Still not great, but this is after all an MLC device, > not SLC, and that's probably 'good enough'. Looking at open-zfs.org, it > looks like out of illumos, freebsd and ZoL, only freebsd has TRIM now. I > don't want to have to re-TRIM the thing every few weeks (or however long > it takes). Does over-provisioning help? Hi Dan, First off, the Samsung 840 Pro apparently doesn't have power loss protection, so DON'T use it for slog (ZIL). Use some enterprise-class SSD that has proper protection of its DRAM contents. Even better, if you have the cash to spend, get a ZeusRAM - these are true NVRAM devices with extremely low latency. If you use an SSD for slog, do a secure erase on it and then partition it so that you leave something like 1/3 of it unused and untouched by the OS. Evidence suggests that that might dramatically improve write IOPS consistency: http://www.anandtech.com/show/6489/playing-with-op Cheers, -- Saso From dswartz at druber.com Wed May 28 13:51:36 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Wed, 28 May 2014 09:51:36 -0400 Subject: [OmniOS-discuss] Status of TRIM support? Message-ID: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> (merging comments to Saso and Jim) I don't think I mentioned my environment - if not, my apologies. 
This is a SOHO/Lab setup, so things like zeusram are non-starters. The basic network infrastructure is gigabit, so iSCSI ZIL would suck badly, I suspect. As far as over-provisioning the 840PRO, I have it sliced for 16GB. Once it's been running for awhile, I will re-run the disk benchmark. I understand the 840PRO doesn't have a supercap - this was basically just a performance analysis to see how it stacks up compared to sync=disabled and on-pool ZIL. If I go this route, I will need to look for a decent/affordable unit with supercap. One other test I can try is with a 15K 76GB SAS 2.5-inch drive I salvaged from a dead server. It should have about 1/2 the latency of a 7200rpm sata drive, and if so would get me up to about 40MB/sec, which is still not good, but better than on-pool ZIL. I'll find out later. I have googled a fair amount and there seems to be 'work in progress' for TRIM support for ZoL and illumos, but no real indication I could find as to when either might support it. From chip at innovates.com Wed May 28 14:08:27 2014 From: chip at innovates.com (Schweiss, Chip) Date: Wed, 28 May 2014 09:08:27 -0500 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> Message-ID: The 840 Pro doesn't have a super cap, but it does properly honor cache flushes which ZFS will do on a log device. This drastically reduces it's write performance and makes it a poor choice for a log device. Intel has several SATA SSDs with proper super-cap protected caches that make good log devices. They are nothing like a ZeusRAM, but will protect your transactions. Keep in mind vSphere is a 100% sync write NFS client. If the log device doesn't perform well, neither will vSphere. If you trust your backups I would stick to ZIL disabled. If you want to do HA, you need to look at an all SAS solution, which is not cheap. -Chip On Wed, May 28, 2014 at 8:51 AM, Dan Swartzendruber wrote: > (merging comments to Saso and Jim) > > I don't think I mentioned my environment - if not, my apologies. This is > a SOHO/Lab setup, so things like zeusram are non-starters. The basic > network infrastructure is gigabit, so iSCSI ZIL would suck badly, I > suspect. As far as over-provisioning the 840PRO, I have it sliced for > 16GB. Once it's been running for awhile, I will re-run the disk > benchmark. I understand the 840PRO doesn't have a supercap - this was > basically just a performance analysis to see how it stacks up compared to > sync=disabled and on-pool ZIL. If I go this route, I will need to look > for a decent/affordable unit with supercap. One other test I can try is > with a 15K 76GB SAS 2.5-inch drive I salvaged from a dead server. It > should have about 1/2 the latency of a 7200rpm sata drive, and if so would > get me up to about 40MB/sec, which is still not good, but better than > on-pool ZIL. I'll find out later. I have googled a fair amount and there > seems to be 'work in progress' for TRIM support for ZoL and illumos, but > no real indication I could find as to when either might support it. > > > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dswartz at druber.com Wed May 28 14:16:54 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Wed, 28 May 2014 10:16:54 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> Message-ID: <1c6e3a58602f1d5f78d258ec93383975.squirrel@webmail.druber.com> > The 840 Pro doesn't have a super cap, but it does properly honor cache > flushes which ZFS will do on a log device. This drastically reduces it's > write performance and makes it a poor choice for a log device. > > Intel has several SATA SSDs with proper super-cap protected caches that > make good log devices. They are nothing like a ZeusRAM, but will protect > your transactions. > > Keep in mind vSphere is a 100% sync write NFS client. If the log device > doesn't perform well, neither will vSphere. Correct. Hence the sucky numbers with sync=standard and on-pool ZIL :) > If you trust your backups I would stick to ZIL disabled. If you want to > do HA, you need to look at an all SAS solution, which is not cheap. This is what I am not 100% clear on. My JBOD has dual inputs, but it isn't (yet) clear to me if the two input connectors talk to the separate SAS ports on the SAS drives, or both go to the primary port, and rely on the hosts co-operating (I've googled a lot for this kind of info, and come up dry). I need to move one of the server motherboards to a new case, at which point I will plug that 2nd HBA into the 2nd SAS IN port on the JBOD - if they see different WWNs, they are talking to the separate ports on the SAS drives, else they are both talking to the primary port, in which case it would seem I could use a SATA SDD for ZIL, no? From skiselkov.ml at gmail.com Wed May 28 14:24:24 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Wed, 28 May 2014 16:24:24 +0200 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> Message-ID: <5385F198.3070602@gmail.com> On 5/28/14, 3:51 PM, Dan Swartzendruber wrote: > (merging comments to Saso and Jim) > > I don't think I mentioned my environment - if not, my apologies. This is > a SOHO/Lab setup, so things like zeusram are non-starters. The basic > network infrastructure is gigabit, so iSCSI ZIL would suck badly, I > suspect. As far as over-provisioning the 840PRO, I have it sliced for > 16GB. Once it's been running for awhile, I will re-run the disk > benchmark. I understand the 840PRO doesn't have a supercap - this was > basically just a performance analysis to see how it stacks up compared to > sync=disabled and on-pool ZIL. If I go this route, I will need to look > for a decent/affordable unit with supercap. One other test I can try is > with a 15K 76GB SAS 2.5-inch drive I salvaged from a dead server. It > should have about 1/2 the latency of a 7200rpm sata drive, and if so would > get me up to about 40MB/sec, which is still not good, but better than > on-pool ZIL. I'll find out later. I have googled a fair amount and there > seems to be 'work in progress' for TRIM support for ZoL and illumos, but > no real indication I could find as to when either might support it. If you want IOPS performance out of HDDs, short-stroke the hell out of them. The 76 GB drive should be reduced to something like 2-4 GB so that only the outermost tracks are used. 
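In practice, both the short-stroking idea and the earlier advice to leave part of an SSD untouched come down to the same step on illumos: create a small slice at the front of the device with format(1M) and hand only that slice to ZFS. A rough sketch, with placeholder device names (the slice itself has to be created interactively in format first):

    # Assumes s0 was created in format(1M) as a small slice: a few GB for a
    # short-stroked HDD, or roughly 2/3 of an SSD so the firmware keeps spare area.
    zpool add tank log c4t1d0s0

    # Watch how the log vdev behaves under load:
    zpool iostat -v tank 5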
But a good power-protected SSD can nowadays be had for relatively little and will absolutely crush the short-stroked HDD in throughput. Cheers, -- Saso From skiselkov.ml at gmail.com Wed May 28 14:34:44 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Wed, 28 May 2014 16:34:44 +0200 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> Message-ID: <5385F404.5090005@gmail.com> On 5/28/14, 4:08 PM, Schweiss, Chip wrote: > Intel has several SATA SSDs with proper super-cap protected caches that > make good log devices. I'd recommend looking at a Intel DC S3700. The 200 GB or 400 GB varieties promise ~30000 4k random write IOPS and actually seem to deliver: http://www.anandtech.com/show/7065/intel-ssd-dc-s3500-review-480gb-part-1/3 They're also not so expensive that it'll break your bank: http://www.amazon.com/S3700-Internal-Solid-State-Drive/dp/B00A8NWD68 > If you want to do HA, you need to look at an all SAS solution, which > is not cheap. I've had good results with SATA SSDs sitting behind LSI SAS interposers in dual-path SAS JBODs. YMMV though. Cheers, -- Saso From doug at will.to Wed May 28 14:46:42 2014 From: doug at will.to (Doug Hughes) Date: Wed, 28 May 2014 10:46:42 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <5385F404.5090005@gmail.com> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <5385F404.5090005@gmail.com> Message-ID: <5385F6D2.6080401@will.to> On 5/28/2014 10:34 AM, Saso Kiselkov wrote: > On 5/28/14, 4:08 PM, Schweiss, Chip wrote: >> Intel has several SATA SSDs with proper super-cap protected caches that >> make good log devices. > > I'd recommend looking at a Intel DC S3700. The 200 GB or 400 GB > varieties promise ~30000 4k random write IOPS and actually seem to deliver: > http://www.anandtech.com/show/7065/intel-ssd-dc-s3500-review-480gb-part-1/3 > They're also not so expensive that it'll break your bank: > http://www.amazon.com/S3700-Internal-Solid-State-Drive/dp/B00A8NWD68 > Second this. The DC S3700 are very good. But, I tend to use the Intel 320 which are often available on amazon for just over $1/GB up to 600GB. They don't have as good of specs as the DC3700 (which are newer), but they do have the capacitor bank to flush on power outage, and they are very value priced for SOHO. They do quite well as ZIL devices. We use them extensively. From jimklimov at cos.ru Wed May 28 15:08:59 2014 From: jimklimov at cos.ru (Jim Klimov) Date: Wed, 28 May 2014 17:08:59 +0200 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> Message-ID: <33ab4d29-f703-488f-902b-3a96168e1277@email.android.com> 28 ??? 2014??. 15:51:36 CEST, Dan Swartzendruber ?????: >(merging comments to Saso and Jim) > >I don't think I mentioned my environment - if not, my apologies. This >is >a SOHO/Lab setup, so things like zeusram are non-starters. The basic >network infrastructure is gigabit, so iSCSI ZIL would suck badly, I >suspect. As far as over-provisioning the 840PRO, I have it sliced for >16GB. Once it's been running for awhile, I will re-run the disk >benchmark. I understand the 840PRO doesn't have a supercap - this was >basically just a performance analysis to see how it stacks up compared >to >sync=disabled and on-pool ZIL. If I go this route, I will need to look >for a decent/affordable unit with supercap. 
One other test I can try >is >with a 15K 76GB SAS 2.5-inch drive I salvaged from a dead server. It >should have about 1/2 the latency of a 7200rpm sata drive, and if so >would >get me up to about 40MB/sec, which is still not good, but better than >on-pool ZIL. I'll find out later. I have googled a fair amount and >there >seems to be 'work in progress' for TRIM support for ZoL and illumos, >but >no real indication I could find as to when either might support it. > > > >_______________________________________________ >OmniOS-discuss mailing list >OmniOS-discuss at lists.omniti.com >http://lists.omniti.com/mailman/listinfo/omnios-discuss Actually, i think that if you have the hdd dedicated for zil, then you only write sequentially to it, so the head hovers where it should be. Track-to-track seek time can be discarded and the worst latency is a single rotation. With much enough data (liky sync=always) you have a 15krpm streaming write... some 200MBps? //Jim -- Typos courtesy of K-9 Mail on my Samsung Android From dswartz at druber.com Wed May 28 15:17:49 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Wed, 28 May 2014 11:17:49 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <33ab4d29-f703-488f-902b-3a96168e1277@email.android.com> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <33ab4d29-f703-488f-902b-3a96168e1277@email.android.com> Message-ID: > > Actually, i think that if you have the hdd dedicated for zil, then you > only write sequentially to it, so the head hovers where it should be. > Track-to-track seek time can be discarded and the worst latency is a > single rotation. With much enough data (liky sync=always) you have a > 15krpm streaming write... some 200MBps? This is what I want to test out later tonight. I will post my results... From chip at innovates.com Wed May 28 16:00:09 2014 From: chip at innovates.com (Schweiss, Chip) Date: Wed, 28 May 2014 11:00:09 -0500 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <5385F6D2.6080401@will.to> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <5385F404.5090005@gmail.com> <5385F6D2.6080401@will.to> Message-ID: On Wed, May 28, 2014 at 9:46 AM, Doug Hughes wrote: > > Second this. The DC S3700 are very good. > > But, I tend to use the Intel 320 which are often available on amazon for > just over $1/GB up to 600GB. They don't have as good of specs as the DC3700 > (which are newer), but they do have the capacitor bank to flush on power > outage, and they are very value priced for SOHO. They do quite well as ZIL > devices. We use them extensively. > > > Let me add that if going for a SATA via interposer solution, I have not been able to get the DC S3700 to work behind an interposer. I'd be curious to hear from anyone that has about what specific model and firmware is being used. I have about 100 LSI SAS interposer in production with Samsung 840 Pro SSDs behind them. Some are L2ARC others are in a pure SSD scratch pool. I have had 3 interposer failures. Each time has caused the pool to be hung. Twice as L2ARC, once as a data disk. -Chip -------------- next part -------------- An HTML attachment was scrubbed... URL: From nsmith at careyweb.com Wed May 28 17:44:13 2014 From: nsmith at careyweb.com (Nate Smith) Date: Wed, 28 May 2014 13:44:13 -0400 Subject: [OmniOS-discuss] Status of TRIM support? 
In-Reply-To: References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <5385F404.5090005@gmail.com> <5385F6D2.6080401@will.to> Message-ID: <31db4e1d-4086-473d-8e55-b4ed3874a16e@careyweb.com> Third on S3700. They're the best mix of price/reliability. Has anyone used the Seagate 600 Pro series? ST240FP0021? -Nate From lists at marzocchi.net Wed May 28 18:11:12 2014 From: lists at marzocchi.net (Olaf Marzocchi) Date: Wed, 28 May 2014 20:11:12 +0200 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <31db4e1d-4086-473d-8e55-b4ed3874a16e@careyweb.com> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <5385F404.5090005@gmail.com> <5385F6D2.6080401@will.to> <31db4e1d-4086-473d-8e55-b4ed3874a16e@careyweb.com> Message-ID: I never tried them, but I know that the M500 and the M550 also have the capacitor. Less write cycles, but probably enough for SOHO. Anandtech tested them both. Olaf Il giorno 28/mag/2014, alle ore 19:44, Nate Smith ha scritto: > > Third on S3700. They're the best mix of price/reliability. > > Has anyone used the Seagate 600 Pro series? ST240FP0021? > > -Nate > > > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss From dswartz at druber.com Wed May 28 18:55:57 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Wed, 28 May 2014 14:55:57 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com> Message-ID: > It looks to me like Sa?o's design is active/standby failover. Zpool > import on the standby should obtain a clean transaction group as long > as the originally active system is still not using the pool. The > result would be similar to the power fail situation. As long as the right fencing is done in the case where the active node goes south, agreed. In my case, I have 3 server, two running vsphere and one running illumos. All active guests run on server V1, with V2 as a HA backup for V1. Since V2 is doing little else, it also hosts a virtualized illumos appliance, which currently has two 1TB disks for a hourly zfs send replication job. I intend to put an HBA in V2 and pass it through to the storage appliance and go from there. The only fly in the ointment is that while V1 can be readily fenced using the on-board IPMI, I have no easy way to fence the virtualized appliance. I seem to recall seeing a vmware fencing agent, but it may not be reliable enough for me (e.g. what if the reason the virtualized appliance is not working properly is because the host is wigging out?) It struck me that since nothing else normally runs on V2, I can fence the virtualized appliance by fencing the host it runs on using V2's onboard IPMI. If a hard failover needs to be done, the standby appliance will need to import the pool with '-f', which is scary if your fencing is not extremely reliable... From chip at innovates.com Wed May 28 19:18:24 2014 From: chip at innovates.com (Schweiss, Chip) Date: Wed, 28 May 2014 14:18:24 -0500 Subject: [OmniOS-discuss] Status of TRIM support? 
In-Reply-To: References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com> Message-ID: On Wed, May 28, 2014 at 1:55 PM, Dan Swartzendruber wrote: > > It looks to me like Sa?o's design is active/standby failover. Zpool > > import on the standby should obtain a clean transaction group as long > > as the originally active system is still not using the pool. The > > result would be similar to the power fail situation. > > As long as the right fencing is done in the case where the active node > goes south, agreed. In my case, I have 3 server, two running vsphere and > one running illumos. All active guests run on server V1, with V2 as a HA > backup for V1. Since V2 is doing little else, it also hosts a virtualized > illumos appliance, which currently has two 1TB disks for a hourly zfs send > replication job. I intend to put an HBA in V2 and pass it through to the > storage appliance and go from there. The only fly in the ointment is that > while V1 can be readily fenced using the on-board IPMI, I have no easy way > to fence the virtualized appliance. I seem to recall seeing a vmware > fencing agent, but it may not be reliable enough for me (e.g. what if the > reason the virtualized appliance is not working properly is because the > host is wigging out?) It struck me that since nothing else normally runs > on V2, I can fence the virtualized appliance by fencing the host it runs > on using V2's onboard IPMI. If a hard failover needs to be done, the > standby appliance will need to import the pool with '-f', which is scary > if your fencing is not extremely reliable... > Assuming you have real SAS devices in the pool, not SATA with interposers, you can use SCSI reservations. This can block the other host from accessing a pool you are about to take over. sg3_utils has utilities for managing SCSI reservations. -Chip -------------- next part -------------- An HTML attachment was scrubbed... URL: From dswartz at druber.com Wed May 28 19:22:13 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Wed, 28 May 2014 15:22:13 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com> Message-ID: > > Assuming you have real SAS devices in the pool, not SATA with interposers, > you can use SCSI reservations. This can block the other host from > accessing a pool you are about to take over. > > sg3_utils has utilities for managing SCSI reservations. The data pool is in fact all SAS. 8 1TB nearline drives. Thanks, I will check out the above... From ian at ianshome.com Wed May 28 22:40:14 2014 From: ian at ianshome.com (Ian Collins) Date: Thu, 29 May 2014 10:40:14 +1200 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <5385AE17.6060007@gmail.com> References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com> <5385AE17.6060007@gmail.com> Message-ID: <538665CE.90603@ianshome.com> Saso Kiselkov wrote: > Hi Dan, > > First off, the Samsung 840 Pro apparently doesn't have power loss > protection, so DON'T use it for slog (ZIL). Use some enterprise-class > SSD that has proper protection of its DRAM contents. 
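For the SCSI reservation approach Chip mentions above, sg3_utils handles persistent reservations through sg_persist. The commands below only illustrate the mechanism (register a key, take a reservation, read it back); the device path and key are placeholders, and a real failover agent would wrap this in proper error handling.

    DEV=/dev/rdsk/c0t5000C500ABCDEF12d0s2                    # placeholder device path

    sg_persist --out --register --param-sark=0xA1 $DEV       # register our key
    sg_persist --out --reserve --param-rk=0xA1 --prout-type=5 $DEV   # take the reservation
    sg_persist --in --read-keys $DEV                          # list registered keys
    sg_persist --in --read-reservation $DEV                   # show the current holder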
Even better, if you > have the cash to spend, get a ZeusRAM - these are true NVRAM devices > with extremely low latency. > > If you use an SSD for slog, do a secure erase on it and then partition > it so that you leave something like 1/3 of it unused and untouched by > the OS. Evidence suggests that that might dramatically improve write > IOPS consistency: > http://www.anandtech.com/show/6489/playing-with-op Reading there I'd say the conclusion is "leave something like 1/3 of it unused and untouched unless you are using an S3700" :) I'm glad I am... -- Ian. From ian at ianshome.com Wed May 28 22:51:18 2014 From: ian at ianshome.com (Ian Collins) Date: Thu, 29 May 2014 10:51:18 +1200 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <31db4e1d-4086-473d-8e55-b4ed3874a16e@careyweb.com> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <5385F404.5090005@gmail.com> <5385F6D2.6080401@will.to> <31db4e1d-4086-473d-8e55-b4ed3874a16e@careyweb.com> Message-ID: <53866866.8040702@ianshome.com> Nate Smith wrote: > Third on S3700. They're the best mix of price/reliability. > > Has anyone used the Seagate 600 Pro series? ST240FP0021? Only the non-pro version. I ran a quick comparison with S3700s and ended up using them as cache devices. -- Ian. From nsmith at careyweb.com Wed May 28 23:06:51 2014 From: nsmith at careyweb.com (Nate Smith) Date: Wed, 28 May 2014 19:06:51 -0400 Subject: [OmniOS-discuss] =?iso-8859-1?q?Status_of_TRIM_support=3F?= In-Reply-To: <53866866.8040702@ianshome.com> Message-ID: <20140528230651.f9d63cc2@mail.careyweb.com> The nice thing about the pros is they do have power loss protection. _____ From: Ian Collins [mailto:ian at ianshome.com] To: Nate Smith [mailto:nsmith at careyweb.com] Cc: 'omnios-discuss' [mailto:omnios-discuss at lists.omniti.com] Sent: Wed, 28 May 2014 18:51:18 -0500 Subject: Re: [OmniOS-discuss] Status of TRIM support? Nate Smith wrote: > Third on S3700. They're the best mix of price/reliability. > > Has anyone used the Seagate 600 Pro series? ST240FP0021? Only the non-pro version. I ran a quick comparison with S3700s and ended up using them as cache devices. -- Ian. -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.elling at richardelling.com Wed May 28 23:14:41 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Wed, 28 May 2014 16:14:41 -0700 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> Message-ID: <625E5991-DA44-4E24-90D4-972A69980EF9@RichardElling.com> On May 28, 2014, at 7:08 AM, Schweiss, Chip wrote: > The 840 Pro doesn't have a super cap, but it does properly honor cache flushes which ZFS will do on a log device. This drastically reduces it's write performance and makes it a poor choice for a log device. This is a common issue for Flash SSDs, one that shouldn't be underestimated. I've measured cache flushes in the dozens of ms range on some "value" SSDs. For the enterprise-grade Flash SSD market, supercaps and better algorithms are usually well worth the expense. For the better models, you can disable cache flush in sd.conf allowing you to have fast "barriers" for those devices that support them without risking data on the devices that do need proper cache flushing. This is the preferred method over disabling ZFS cache flush altogether. 
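The per-device tuning Richard describes is done with an sd-config-list entry in /etc/driver/drv/sd.conf. A sketch of what such an entry can look like follows; the vendor/product string must match the drive's inquiry data exactly (the vendor field is 8 characters, space padded), and the ID used here is made up, so take the real one from iostat -En before copying anything.

    # Illustrative sd.conf entry only; the VID/PID string is a placeholder and must
    # be taken from 'iostat -En' output for the actual device.
    sd-config-list =
        "ATA     INTEL SSDSC2BA10", "cache-nonvolatile:true";

    # Re-read the config (or reboot) and then verify sync-write latency improved:
    #   update_drv -vf sd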
-- richard -- Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewb at icc-usa.com Thu May 29 00:06:31 2014 From: andrewb at icc-usa.com (Andrew Brant) Date: Thu, 29 May 2014 00:06:31 +0000 Subject: [OmniOS-discuss] Intel X540-AT2 WARNING: ixgbe0&1 : Failed to initialize adapter Message-ID: <78e5ccda9b8a4f009ed3ad9e35e55f54@ICC-EXCHANGE.icc.local> Trying to sort this out on a new build running the latest OmniOS release, the adapter is on the Illumos HCL and works like a charm when the system is booted into the live CentOS environment. Tried the X540 based Supermicro add-on card for comparison and that initialized just fine. Any suggestions? Cheers, Andrew http://illumos.org/hcl/ Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 Driver: ixgbe PCI ID: 8086,1528 Originating manifest: driver-network-ixgbe.mf # cat /etc/*release OmniOS v11 r151011 May 28 12:00:06 omnios ixgbe: [ID 611667 kern.warning] WARNING: ixgbe0: Failed to initialize adapter May 28 12:00:14 omnios ixgbe: [ID 611667 kern.warning] WARNING: ixgbe1: Failed to initialize adapter # lspci -d 8086:1528 03:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01) 03:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01) # grep ixgbe /etc/path_to_inst "/pci at 0,0/pci8086,e04 at 2/pci152d,899f at 0" 0 "ixgbe" ## X540-AT2 Controller with the "Failed to initialize adapter" message ## "/pci at 0,0/pci8086,e04 at 2/pci152d,899f at 0,1" 1 "ixgbe" ## X540-AT2 Controller with the "Failed to initialize adapter" message ## "/pci at 76,0/pci8086,e0a at 3,2/pci15d9,734 at 0" 2 "ixgbe" ## X540 based Supermicro AOC - works like a charm ## "/pci at 76,0/pci8086,e0a at 3,2/pci15d9,734 at 0,1" 3 "ixgbe" ## X540 based Supermicro AOC - works like a charm ## # fmadm faulty --------------- ------------------------------------ -------------- --------- TIME EVENT-ID MSG-ID SEVERITY --------------- ------------------------------------ -------------- --------- May 28 11:59:34 0280ba01-15bc-e4cd-b78a-e8f3de8631ff PCIEX-8000-0A Critical Host : omnios Platform : S810-X52LR Chassis_id : To-be-filled-by-O.E.M. Product_sn : Fault class : fault.io.pciex.device-interr Affects : dev:////pci at 0,0/pci8086,e04 at 2/pci152d,899f at 0 faulted and taken out of service FRU : "MB" (hc://:product-id=S810-X52LR:server-id=omnios:chassis-id=To-be-filled-by-O.E.M./motherboard=0) faulty -------------- next part -------------- An HTML attachment was scrubbed... URL: From danmcd at omniti.com Thu May 29 00:42:39 2014 From: danmcd at omniti.com (Dan McDonald) Date: Wed, 28 May 2014 20:42:39 -0400 Subject: [OmniOS-discuss] Intel X540-AT2 WARNING: ixgbe0&1 : Failed to initialize adapter In-Reply-To: <78e5ccda9b8a4f009ed3ad9e35e55f54@ICC-EXCHANGE.icc.local> References: <78e5ccda9b8a4f009ed3ad9e35e55f54@ICC-EXCHANGE.icc.local> Message-ID: <19BD2035-12A5-4AF3-8291-D7222E544816@omniti.com> On May 28, 2014, at 8:06 PM, Andrew Brant wrote: > Trying to sort this out on a new build running the latest OmniOS release, the adapter is on the Illumos HCL and works like a charm when the system is booted into the live CentOS environment. > > Tried the X540 based Supermicro add-on card for comparison and that initialized just fine. > > Any suggestions? 
The error message you have corresponds to this code: /* * Initialize driver parameters */ if (ixgbe_init_driver_settings(ixgbe) != IXGBE_SUCCESS) { ixgbe_error(ixgbe, "Failed to initialize driver settings"); goto attach_fail; } If you can, could you please run the attached dtrace script as follows: ./downstack.d ixgbe_init_driver_settings and then run "ifconfig ixgbe0 plumb" (assuming ixgbe0 is one of the bad ones) to narrow it down to where it fails? The output may be quite large. One thing else: > # cat /etc/*release > OmniOS v11 r151011 This is the "bloody" release, not the latest supported one, FYI. Regardless, the X540 board you mention SHOULD work. > > May 28 12:00:06 omnios ixgbe: [ID 611667 kern.warning] WARNING: ixgbe0: Failed to initialize adapter > May 28 12:00:14 omnios ixgbe: [ID 611667 kern.warning] WARNING: ixgbe1: Failed to initialize adapter Is there any other ixgbe output around here? Uttering "dmesg | grep ixgbe" and seeing if there's anything else complain-y would be nice. It'll be even larger than the DTrace output, but can you put the output of "prtconf -v" from your system somewhere, even in a mail here as an attachment? Thanks, Dan -------------- next part -------------- A non-text attachment was scrubbed... Name: downstack.d Type: application/octet-stream Size: 220 bytes Desc: not available URL: From danmcd at omniti.com Thu May 29 00:47:22 2014 From: danmcd at omniti.com (Dan McDonald) Date: Wed, 28 May 2014 20:47:22 -0400 Subject: [OmniOS-discuss] Intel X540-AT2 WARNING: ixgbe0&1 : Failed to initialize adapter In-Reply-To: <19BD2035-12A5-4AF3-8291-D7222E544816@omniti.com> References: <78e5ccda9b8a4f009ed3ad9e35e55f54@ICC-EXCHANGE.icc.local> <19BD2035-12A5-4AF3-8291-D7222E544816@omniti.com> Message-ID: <3D612930-1AC8-4992-9B68-04FE0B8EE9D0@omniti.com> AAAAH. My fault. Wrong code. > Failed to initialize adapter That's what you said. THIS code: /* * Initialize chipset hardware */ if (ixgbe_init(ixgbe) != IXGBE_SUCCESS) { ixgbe_error(ixgbe, "Failed to initialize adapter"); goto attach_fail; } And do use the DTrace script I sent, but use "ixgbe_init" instead of "ixgbe_init_driver_settings" as the argument. Sorry about that, Dan From jimklimov at cos.ru Thu May 29 09:31:58 2014 From: jimklimov at cos.ru (Jim Klimov) Date: Thu, 29 May 2014 11:31:58 +0200 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <53866866.8040702@ianshome.com> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <5385F404.5090005@gmail.com> <5385F6D2.6080401@will.to> <31db4e1d-4086-473d-8e55-b4ed3874a16e@careyweb.com> <53866866.8040702@ianshome.com> Message-ID: <4e3d53eb-2dc7-4e1c-a6c2-1fce94e4e435@email.android.com> 29 ??? 2014??. 0:51:18 CEST, Ian Collins ?????: >Nate Smith wrote: >> Third on S3700. They're the best mix of price/reliability. >> >> Has anyone used the Seagate 600 Pro series? ST240FP0021? > >Only the non-pro version. I ran a quick comparison with S3700s and >ended up using them as cache devices. I have a couple of ST120FN0021 used as a mirror of rpool/zil/l2arc in my brother's HP N54L rig for the past half a year. They are partitioned to use 100gb since a sibling (but unsold there) model with allegedly same internals and more overprovisioning has much better performance and reliability ratings from the vendor. Performance is still good (scrub of the 23gb full rpool completes in under 2min, reaching well over 300MB/s in peaks and averaging around 200-220 most of the time and dropping to 80-100 sometimes). 
Picked for low price, good promised specs, including powerloss protection. An DC S3700 seemed better but thrice the price, and a 3500 seemed subpar in reliability ratings. Jim -- Typos courtesy of K-9 Mail on my Samsung Android From dswartz at druber.com Thu May 29 15:19:39 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Thu, 29 May 2014 11:19:39 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <5385F6D2.6080401@will.to> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <5385F404.5090005@gmail.com> <5385F6D2.6080401@will.to> Message-ID: <805ae7df31bf478133f6d58bf9b72c0e.squirrel@webmail.druber.com> > On 5/28/2014 10:34 AM, Saso Kiselkov wrote: >> On 5/28/14, 4:08 PM, Schweiss, Chip wrote: >>> Intel has several SATA SSDs with proper super-cap protected caches that >>> make good log devices. >> >> I'd recommend looking at a Intel DC S3700. The 200 GB or 400 GB >> varieties promise ~30000 4k random write IOPS and actually seem to >> deliver: >> http://www.anandtech.com/show/7065/intel-ssd-dc-s3500-review-480gb-part-1/3 >> They're also not so expensive that it'll break your bank: >> http://www.amazon.com/S3700-Internal-Solid-State-Drive/dp/B00A8NWD68 >> > > Second this. The DC S3700 are very good. Hmmm, looking on amazon, I see the 100GB s3700 is $235 or so. I has lower random perf than the 200/400GB units, but still claiming 19K IOPS. Given this is for an SLOG, 200/400GB is a waste of space - the only reason I can see to do that would be the higher random IOPS. Is that likely to matter here? From doug at will.to Thu May 29 15:25:36 2014 From: doug at will.to (Doug Hughes) Date: Thu, 29 May 2014 11:25:36 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <805ae7df31bf478133f6d58bf9b72c0e.squirrel@webmail.druber.com> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <5385F404.5090005@gmail.com> <5385F6D2.6080401@will.to> <805ae7df31bf478133f6d58bf9b72c0e.squirrel@webmail.druber.com> Message-ID: <53875170.3030803@will.to> On 5/29/2014 11:19 AM, Dan Swartzendruber wrote: >> On 5/28/2014 10:34 AM, Saso Kiselkov wrote: >>> On 5/28/14, 4:08 PM, Schweiss, Chip wrote: >>>> Intel has several SATA SSDs with proper super-cap protected caches that >>>> make good log devices. >>> >>> I'd recommend looking at a Intel DC S3700. The 200 GB or 400 GB >>> varieties promise ~30000 4k random write IOPS and actually seem to >>> deliver: >>> http://www.anandtech.com/show/7065/intel-ssd-dc-s3500-review-480gb-part-1/3 >>> They're also not so expensive that it'll break your bank: >>> http://www.amazon.com/S3700-Internal-Solid-State-Drive/dp/B00A8NWD68 >>> >> >> Second this. The DC S3700 are very good. > > Hmmm, looking on amazon, I see the 100GB s3700 is $235 or so. I has lower > random perf than the 200/400GB units, but still claiming 19K IOPS. Given > this is for an SLOG, 200/400GB is a waste of space - the only reason I can > see to do that would be the higher random IOPS. Is that likely to matter > here? > The higher price is the reason I tend to prefer the 320 series that come in around $1/GB and have smaller sizes available. I use them for OS + slog. From danmcd at omniti.com Thu May 29 15:48:14 2014 From: danmcd at omniti.com (Dan McDonald) Date: Thu, 29 May 2014 11:48:14 -0400 Subject: [OmniOS-discuss] Status of TRIM support? 
In-Reply-To: <53875170.3030803@will.to> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <5385F404.5090005@gmail.com> <5385F6D2.6080401@will.to> <805ae7df31bf478133f6d58bf9b72c0e.squirrel@webmail.druber.com> <53875170.3030803@will.to> Message-ID: <96F4E722-E75F-430D-85F6-D49BF24CA0AD@omniti.com> On May 29, 2014, at 11:25 AM, Doug Hughes wrote: > > The higher price is the reason I tend to prefer the 320 series that come in around $1/GB and have smaller sizes available. I use them for OS + slog. What about the S3500? I've heard that's more the drop-in replacement for the 320 series. (ObDisclosure: I use a pair as part-rpool/part-mirrored-slog for my home server. Blog post about HDC2.0 coming RSN.) Dan From skiselkov.ml at gmail.com Thu May 29 15:58:12 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Thu, 29 May 2014 17:58:12 +0200 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <96F4E722-E75F-430D-85F6-D49BF24CA0AD@omniti.com> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <5385F404.5090005@gmail.com> <5385F6D2.6080401@will.to> <805ae7df31bf478133f6d58bf9b72c0e.squirrel@webmail.druber.com> <53875170.3030803@will.to> <96F4E722-E75F-430D-85F6-D49BF24CA0AD@omniti.com> Message-ID: <53875914.8000103@gmail.com> On 5/29/14, 5:48 PM, Dan McDonald wrote: > > On May 29, 2014, at 11:25 AM, Doug Hughes wrote: >> >> The higher price is the reason I tend to prefer the 320 series that come in around $1/GB and have smaller sizes available. I use them for OS + slog. > > What about the S3500? I've heard that's more the drop-in replacement for the 320 series. > > (ObDisclosure: I use a pair as part-rpool/part-mirrored-slog for my home server. Blog post about HDC2.0 coming RSN.) The DC S3500 is reported to have only about 1/3 - 1/2 the performance of the DC S3700, see: http://www.anandtech.com/show/7065/intel-ssd-dc-s3500-review-480gb-part-1/3 May be more than enough for SOHO, though. Cheers, -- Saso From danmcd at omniti.com Thu May 29 16:02:51 2014 From: danmcd at omniti.com (Dan McDonald) Date: Thu, 29 May 2014 12:02:51 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <53875914.8000103@gmail.com> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <5385F404.5090005@gmail.com> <5385F6D2.6080401@will.to> <805ae7df31bf478133f6d58bf9b72c0e.squirrel@webmail.druber.com> <53875170.3030803@will.to> <96F4E722-E75F-430D-85F6-D49BF24CA0AD@omniti.com> <53875914.8000103@gmail.com> Message-ID: <67EFFC31-BA8D-4A66-A5BA-DB4715069FCE@omniti.com> On May 29, 2014, at 11:58 AM, Saso Kiselkov wrote: > On 5/29/14, 5:48 PM, Dan McDonald wrote: >> >> On May 29, 2014, at 11:25 AM, Doug Hughes wrote: >>> >>> The higher price is the reason I tend to prefer the 320 series that come in around $1/GB and have smaller sizes available. I use them for OS + slog. >> >> What about the S3500? I've heard that's more the drop-in replacement for the 320 series. >> >> (ObDisclosure: I use a pair as part-rpool/part-mirrored-slog for my home server. Blog post about HDC2.0 coming RSN.) > > The DC S3500 is reported to have only about 1/3 - 1/2 the performance of > the DC S3700, see: > http://www.anandtech.com/show/7065/intel-ssd-dc-s3500-review-480gb-part-1/3 > May be more than enough for SOHO, though. Likely yes, but it was a helluva lot cheaper, and seems to be a helluva lot more reliable that the two POS drives it replaced. 
:) Dan From mir at miras.org Thu May 29 16:25:32 2014 From: mir at miras.org (Michael Rasmussen) Date: Thu, 29 May 2014 18:25:32 +0200 Subject: [OmniOS-discuss] WDC WD10EZEX problems Message-ID: <20140529182532.2212d2f4@sleipner.datanom.net> Hi all, I have a pair of WDC WD10EZEX connected through a LSI 1068E HBA which is only running at 1.5 Gb/s all though the disks is SATA 3.0 Seagate Barracuda 7200.12 which is also SATA 3.0 are running at 3.0 Gb/s. I know LSI 1068E HBA is only SATA 2.0 so 3.0 Gb/s is maximum speed but why is the WDC WD10EZEX only running at 1.5 Gb/s? Stats from smartctl: === START OF INFORMATION SECTION === Model Family: Western Digital Caviar Blue (SATA 6Gb/s) Device Model: WDC WD10EZEX-00RKKA0 Serial Number: WD-WCC1S3904840 LU WWN Device Id: 5 0014ee 2b31a7e0e Firmware Version: 80.00A80 User Capacity: 1,000,204,886,016 bytes [1.00 TB] Sector Sizes: 512 bytes logical, 4096 bytes physical Device is: In smartctl database [for details use: -P show] ATA Version is: ATA8-ACS (minor revision not indicated) SATA Version is: SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s) Local Time is: Thu May 29 18:24:35 2014 CEST SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF INFORMATION SECTION === Model Family: Seagate Barracuda 7200.12 Device Model: ST31000524AS Serial Number: 9VPG8NC3 LU WWN Device Id: 5 000c50 04d6f1cc8 Firmware Version: JC4B User Capacity: 1,000,204,886,016 bytes [1.00 TB] Sector Size: 512 bytes logical/physical Rotation Rate: 7200 rpm Device is: In smartctl database [for details use: -P show] ATA Version is: ATA8-ACS T13/1699-D revision 4 SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s) Local Time is: Thu May 29 18:21:29 2014 CEST SMART support is: Available - device has SMART capability. SMART support is: Enabled -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: May Euell Gibbons eat your only copy of the manual! -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From dswartz at druber.com Thu May 29 18:15:56 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Thu, 29 May 2014 14:15:56 -0400 Subject: [OmniOS-discuss] Status of TRIM support? In-Reply-To: <5385F6D2.6080401@will.to> References: <90ccefb3a8926725a4192626ce63eedf.squirrel@webmail.druber.com> <5385F404.5090005@gmail.com> <5385F6D2.6080401@will.to> Message-ID: > On 5/28/2014 10:34 AM, Saso Kiselkov wrote: >> On 5/28/14, 4:08 PM, Schweiss, Chip wrote: >>> Intel has several SATA SSDs with proper super-cap protected caches that >>> make good log devices. >> >> I'd recommend looking at a Intel DC S3700. The 200 GB or 400 GB >> varieties promise ~30000 4k random write IOPS and actually seem to >> deliver: >> http://www.anandtech.com/show/7065/intel-ssd-dc-s3500-review-480gb-part-1/3 >> They're also not so expensive that it'll break your bank: >> http://www.amazon.com/S3700-Internal-Solid-State-Drive/dp/B00A8NWD68 >> > > Second this. The DC S3700 are very good. Pulled the trigger on a 100GB S3700 from Amazon... From cperez at cmpcs.com Thu May 29 18:40:30 2014 From: cperez at cmpcs.com (Carlos M. 
Perez) Date: Thu, 29 May 2014 18:40:30 +0000 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <20140529182532.2212d2f4@sleipner.datanom.net> References: <20140529182532.2212d2f4@sleipner.datanom.net> Message-ID: A few suggestions on things to check: - Some of the 3GB SATA drives had a jumper that would limit operation to 1.5GB/s. - Is the controller limiting the port? Not familiar with the 1068e, but some of the LSI cards let you control the speeds on each port. - swap the ports and see if the reverse happens. If the Seagate connect at 1.5, you either have the card locking the speed, or the ports are bad. Carlos M. Perez CMP Consulting Services 305-669-1515 On 5/29/14, 12:25 PM, "Michael Rasmussen" wrote: >Hi all, > >I have a pair of WDC WD10EZEX connected through a LSI 1068E HBA which >is only running at 1.5 Gb/s all though the disks is SATA 3.0 > >Seagate Barracuda 7200.12 which is also SATA 3.0 are running at 3.0 >Gb/s. > >I know LSI 1068E HBA is only SATA 2.0 so 3.0 Gb/s is maximum speed but >why is the WDC WD10EZEX only running at 1.5 Gb/s? > >Stats from smartctl: > >=== START OF INFORMATION SECTION === >Model Family: Western Digital Caviar Blue (SATA 6Gb/s) >Device Model: WDC WD10EZEX-00RKKA0 >Serial Number: WD-WCC1S3904840 >LU WWN Device Id: 5 0014ee 2b31a7e0e >Firmware Version: 80.00A80 >User Capacity: 1,000,204,886,016 bytes [1.00 TB] >Sector Sizes: 512 bytes logical, 4096 bytes physical >Device is: In smartctl database [for details use: -P show] >ATA Version is: ATA8-ACS (minor revision not indicated) >SATA Version is: SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s) >Local Time is: Thu May 29 18:24:35 2014 CEST >SMART support is: Available - device has SMART capability. >SMART support is: Enabled > >=== START OF INFORMATION SECTION === >Model Family: Seagate Barracuda 7200.12 >Device Model: ST31000524AS >Serial Number: 9VPG8NC3 >LU WWN Device Id: 5 000c50 04d6f1cc8 >Firmware Version: JC4B >User Capacity: 1,000,204,886,016 bytes [1.00 TB] >Sector Size: 512 bytes logical/physical >Rotation Rate: 7200 rpm >Device is: In smartctl database [for details use: -P show] >ATA Version is: ATA8-ACS T13/1699-D revision 4 >SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s) >Local Time is: Thu May 29 18:21:29 2014 CEST >SMART support is: Available - device has SMART capability. >SMART support is: Enabled > >-- >Hilsen/Regards >Michael Rasmussen > >Get my public GnuPG keys: >michael rasmussen cc >http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E >mir datanom net >http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C >mir miras org >http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 >-------------------------------------------------------------- >/usr/games/fortune -es says: >May Euell Gibbons eat your only copy of the manual! From mir at miras.org Thu May 29 19:17:57 2014 From: mir at miras.org (Michael Rasmussen) Date: Thu, 29 May 2014 21:17:57 +0200 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <45ECB03F-3F2D-49DA-B4A3-A68FEB0EAB23@ferebee.net> References: <20140529182532.2212d2f4@sleipner.datanom.net> <45ECB03F-3F2D-49DA-B4A3-A68FEB0EAB23@ferebee.net> Message-ID: <20140529211757.1e72acb9@sleipner.datanom.net> On Thu, 29 May 2014 20:58:50 +0200 Chris Ferebee wrote: > > Or, BTW, the cables. They could be marginal, still tolerated @ 3 Gbps by the Seagate, but rejected by the WDs, for instance. > Since all disks are connected through the same SFF-8087 cable I think this could be ruled out? 
-- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: Patch griefs with proverbs. -- William Shakespeare, "Much Ado About Nothing" -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From mir at miras.org Thu May 29 19:20:50 2014 From: mir at miras.org (Michael Rasmussen) Date: Thu, 29 May 2014 21:20:50 +0200 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: References: <20140529182532.2212d2f4@sleipner.datanom.net> Message-ID: <20140529212050.485177d6@sleipner.datanom.net> On Thu, 29 May 2014 18:40:30 +0000 "Carlos M. Perez" wrote: > A few suggestions on things to check: > > - Some of the 3GB SATA drives had a jumper that would limit operation to > 1.5GB/s. > This is an option I will investigate. Jumper pin 5+6 limits speed on WD SATA 3.0 to SATA 2.0 which might be worth looking into. > - Is the controller limiting the port? Not familiar with the 1068e, but > some of the LSI cards let you control the speeds on each port. > > - swap the ports and see if the reverse happens. If the Seagate connect > at 1.5, you either have the card locking the speed, or the ports are bad. > Should be a long shot since the firmware is flashed to pass-through mode. -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: Patch griefs with proverbs. -- William Shakespeare, "Much Ado About Nothing" -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From dswartz at druber.com Thu May 29 19:23:07 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Thu, 29 May 2014 15:23:07 -0400 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <20140529211757.1e72acb9@sleipner.datanom.net> References: <20140529182532.2212d2f4@sleipner.datanom.net> <45ECB03F-3F2D-49DA-B4A3-A68FEB0EAB23@ferebee.net> <20140529211757.1e72acb9@sleipner.datanom.net> Message-ID: <3f9faa47a3463bf5315adca0c847bf42.squirrel@webmail.druber.com> > On Thu, 29 May 2014 20:58:50 +0200 > Chris Ferebee wrote: > >> >> Or, BTW, the cables. They could be marginal, still tolerated @ 3 Gbps by >> the Seagate, but rejected by the WDs, for instance. >> > Since all disks are connected through the same SFF-8087 cable I think > this could be ruled out? If it's a forward breakout cable, it's 4 cables at the disk end and one at the HBA end, so isn't it possible one of them is flaky? 
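A quick way to narrow a marginal breakout lane down, assuming smartmontools is installed, is to compare each drive's maximum link speed against what it actually negotiated. The c2t0d0..c2t3d0 names below are placeholders; substitute the targets that format reports, and add "-d sat" if the HBA needs it:

  for d in c2t0d0 c2t1d0 c2t2d0 c2t3d0; do
      printf '%s: ' "$d"
      # "SATA Version is: ... (current: X Gb/s)" shows max vs. negotiated speed
      smartctl -i /dev/rdsk/${d}s0 | grep 'SATA Version'
  done

If a slow drive stays slow after being moved to another lane of the breakout cable, the cable and HBA port are effectively ruled out and the drive (or a jumper on it) becomes the prime suspect.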
From mir at miras.org Thu May 29 19:30:37 2014 From: mir at miras.org (Michael Rasmussen) Date: Thu, 29 May 2014 21:30:37 +0200 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <3f9faa47a3463bf5315adca0c847bf42.squirrel@webmail.druber.com> References: <20140529182532.2212d2f4@sleipner.datanom.net> <45ECB03F-3F2D-49DA-B4A3-A68FEB0EAB23@ferebee.net> <20140529211757.1e72acb9@sleipner.datanom.net> <3f9faa47a3463bf5315adca0c847bf42.squirrel@webmail.druber.com> Message-ID: <20140529213037.3de59d65@sleipner.datanom.net> On Thu, 29 May 2014 15:23:07 -0400 "Dan Swartzendruber" wrote: > > On Thu, 29 May 2014 20:58:50 +0200 > > Chris Ferebee wrote: > > > >> > >> Or, BTW, the cables. They could be marginal, still tolerated @ 3 Gbps by > >> the Seagate, but rejected by the WDs, for instance. > >> > > Since all disks are connected through the same SFF-8087 cable I think > > this could be ruled out? > > If it's a forward breakout cable, it's 4 cables at the disk end and one at > the HBA end, so isn't it possible one of them is flaky? > 2 out of 4? And on exactly the same to disk brands? -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: I never liked you, and I always will. -Samuel Goldwyn -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From dswartz at druber.com Thu May 29 19:41:51 2014 From: dswartz at druber.com (Dan Swartzendruber) Date: Thu, 29 May 2014 15:41:51 -0400 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <20140529213037.3de59d65@sleipner.datanom.net> References: <20140529182532.2212d2f4@sleipner.datanom.net> <45ECB03F-3F2D-49DA-B4A3-A68FEB0EAB23@ferebee.net> <20140529211757.1e72acb9@sleipner.datanom.net> <3f9faa47a3463bf5315adca0c847bf42.squirrel@webmail.druber.com> <20140529213037.3de59d65@sleipner.datanom.net> Message-ID: <0258dc37353dd9060c6ce1b8e6d56650.squirrel@webmail.druber.com> > On Thu, 29 May 2014 15:23:07 -0400 > "Dan Swartzendruber" wrote: > >> > On Thu, 29 May 2014 20:58:50 +0200 >> > Chris Ferebee wrote: >> > >> >> >> >> Or, BTW, the cables. They could be marginal, still tolerated @ 3 Gbps >> by >> >> the Seagate, but rejected by the WDs, for instance. >> >> >> > Since all disks are connected through the same SFF-8087 cable I think >> > this could be ruled out? >> >> If it's a forward breakout cable, it's 4 cables at the disk end and one >> at >> the HBA end, so isn't it possible one of them is flaky? >> > 2 out of 4? And on exactly the same to disk brands? Probably not, no :) From danmcd at omniti.com Thu May 29 20:06:27 2014 From: danmcd at omniti.com (Dan McDonald) Date: Thu, 29 May 2014 16:06:27 -0400 Subject: [OmniOS-discuss] Bloody repo has been completely updated Message-ID: <0F369C6A-528A-406D-9DE6-852EF8CF1ABA@omniti.com> In order to clear out some cruft (including an accidental "on-nightly" push out from when I was starting at OmniTI), I've wiped clean and reinstalled the repo directory for bloody. This includes an additional refresh of the illumos-omnios packages from TODAY (because of a ZFS memory leak in some new ZFS features that was just patched today). 
If you are using bloody, be prepared for a FULL upgrade when you "pkg update". As always, please engage here if you have questions with bloody. New with this update: - NSS up to version 3.16.1 now. - NSPR up to version 4.10.5. (Users who use these two should stress 'em out.) - ZFS filesystem and snapshot limits - ZFS metadata perfomance improvement (tunable, see zfs(1M) for details). - Less (ab)use of ddi_get_time() which will increase reliability in the face of changing wallclock time. - printf(1) command bug fixes. - small ifconfig(1M) fix with its "addif" subcommand. Happy updating! Dan From protonwrangler at gmail.com Thu May 29 20:10:27 2014 From: protonwrangler at gmail.com (Warren Marts) Date: Thu, 29 May 2014 14:10:27 -0600 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <20140529213037.3de59d65@sleipner.datanom.net> References: <20140529182532.2212d2f4@sleipner.datanom.net> <45ECB03F-3F2D-49DA-B4A3-A68FEB0EAB23@ferebee.net> <20140529211757.1e72acb9@sleipner.datanom.net> <3f9faa47a3463bf5315adca0c847bf42.squirrel@webmail.druber.com> <20140529213037.3de59d65@sleipner.datanom.net> Message-ID: I believe there are some SATA 6Gbps disks that only have 1.5Gbps as a fallback, and do not support 3Gbps. For a spinning hard disk, certainly a 7200 rpm disk, it is not a noticeable handicap to run at 1.5 vs. 3Gbps. On Thu, May 29, 2014 at 1:30 PM, Michael Rasmussen wrote: > On Thu, 29 May 2014 15:23:07 -0400 > "Dan Swartzendruber" wrote: > > > > On Thu, 29 May 2014 20:58:50 +0200 > > > Chris Ferebee wrote: > > > > > >> > > >> Or, BTW, the cables. They could be marginal, still tolerated @ 3 Gbps > by > > >> the Seagate, but rejected by the WDs, for instance. > > >> > > > Since all disks are connected through the same SFF-8087 cable I think > > > this could be ruled out? > > > > If it's a forward breakout cable, it's 4 cables at the disk end and one > at > > the HBA end, so isn't it possible one of them is flaky? > > > 2 out of 4? And on exactly the same to disk brands? > > -- > Hilsen/Regards > Michael Rasmussen > > Get my public GnuPG keys: > michael rasmussen cc > http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E > mir datanom net > http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C > mir miras org > http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 > -------------------------------------------------------------- > /usr/games/fortune -es says: > I never liked you, and I always will. -Samuel Goldwyn > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sigsten at gmail.com Thu May 29 22:49:12 2014 From: sigsten at gmail.com (Sigsten =?ISO-8859-1?Q?=C5kesson?=) Date: Fri, 30 May 2014 00:49:12 +0200 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <20140529182532.2212d2f4@sleipner.datanom.net> References: <20140529182532.2212d2f4@sleipner.datanom.net> Message-ID: <2759125.hfDRtcGBkT@gegenes> I had similar problems when hooking up a SSD to my 1064E card, and if I remember correct, it was solved by updating the controller firmware. Maybe you're running an old firmware? 
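One way to see what the adapter is currently running, assuming lsiutil is installed (it is not part of the base install), is simply:

  lsiutil
  # with no arguments it lists the MPT adapters it finds; select the
  # SAS1068E, then menu option 1 ("Identify firmware, BIOS, and/or FCode")
  # prints the active firmware and BIOS versions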
Regards, Sigsten On Thursday 29 May 2014 18.25.32 Michael Rasmussen wrote: > Hi all, > > I have a pair of WDC WD10EZEX connected through a LSI 1068E HBA which > is only running at 1.5 Gb/s all though the disks is SATA 3.0 > > Seagate Barracuda 7200.12 which is also SATA 3.0 are running at 3.0 > Gb/s. > > I know LSI 1068E HBA is only SATA 2.0 so 3.0 Gb/s is maximum speed but > why is the WDC WD10EZEX only running at 1.5 Gb/s? > > Stats from smartctl: > > === START OF INFORMATION SECTION === > Model Family: Western Digital Caviar Blue (SATA 6Gb/s) > Device Model: WDC WD10EZEX-00RKKA0 > Serial Number: WD-WCC1S3904840 > LU WWN Device Id: 5 0014ee 2b31a7e0e > Firmware Version: 80.00A80 > User Capacity: 1,000,204,886,016 bytes [1.00 TB] > Sector Sizes: 512 bytes logical, 4096 bytes physical > Device is: In smartctl database [for details use: -P show] > ATA Version is: ATA8-ACS (minor revision not indicated) > SATA Version is: SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s) > Local Time is: Thu May 29 18:24:35 2014 CEST > SMART support is: Available - device has SMART capability. > SMART support is: Enabled > > === START OF INFORMATION SECTION === > Model Family: Seagate Barracuda 7200.12 > Device Model: ST31000524AS > Serial Number: 9VPG8NC3 > LU WWN Device Id: 5 000c50 04d6f1cc8 > Firmware Version: JC4B > User Capacity: 1,000,204,886,016 bytes [1.00 TB] > Sector Size: 512 bytes logical/physical > Rotation Rate: 7200 rpm > Device is: In smartctl database [for details use: -P show] > ATA Version is: ATA8-ACS T13/1699-D revision 4 > SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s) > Local Time is: Thu May 29 18:21:29 2014 CEST > SMART support is: Available - device has SMART capability. > SMART support is: Enabled From mir at miras.org Thu May 29 23:59:06 2014 From: mir at miras.org (Michael Rasmussen) Date: Fri, 30 May 2014 01:59:06 +0200 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <20140529212050.485177d6@sleipner.datanom.net> References: <20140529182532.2212d2f4@sleipner.datanom.net> <20140529212050.485177d6@sleipner.datanom.net> Message-ID: <20140530015906.6986dd0d@sleipner.datanom.net> On Thu, 29 May 2014 21:20:50 +0200 Michael Rasmussen wrote: > On Thu, 29 May 2014 18:40:30 +0000 > "Carlos M. Perez" wrote: > > > A few suggestions on things to check: > > > > - Some of the 3GB SATA drives had a jumper that would limit operation to > > 1.5GB/s. > > > This is an option I will investigate. Jumper pin 5+6 limits speed on WD > SATA 3.0 to SATA 2.0 which might be worth looking into. > And this has solved the problem. Smart now shows the following: === START OF INFORMATION SECTION === Model Family: Western Digital Caviar Blue (SATA 6Gb/s) Device Model: WDC WD10EZEX-00RKKA0 Serial Number: WD-WCC1S3904840 LU WWN Device Id: 5 0014ee 2b31a7e0e Firmware Version: 80.00A80 User Capacity: 1,000,204,886,016 bytes [1.00 TB] Sector Sizes: 512 bytes logical, 4096 bytes physical Device is: In smartctl database [for details use: -P show] ATA Version is: ATA8-ACS (minor revision not indicated) SATA Version is: SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s) Local Time is: Fri May 30 01:56:57 2014 CEST SMART support is: Available - device has SMART capability. SMART support is: Enabled This raises the questions: 1) Is this a bug in LSI firmware? 2) Is this a bug in the Omnios mpt driver? 3) Is this a firmware bug in WD10EZEX? 
-- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: We are using Linux daily to UP our productivity - so UP yours! (Adapted from Pat Paulsen by Joe Sloan) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From richard.elling at richardelling.com Fri May 30 00:12:09 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Thu, 29 May 2014 17:12:09 -0700 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <3f9faa47a3463bf5315adca0c847bf42.squirrel@webmail.druber.com> References: <20140529182532.2212d2f4@sleipner.datanom.net> <45ECB03F-3F2D-49DA-B4A3-A68FEB0EAB23@ferebee.net> <20140529211757.1e72acb9@sleipner.datanom.net> <3f9faa47a3463bf5315adca0c847bf42.squirrel@webmail.druber.com> Message-ID: On May 29, 2014, at 12:23 PM, Dan Swartzendruber wrote: >> On Thu, 29 May 2014 20:58:50 +0200 >> Chris Ferebee wrote: >> >>> >>> Or, BTW, the cables. They could be marginal, still tolerated @ 3 Gbps by >>> the Seagate, but rejected by the WDs, for instance. >>> >> Since all disks are connected through the same SFF-8087 cable I think >> this could be ruled out? > > If it's a forward breakout cable, it's 4 cables at the disk end and one at > the HBA end, so isn't it possible one of them is flaky? I hate to be bearer of bad news, but it is much more common that you might think that one or two in a wide port cable are bad. Sometimes it is the connector -- they are finicky. How to troubleshoot: sasinfo hba-port -y shows negotiated speeds sasinfo hba-port -l shows link stats as seen by the HBA -- look for disparity errors sasinfo -ly both of above sg_logs -p 0x18 /dev/rdsk/c0tblah shows link stats as seen by the target -- look for disparity errors note: sasinfo is a separate package in OmniOS, not installed by default. -- richard From mir at miras.org Fri May 30 00:43:18 2014 From: mir at miras.org (Michael Rasmussen) Date: Fri, 30 May 2014 02:43:18 +0200 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: References: <20140529182532.2212d2f4@sleipner.datanom.net> <45ECB03F-3F2D-49DA-B4A3-A68FEB0EAB23@ferebee.net> <20140529211757.1e72acb9@sleipner.datanom.net> <3f9faa47a3463bf5315adca0c847bf42.squirrel@webmail.druber.com> Message-ID: <20140530024318.6a7415d2@sleipner.datanom.net> On Thu, 29 May 2014 17:12:09 -0700 Richard Elling wrote: > > How to troubleshoot: > sasinfo hba-port -y > shows negotiated speeds > sasinfo hba-port -l > shows link stats as seen by the HBA -- look for disparity errors > sasinfo -ly > both of above # sasinfo hba-port -y Error: No Adapters Found. 
# lspci |grep LSI 02:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08) > sg_logs -p 0x18 /dev/rdsk/c0tblah > shows link stats as seen by the target -- look for disparity errors > # sg_logs -p 0x18 /dev/rdsk/c2t2d0s0 -bash: sg_logs: command not found -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: We come to bury DOS, not to praise it. -- Paul Vojta, vojta at math.berkeley.edu -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From mir at miras.org Fri May 30 00:57:10 2014 From: mir at miras.org (Michael Rasmussen) Date: Fri, 30 May 2014 02:57:10 +0200 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <2759125.hfDRtcGBkT@gegenes> References: <20140529182532.2212d2f4@sleipner.datanom.net> <2759125.hfDRtcGBkT@gegenes> Message-ID: <20140530025710.6718ac1f@sleipner.datanom.net> On Fri, 30 May 2014 00:49:12 +0200 Sigsten ?kesson wrote: > I had similar problems when hooking up a SSD to my 1064E card, and if I > remember correct, it was solved by updating the controller firmware. > > Maybe you're running an old firmware? > From lsiutils: Current active firmware version is 00192f00 (0.25.47) Firmware image's version is MPTFW-00.25.47.00-IE LSI Logic x86 BIOS image's version is MPTBIOS-6.22.03.00 (2008.08.06) -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: There are two kinds of pedestrians... the quick and the dead. -- Lord Thomas Rober Dewar -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From mir at miras.org Fri May 30 00:59:17 2014 From: mir at miras.org (Michael Rasmussen) Date: Fri, 30 May 2014 02:59:17 +0200 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: References: <20140529182532.2212d2f4@sleipner.datanom.net> <45ECB03F-3F2D-49DA-B4A3-A68FEB0EAB23@ferebee.net> <20140529211757.1e72acb9@sleipner.datanom.net> <3f9faa47a3463bf5315adca0c847bf42.squirrel@webmail.druber.com> Message-ID: <20140530025917.4186c437@sleipner.datanom.net> On Thu, 29 May 2014 17:12:09 -0700 Richard Elling wrote: > How to troubleshoot: > sasinfo hba-port -y > shows negotiated speeds > sasinfo hba-port -l > shows link stats as seen by the HBA -- look for disparity errors > sasinfo -ly > both of above > sg_logs -p 0x18 /dev/rdsk/c0tblah > shows link stats as seen by the target -- look for disparity errors > lsiutils shows the following: SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, down, down, down, down B___T SASAddress PhyNum Handle Parent Type 50022190cd35be00 0001 SAS Initiator 50022190cd35be01 0002 SAS Initiator 50022190cd35be02 0003 SAS Initiator 50022190cd35be03 0004 SAS Initiator 50022190cd35be04 0005 SAS Initiator 50022190cd35be05 0006 SAS Initiator 50022190cd35be06 0007 SAS Initiator 50022190cd35be07 0008 SAS Initiator 0 0 1221000000000000 0 0009 0001 SATA Target 0 1 1221000001000000 1 000a 0002 SATA Target 0 2 1221000002000000 2 000b 0003 SATA Target 0 3 1221000003000000 3 000c 0004 SATA Target Type NumPhys PhyNum Handle PhyNum Handle Port Speed Adapter 8 0 0001 --> 0 0009 0 3.0 1 0002 --> 0 000a 1 3.0 2 0003 --> 0 000b 2 3.0 3 0004 --> 0 000c 3 3.0 Enclosure Handle Slots SASAddress B___T (SEP) 0001 8 50022190cd35be00 -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: There are two kinds of pedestrians... the quick and the dead. -- Lord Thomas Rober Dewar -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From mir at miras.org Fri May 30 01:13:16 2014 From: mir at miras.org (Michael Rasmussen) Date: Fri, 30 May 2014 03:13:16 +0200 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: References: <20140529182532.2212d2f4@sleipner.datanom.net> <45ECB03F-3F2D-49DA-B4A3-A68FEB0EAB23@ferebee.net> <20140529211757.1e72acb9@sleipner.datanom.net> <3f9faa47a3463bf5315adca0c847bf42.squirrel@webmail.druber.com> <20140529213037.3de59d65@sleipner.datanom.net> Message-ID: <20140530031316.17328629@sleipner.datanom.net> On Thu, 29 May 2014 14:10:27 -0600 Warren Marts wrote: > > For a spinning hard disk, certainly a 7200 rpm disk, it is not a noticeable > handicap to run at 1.5 vs. 3Gbps. > After having raised the speed to 3Gb/s on the WD disks I measure an overall increase in pool performance with 15%. 
So it is noticeable;-) -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: By doing just a little every day, you can gradually let the task completely overwhelm you. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From ian at ianshome.com Fri May 30 04:09:54 2014 From: ian at ianshome.com (Ian Collins) Date: Fri, 30 May 2014 16:09:54 +1200 Subject: [OmniOS-discuss] Illumos issue 1778 Message-ID: <53880492.7000000@ianshome.com> Does anyone have any experience with this issue (Assertion failed: rn->rn_nozpool == B_FALSE, file ../common/libzfs_import.c, line 1077, function zpool_open_func) with OmniOS? I have a large system that used to be happy running OmniOS until the a pool became corrupted and I had to rebuild it. Now any attempt to import a pool (including I due to boots hang the original rpool) triggers the assert. The hardware is unchanged (and still runs Solaris 11 fine) could it be some residual data on the drives? Not good... -- Ian. From richard.elling at richardelling.com Fri May 30 04:59:46 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Thu, 29 May 2014 21:59:46 -0700 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <20140530025917.4186c437@sleipner.datanom.net> References: <20140529182532.2212d2f4@sleipner.datanom.net> <45ECB03F-3F2D-49DA-B4A3-A68FEB0EAB23@ferebee.net> <20140529211757.1e72acb9@sleipner.datanom.net> <3f9faa47a3463bf5315adca0c847bf42.squirrel@webmail.druber.com> <20140530025917.4186c437@sleipner.datanom.net> Message-ID: <58E04C62-5534-4560-BDC9-1F0078D7A1B9@richardelling.com> Ah, vintage hardware :-) On May 29, 2014, at 5:59 PM, Michael Rasmussen wrote: > On Thu, 29 May 2014 17:12:09 -0700 > Richard Elling wrote: > >> How to troubleshoot: >> sasinfo hba-port -y >> shows negotiated speeds >> sasinfo hba-port -l >> shows link stats as seen by the HBA -- look for disparity errors >> sasinfo -ly >> both of above >> sg_logs -p 0x18 /dev/rdsk/c0tblah >> shows link stats as seen by the target -- look for disparity errors >> > lsiutils shows the following: > SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, down, down, down, down lsiutils is better for the older gear. Look at the link stats in the expert menu. For sg_logs command, you'll want to install sg3_utils. 
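Once sg3_utils is available (built from source or pulled from whatever package source you prefer), the per-target check looks roughly like this; the device name is a placeholder:

  sg_logs -p 0x18 /dev/rdsk/c2t2d0s0
  # in the phy section, steadily climbing "Invalid DWORD count",
  # "Running disparity error count" or "Loss of dword synchronization"
  # values usually point at a marginal cable, connector or phy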
-- richard > > B___T SASAddress PhyNum Handle Parent Type > 50022190cd35be00 0001 SAS Initiator > 50022190cd35be01 0002 SAS Initiator > 50022190cd35be02 0003 SAS Initiator > 50022190cd35be03 0004 SAS Initiator > 50022190cd35be04 0005 SAS Initiator > 50022190cd35be05 0006 SAS Initiator > 50022190cd35be06 0007 SAS Initiator > 50022190cd35be07 0008 SAS Initiator > 0 0 1221000000000000 0 0009 0001 SATA Target > 0 1 1221000001000000 1 000a 0002 SATA Target > 0 2 1221000002000000 2 000b 0003 SATA Target > 0 3 1221000003000000 3 000c 0004 SATA Target > > Type NumPhys PhyNum Handle PhyNum Handle Port Speed > Adapter 8 0 0001 --> 0 0009 0 3.0 > 1 0002 --> 0 000a 1 3.0 > 2 0003 --> 0 000b 2 3.0 > 3 0004 --> 0 000c 3 3.0 > > Enclosure Handle Slots SASAddress B___T (SEP) > 0001 8 50022190cd35be00 > > > -- > Hilsen/Regards > Michael Rasmussen > > Get my public GnuPG keys: > michael rasmussen cc > http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E > mir datanom net > http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C > mir miras org > http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 > -------------------------------------------------------------- > /usr/games/fortune -es says: > There are two kinds of pedestrians... the quick and the dead. > -- Lord Thomas Rober Dewar From cperez at cmpcs.com Fri May 30 05:03:05 2014 From: cperez at cmpcs.com (Carlos M. Perez) Date: Fri, 30 May 2014 05:03:05 +0000 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <20140530015906.6986dd0d@sleipner.datanom.net> References: <20140529182532.2212d2f4@sleipner.datanom.net> <20140529212050.485177d6@sleipner.datanom.net> <20140530015906.6986dd0d@sleipner.datanom.net> Message-ID: <0facfbd28adc48a894f58b3ffd1b0829@BLUPR06MB673.namprd06.prod.outlook.com> Maybe I'm missing something here, but removing the jumper got the drive recognized to 3Gb, which is the maximum speed of the controller...so Why would a firmware on a card that has a maximum speed of 3Gb be a firmware issue? Why would it be a driver issue? Why would the drive firmware be bad? The jumper forced the drive to limit it's connection to 1.5Gb as it was instructed to. Removing the jumper cleared this and allows the drive to talk at 6Gb. Your controller is 3Gb so it's the maximum it would go anyway. Perhaps the drive spec is available via other methods. Are you asking these questions because you're not seeing 6Gb on the drive info? I don't think you're going to see that with the existing controller... Carlos M. Perez CMP Consulting Services 305-669-1515 > -----Original Message----- > From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] > On Behalf Of Michael Rasmussen > Sent: Thursday, May 29, 2014 7:59 PM > To: omnios-discuss at lists.omniti.com > Subject: Re: [OmniOS-discuss] WDC WD10EZEX problems > > On Thu, 29 May 2014 21:20:50 +0200 > Michael Rasmussen wrote: > > > On Thu, 29 May 2014 18:40:30 +0000 > > "Carlos M. Perez" wrote: > > > > > A few suggestions on things to check: > > > > > > - Some of the 3GB SATA drives had a jumper that would limit > > > operation to 1.5GB/s. > > > > > This is an option I will investigate. Jumper pin 5+6 limits speed on > > WD SATA 3.0 to SATA 2.0 which might be worth looking into. > > > And this has solved the problem. 
Smart now shows the following: > > === START OF INFORMATION SECTION === > Model Family: Western Digital Caviar Blue (SATA 6Gb/s) > Device Model: WDC WD10EZEX-00RKKA0 > Serial Number: WD-WCC1S3904840 > LU WWN Device Id: 5 0014ee 2b31a7e0e > Firmware Version: 80.00A80 > User Capacity: 1,000,204,886,016 bytes [1.00 TB] > Sector Sizes: 512 bytes logical, 4096 bytes physical > Device is: In smartctl database [for details use: -P show] > ATA Version is: ATA8-ACS (minor revision not indicated) > SATA Version is: SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s) > Local Time is: Fri May 30 01:56:57 2014 CEST > SMART support is: Available - device has SMART capability. > SMART support is: Enabled > > This raises the questions: > 1) Is this a bug in LSI firmware? > 2) Is this a bug in the Omnios mpt driver? > 3) Is this a firmware bug in WD10EZEX? > > -- > Hilsen/Regards > Michael Rasmussen > > Get my public GnuPG keys: > michael rasmussen cc > http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E > mir datanom net > http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C > mir miras org > http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 > -------------------------------------------------------------- > /usr/games/fortune -es says: > We are using Linux daily to UP our productivity - so UP yours! > (Adapted from Pat Paulsen by Joe Sloan) From sigsten at gmail.com Fri May 30 05:31:11 2014 From: sigsten at gmail.com (Sigsten =?ISO-8859-1?Q?=C5kesson?=) Date: Fri, 30 May 2014 07:31:11 +0200 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <20140530025710.6718ac1f@sleipner.datanom.net> References: <20140529182532.2212d2f4@sleipner.datanom.net> <2759125.hfDRtcGBkT@gegenes> <20140530025710.6718ac1f@sleipner.datanom.net> Message-ID: <5221044.nnMHZrTQ6c@gegenes> On LSI's site, the following version from 2011 is available. SAS3081ER_ Package_P21_IR_IT_Firmware_BIOS_for_MSDOS_Windows FW: 01.33.00.00 BIOS: 6.36.00.00 Give that a try. Regards, Sigsten On Friday 30 May 2014 02.57.10 Michael Rasmussen wrote: > On Fri, 30 May 2014 00:49:12 +0200 > > Sigsten ?kesson wrote: > > I had similar problems when hooking up a SSD to my 1064E card, and if I > > remember correct, it was solved by updating the controller firmware. > > > > Maybe you're running an old firmware? > > From lsiutils: > > Current active firmware version is 00192f00 (0.25.47) > Firmware image's version is MPTFW-00.25.47.00-IE > LSI Logic > x86 BIOS image's version is MPTBIOS-6.22.03.00 (2008.08.06) From mattfrazer at gmail.com Fri May 30 18:08:54 2014 From: mattfrazer at gmail.com (Matthew Frazer) Date: Fri, 30 May 2014 14:08:54 -0400 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: <5221044.nnMHZrTQ6c@gegenes> References: <20140529182532.2212d2f4@sleipner.datanom.net> <2759125.hfDRtcGBkT@gegenes> <20140530025710.6718ac1f@sleipner.datanom.net> <5221044.nnMHZrTQ6c@gegenes> Message-ID: Within the P21 Release notes one finds: [...] Change Summary ( Defects=5) SCGCQ00173656 (DFCT) - 6G SATA drive negotiates to 1.5G speed [...] Seems to fit your situation exactly. -mjf On Fri, May 30, 2014 at 1:31 AM, Sigsten ?kesson wrote: > On LSI's site, the following version from 2011 is available. > > SAS3081ER_ Package_P21_IR_IT_Firmware_BIOS_for_MSDOS_Windows > FW: 01.33.00.00 > BIOS: 6.36.00.00 > > Give that a try. 
> > Regards, > Sigsten > > > On Friday 30 May 2014 02.57.10 Michael Rasmussen wrote: > > On Fri, 30 May 2014 00:49:12 +0200 > > > > Sigsten ?kesson wrote: > > > I had similar problems when hooking up a SSD to my 1064E card, and if I > > > remember correct, it was solved by updating the controller firmware. > > > > > > Maybe you're running an old firmware? > > > > From lsiutils: > > > > Current active firmware version is 00192f00 (0.25.47) > > Firmware image's version is MPTFW-00.25.47.00-IE > > LSI Logic > > x86 BIOS image's version is MPTBIOS-6.22.03.00 (2008.08.06) > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -- It is in love that we are made, in love we disappear. -Leonard Cohen -------------- next part -------------- An HTML attachment was scrubbed... URL: From ian at ianshome.com Fri May 30 21:13:19 2014 From: ian at ianshome.com (Ian Collins) Date: Sat, 31 May 2014 09:13:19 +1200 Subject: [OmniOS-discuss] Illumos issue 1778 In-Reply-To: <53880492.7000000@ianshome.com> References: <53880492.7000000@ianshome.com> Message-ID: <5388F46F.3060506@ianshome.com> Ian Collins wrote: > Does anyone have any experience with this issue (Assertion failed: > rn->rn_nozpool == B_FALSE, file ../common/libzfs_import.c, line 1077, > function zpool_open_func) with OmniOS? > > I have a large system that used to be happy running OmniOS until the a > pool became corrupted and I had to rebuild it. Now any attempt to > import a pool (including I due to boots hang the original rpool) > triggers the assert. > > The hardware is unchanged (and still runs Solaris 11 fine) could it be > some residual data on the drives? > > Not good... > One strange thing I've noticed with installs after I've hit this problem both OmniOS and Solaris have "windows" as an option on the boot screen... -- Ian. From mir at miras.org Sat May 31 01:10:01 2014 From: mir at miras.org (Michael Rasmussen) Date: Sat, 31 May 2014 03:10:01 +0200 Subject: [OmniOS-discuss] WDC WD10EZEX problems In-Reply-To: References: <20140529182532.2212d2f4@sleipner.datanom.net> <2759125.hfDRtcGBkT@gegenes> <20140530025710.6718ac1f@sleipner.datanom.net> <5221044.nnMHZrTQ6c@gegenes> Message-ID: <20140531031001.751f0054@sleipner.datanom.net> On Fri, 30 May 2014 14:08:54 -0400 Matthew Frazer wrote: > Within the P21 Release notes one finds: > [...] > Change Summary ( Defects=5) > SCGCQ00173656 (DFCT) - 6G SATA drive negotiates to 1.5G speed > [...] > > Seems to fit your situation exactly. > Upgrading firmware and bios did the trick:-) SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G, down, down, down, down Type NumPhys PhyNum Handle PhyNum Handle Port Speed Adapter 8 0 0001 --> 0 0009 0 3.0 1 0002 --> 0 000a 1 3.0 2 0003 --> 0 000b 2 3.0 3 0004 --> 0 000c 3 3.0 -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: If you have never been hated by your child, you have never been a parent. -- Bette Davis -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From johan.kragsterman at capvert.se Sat May 31 10:03:47 2014 From: johan.kragsterman at capvert.se (Johan Kragsterman) Date: Sat, 31 May 2014 12:03:47 +0200 Subject: [OmniOS-discuss] IBM micron ssd Message-ID: Hi! I've seen some Micron?1.8-inch 64 GB SSD's around, that is supposed to be enterprise class. They're called Micron RealSSD P400e, and come in different sizes. IBM sells them with IBM brand for their servers. They apparently use the Marvell 9174 SATA 6Gb/s controller and 25 nm MLC NAND, and is, according to Micron, "equipped with firmware designed for read-heavy enterprise workloads, including 28% over-provisioning and data protection via memory path error correction". So I wonder if anyone on this list has any knowledge about these drives, and if the controller is compatible with omnios? Best regards from/Med v?nliga h?lsningar fr?n Johan Kragsterman Capvert From danmcd at omniti.com Sat May 31 14:20:53 2014 From: danmcd at omniti.com (Dan McDonald) Date: Sat, 31 May 2014 10:20:53 -0400 Subject: [OmniOS-discuss] IBM micron ssd In-Reply-To: References: Message-ID: On May 31, 2014, at 6:03 AM, Johan Kragsterman wrote: > > Hi! > > > I've seen some Micron 1.8-inch 64 GB SSD's around, that is supposed to be enterprise class. They're called Micron RealSSD P400e, and come in different sizes. > > IBM sells them with IBM brand for their servers. > If it's the one reviewed here: http://www.tomshardware.com/reviews/p400e-review-endurance,3199.html It looks like any other SATA SSD. You just plug it in like any other SATA drive and it works. > They apparently use the Marvell 9174 SATA 6Gb/s controller and 25 nm MLC NAND, and is, according to Micron, "equipped with firmware designed for read-heavy enterprise workloads, including 28% over-provisioning and data protection via memory path error correction". > > So I wonder if anyone on this list has any knowledge about these drives, and if the controller is compatible with omnios? As for the internal controller (since it's acting as a disk drive, not acting as a PCIe card), you'll have to ask how good it is. Since it's a drive, it should just work with OmniOS (or any other illumos variant for that matter). If it DOESN'T, that'd be interesting to know. Dan
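For anyone trying one of these, a quick sanity check that the drive really is being treated as an ordinary SATA disk is to look for it in the usual places; a minimal sketch:

  format </dev/null   # the SSD should show up in the disk list like any other SATA drive
  iostat -En          # per-device inquiry data (Vendor/Product/Revision) and soft/hard error counters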