[OmniOS-discuss] OmniOS backup box hanging regularly
Jim Klimov
jimklimov at cos.ru
Fri Oct 23 16:54:27 UTC 2015
On 23 October 2015 at 11:23:28 CEST, Jim Klimov <jim at cos.ru> wrote:
>A new bit of info came in. I left the box running overnight, along with
>an SSH session running various tracers, and it seems that the system
>simply ran out of memory.
>
>However, it never logged any forking errors or the like, which were
>typical of similar cases before, and it did not recover with time or by
>"magic" (e.g. processes dying on ENOMEM and thus freeing memory).
>There is some swap free, too (which wouldn't help if it ran out of
>kernel memory). The 64 MB of free RAM (or 32 MB in some of my older
>experiences) seems to be the empirical minimum below which illumos is
>as good as dead ;)
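>
>For the record, a quick way to see where the RAM went on illumos is the
>kernel debugger's memstat dcmd - a read-only check, usable while a
>shell still responds:
>
># echo "::memstat" | mdb -k
>
>That splits physical memory into kernel, ZFS metadata/data, anon, page
>cache and free pages, so kernel-memory exhaustion should be
>distinguishable from mere ARC pressure.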
>
>The box is still pingable, but there is no SSH nor local usability. The
>"top" listings froze at 4:25 am - that's some 7 hours ago.
>
>last pid: 26331;  load avg: 6.65, 5.61, 6.88;  up 0+19:42:47  04:25:17
>208 processes: 178 sleeping, 28 running, 1 zombie, 1 on cpu
>CPU states: 21.5% idle, 3.5% user, 75.0% kernel, 0.0% iowait, 0.0% swap
>Memory: 16G phys mem, 64M free mem, 2048M total swap, 1420M free swap
> PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
> 24910 zfssnap 1 60 2 243M 238M run 1:24 1.61% zfs
> 25620 zfssnap 1 53 2 220M 215M run 1:18 1.17% zfs
> 25619 zfssnap 1 53 2 220M 215M run 1:18 1.15% zfs
> 24753 zfssnap 1 53 2 243M 238M run 1:26 1.12% zfs
> 25864 zfssnap 1 53 2 220M 215M run 1:19 0.93% zfs
> 25861 zfssnap 1 53 2 13M 10M sleep 0:03 0.87% zfs
> 22380 zfssnap 1 60 2 764M 721M run 4:34 0.83% zfs
> 22546 zfssnap 1 53 2 698M 672M sleep 4:08 0.79% zfs
> 25857 zfssnap 1 53 2 220M 215M run 1:19 0.78% zfs
> 24224 zfssnap 1 60 2 548M 536M run 3:13 0.76% zfs
> 22901 zfssnap 1 60 2 698M 672M run 4:09 0.75% zfs
> 22551 zfssnap 1 60 2 698M 672M sleep 4:08 0.73% zfs
> 22373 zfssnap 1 60 2 767M 729M run 4:33 0.73% zfs
> 22212 zfssnap 1 60 2 767M 730M sleep 4:36 0.71% zfs
> 24215 zfssnap 1 60 2 549M 537M sleep 3:13 0.69% zfs
>
>
>Heh, by sheer coincidence, it froze 10 hours and 1 second after I
>logged in (my login profile also prints a few lines from top; a sketch
>of that snippet follows the listing):
>
>last pid: 22036;  load avg: 10.7, 10.3, 10.4;  up 0+09:42:46  18:25:16
>126 processes: 115 sleeping, 9 running, 2 on cpu
>CPU states: 0.0% idle, 21.4% user, 78.6% kernel, 0.0% iowait, 0.0% swap
>Memory: 16G phys mem, 1713M free mem, 2048M total swap, 2048M free swap
> PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
> 21266 zfssnap 1 46 2 660M 657M run 3:50 28.20% zfs
> 21274 zfssnap 1 45 2 672M 669M run 3:55 16.70% zfs
> 21757 zfssnap 1 50 2 314M 311M run 1:48 8.95% zfs
> 21389 zfssnap 1 52 2 586M 584M run 3:24 8.82% zfs
> 21173 zfssnap 1 53 2 762M 759M cpu/1 4:28 8.76% zfs
>
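>(The profile snippet itself is nothing fancy - an untested sketch,
>assuming the classic unixtop port, where -b selects batch mode and -d1
>requests a single display:
>
># tail of ~/.profile - print a one-shot batch of top at login
>top -b -d1 | head -20
>)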
>
>So at the moment it seems there is some issue with zfs-auto-snapshots
>on OmniOS that I haven't seen on SXCE, OI or Hipster. Possibly I had
>different implementations of the service on those different OSes (shell
>vs. Python, and all that at different versions).
>
>I'll see if toning down the frequency of autosnaps (e.g. disabling the
>"frequent" or "hourly" schedules) helps improve stability. Even if it
>does, I'd still call it a bug: a system should not die like that, and
>the actual load (as in I/O ops) is seemingly not that gigantic.
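>
>(On this box the schedules are SMF instances, so toning them down
>should be a matter of something like the following - instance names
>assumed to match the usual zfs-auto-snapshot packaging:
>
># svcs -a | grep auto-snapshot
># svcadm disable svc:/system/filesystem/zfs/auto-snapshot:frequent
># svcadm disable svc:/system/filesystem/zfs/auto-snapshot:hourly
>)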
>
>
>Jim
>
>----- Original message -----
>From: Jim Klimov <jim at cos.ru>
>Date: Thursday, October 22, 2015 20:02
>Subject: [OmniOS-discuss] OmniOS backup box hanging regularly
>To: OmniOS-discuss <omnios-discuss at lists.omniti.com>
>
>
>> Hello all,
>> I have this HP Z400 workstation with 16 GB of (supposedly) ECC RAM
>> running OmniOS bloody, which acts as a backup server for our
>> production systems (regularly rsync'ing large files off Linux boxes,
>> and rotating ZFS auto-snapshots to keep its space free). Sometimes it
>> also runs replicas of infrastructure services (DHCP, DNS), and it was
>> set up as a VirtualBox + phpVirtualBox host to test that out, but no
>> VMs are running.
>> So the essential loads are ZFS snapshots and ZFS scrubs :)
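>> (The rsync side is plain cron jobs, roughly of this shape - the path
>> and host name here are made up for illustration:
>>
>> # crontab fragment on the backup box, one line per production host
>> 0 2 * * * rsync -aH --inplace backup@linuxbox1:/srv/data/ /pool/backups/linuxbox1/
>>
>> with zfs-auto-snapshot rotating the snapshots underneath.)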
>> And it freezes roughly every week: it stops responding to pings and
>> to login attempts via SSH or the physical console - the latter still
>> processes keypresses, but never presents a login prompt. It used to
>> be stable; these regular hangs began around summertime.
>>
>> My primary guess would be flaky disks, maybe timing out under load,
>> or going to sleep, or whatever... But I have yet to prove that, or
>> any other theory. Maybe the CPU is just overheating due to the
>> regular near-100% load with disk I/O... At the least, I want to rule
>> out OS errors and rule out (or point out) operator/box errors as much
>> as possible - those are things I can change to try and fix ;)
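>> (One cheap cross-check for the flaky-disk theory is the per-device
>> error counters, which keep counting even when zpool status looks
>> clean:
>>
>> # iostat -En
>>
>> Climbing "Hard Errors" or "Transport Errors" there would point at
>> disks timing out or dropping off the bus.)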
>> Before I proceed to the TL;DR screenshots, here is an overview of
>> what I see:
>> * In the "top" output, processes owned by zfssnap lead most of the
>> time... But even the SSH shell is noticeably slow to respond (about
>> a second per line when just pressing Enter to clear the screen and
>> prepare nice screenshots).
>> * SMART was not enabled on the 3 TB mirrored "pool" SATA disks (it is
>> now, and long tests have been initiated), but it was in place on the
>> "rpool" SAS disk, where it logged some corrected ECC errors - but
>> none uncorrected. Maybe the cabling should be reseated.
>> * iostat shows the disks are generally not busy (they don't audibly
>> rattle nor visibly blink all the time, either).
>> * zpool scrubs return clean.
>> * Partitions of the system rpool disk (10K RPM SAS) are used as log
>> and cache devices for the main data pool on the 3 TB SATA disks (a
>> sketch of how such devices get attached follows this list). The
>> system disk is fast and underutilized, so what the heck ;) It was not
>> a problem during the first year of this system's honest and stable
>> workouts, and these devices are pretty empty at the moment.
>>
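>> (For reference, such log and cache devices get attached with commands
>> along these lines - partition names as in the zpool status further
>> below:
>>
>> # zpool add pool log c4t5000CCA02A1292DDd0p2
>> # zpool add pool cache c4t5000CCA02A1292DDd0p3
>> )
>>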
>> I have enabled deadman panics according to the Wiki, but none have
>> happened so far:
>> # cat /etc/system | egrep -v '(^\*|^$)'
>> set snooping=1
>> set pcplusmp:apic_panic_on_nmi=1
>> set apix:apic_panic_on_nmi = 1
>
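>> (To verify after a reboot that these tunables actually took effect,
>> the values can be read back from the live kernel - assuming the apic
>> module's symbols are visible to mdb:
>>
>> # echo "snooping/D" | mdb -k
>> # echo "apic_panic_on_nmi/D" | mdb -k
>>
>> Non-zero values mean the deadman and NMI-panic knobs are armed.)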
>>
>>
>> In the "top" output, processes owned by zfssnap lead most of the
>time:
>>
>> last pid: 22599;  load avg: 12.9, 12.2, 11.2;  up 0+09:52:11  18:34:41
>> 140 processes: 125 sleeping, 13 running, 2 on cpu
>> CPU states: 0.0% idle, 22.9% user, 77.1% kernel, 0.0% iowait, 0.0% swap
>> Memory: 16G phys mem, 1765M free mem, 2048M total swap, 2048M free swap
>> Seconds to delay:
>> PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
>> 21389 zfssnap 1 43 2 863M 860M run 5:04 35.61% zfs
>> 22360 zfssnap 1 52 2 118M 115M run 0:37 16.50% zfs
>> 21778 zfssnap 1 52 2 563M 560M run 3:15 13.17% zfs
>> 21278 zfssnap 1 52 2 947M 944M run 5:32 6.91% zfs
>> 21881 zfssnap 1 43 2 433M 431M run 2:31 5.41% zfs
>> 21852 zfssnap 1 52 2 459M 456M run 2:39 5.16% zfs
>> 21266 zfssnap 1 43 2 906M 903M run 5:18 3.95% zfs
>> 21757 zfssnap 1 43 2 597M 594M run 3:26 2.91% zfs
>> 21274 zfssnap 1 52 2 930M 927M cpu/0 5:27 2.78% zfs
>> 22588 zfssnap 1 43 2 30M 27M run 0:08 2.48% zfs
>> 22580 zfssnap 1 52 2 49M 46M run 0:14 0.71% zfs
>> 22038 root 1 59 0 5312K 3816K cpu/1 0:01 0.10% top
>> 22014 root 1 59 0 8020K 4988K sleep 0:00 0.02% sshd
>
>>
>> Average "iostats" are not that busy:
>>
>> # zpool iostat -Td 5
>> Thu Oct 22 18:24:59 CEST 2015
>> capacity operations bandwidth
>> pool alloc free read write read write
>> ---------- ----- ----- ----- ----- ----- -----
>> pool 2.52T 207G 802 116 28.3M 840K
>> rpool 33.0G 118G 0 4 4.52K 58.7K
>> ---------- ----- ----- ----- ----- ----- -----
>> Thu Oct 22 18:25:04 CEST 2015
>> pool 2.52T 207G 0 0 0 0
>> rpool 33.0G 118G 0 10 0 97.9K
>> ---------- ----- ----- ----- ----- ----- -----
>> Thu Oct 22 18:25:09 CEST 2015
>> pool 2.52T 207G 0 0 0 0
>> rpool 33.0G 118G 0 0 0 0
>> ---------- ----- ----- ----- ----- ----- -----
>> Thu Oct 22 18:25:14 CEST 2015
>> pool 2.52T 207G 0 0 0 0
>> rpool 33.0G 118G 0 9 0 93.5K
>> ---------- ----- ----- ----- ----- ----- -----
>> Thu Oct 22 18:25:19 CEST 2015
>> pool 2.52T 207G 0 0 0 0
>> rpool 33.0G 118G 0 0 0 0
>> ---------- ----- ----- ----- ----- ----- -----
>> Thu Oct 22 18:25:24 CEST 2015
>> pool 2.52T 207G 0 0 0 0
>> rpool 33.0G 118G 0 0 0 0
>> ---------- ----- ----- ----- ----- ----- -----
>> Thu Oct 22 18:25:29 CEST 2015
>> pool 2.52T 207G 0 0 0 0
>> rpool 33.0G 118G 0 0 0 0
>> ---------- ----- ----- ----- ----- ----- -----
>> Thu Oct 22 18:25:34 CEST 2015
>> pool 2.52T 207G 0 0 0 0
>> rpool 33.0G 118G 0 0 0 0
>> ---------- ----- ----- ----- ----- ----- -----
>> Thu Oct 22 18:25:39 CEST 2015
>> pool 2.52T 207G 0 0 0 0
>> rpool 33.0G 118G 0 16 0 374K
>> ---------- ----- ----- ----- ----- ----- -----
>> ...
>> Thu Oct 22 18:33:49 CEST 2015
>> pool 2.52T 207G 0 0 0 0
>> rpool 33.0G 118G 0 11 0 94.5K
>> ---------- ----- ----- ----- ----- ----- -----
>> Thu Oct 22 18:33:54 CEST 2015
>> pool 2.52T 207G 0 13 819 80.0K
>> rpool 33.0G 118G 0 0 0 0
>> ---------- ----- ----- ----- ----- ----- -----
>> Thu Oct 22 18:33:59 CEST 2015
>> pool 2.52T 207G 0 129 0 1.06M
>> rpool 33.0G 118G 0 0 0 0
>> ---------- ----- ----- ----- ----- ----- -----
>> Thu Oct 22 18:34:04 CEST 2015
>> pool 2.52T 207G 0 55 0 503K
>> rpool 33.0G 118G 0 11 0 97.9K
>> ---------- ----- ----- ----- ----- ----- -----
>> ...
>> just occasional bursts of work.
>> I've now enabled SMART on the disks (the 2x3 TB mirror "pool" and the
>> 1x300 GB "rpool"), ran some short tests, and triggered long tests
>> (hopefully they'll have succeeded by tomorrow). The commands were
>> roughly as sketched below, and the current results follow:
>>
>>
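>> (Per disk, enabling SMART and kicking off the tests went about like
>> this - with the -d flag matching what each controller path accepts:
>>
>> # smartctl -d sat,12 -s on /dev/rdsk/c0t3d0s0
>> # smartctl -d sat,12 -t short /dev/rdsk/c0t3d0s0
>> # smartctl -d sat,12 -t long /dev/rdsk/c0t3d0s0
>> )
>>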
>> # for D in /dev/rdsk/c0*s0; do echo "===== $D :"; smartctl -d sat,12 -a $D; done; \
>>   for D in /dev/rdsk/c4*s0; do echo "===== $D :"; smartctl -d scsi -a $D; done
>> ===== /dev/rdsk/c0t3d0s0 :
>> smartctl 6.0 2012-10-10 r3643 [i386-pc-solaris2.11] (local build)
>> Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
>> === START OF INFORMATION SECTION ===
>> Device Model: WDC WD3003FZEX-00Z4SA0
>> Serial Number: WD-WCC5D1KKU0PA
>> LU WWN Device Id: 5 0014ee 2610716b7
>> Firmware Version: 01.01A01
>> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>> Rotation Rate: 7200 rpm
>> Device is:        Not in smartctl database [for details use: -P showall]
>> ATA Version is: ACS-2 (minor revision not indicated)
>> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
>> Local Time is: Thu Oct 22 18:45:28 2015 CEST
>> SMART support is: Available - device has SMART capability.
>> SMART support is: Enabled
>> === START OF READ SMART DATA SECTION ===
>> SMART overall-health self-assessment test result: PASSED
>> General SMART Values:
>> Offline data collection status:  (0x82) Offline data collection activity
>>                                         was completed without error.
>>                                         Auto Offline Data Collection: Enabled.
>> Self-test execution status:      ( 249) Self-test routine in progress...
>>                                         90% of test remaining.
>> Total time to complete Offline
>> data collection:                (32880) seconds.
>> Offline data collection
>> capabilities:                    (0x7b) SMART execute Offline immediate.
>>                                         Auto Offline data collection on/off support.
>>                                         Suspend Offline collection upon new command.
>>                                         Offline surface scan supported.
>>                                         Self-test supported.
>>                                         Conveyance Self-test supported.
>>                                         Selective Self-test supported.
>> SMART capabilities:            (0x0003) Saves SMART data before entering
>>                                         power-saving mode.
>>                                         Supports SMART auto save timer.
>> Error logging capability:        (0x01) Error logging supported.
>>                                         General Purpose Logging supported.
>> Short self-test routine
>> recommended polling time:        (   2) minutes.
>> Extended self-test routine
>> recommended polling time:        ( 357) minutes.
>> Conveyance self-test routine
>> recommended polling time:        (   5) minutes.
>> SCT capabilities:              (0x7035) SCT Status supported.
>>                                         SCT Feature Control supported.
>>                                         SCT Data Table supported.
>> SMART Attributes Data Structure revision number: 16
>> Vendor Specific SMART Attributes with Thresholds:
>> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>>   1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
>>   3 Spin_Up_Time            0x0027   246   154   021    Pre-fail  Always       -       6691
>>   4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       14
>>   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
>>   7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
>>   9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       4869
>>  10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
>>  11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
>>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       14
>>  16 Unknown_Attribute       0x0022   130   070   000    Old_age   Always       -       2289651870502
>> 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       12
>> 193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       2
>> 194 Temperature_Celsius     0x0022   117   111   000    Old_age   Always       -       35
>> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
>> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
>> 198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
>> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
>> 200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
>> SMART Error Log Version: 1
>> No Errors Logged
>> SMART Self-test log structure revision number 1
>> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
>> # 1  Short offline       Completed without error       00%             4869  -
>> SMART Selective self-test log data structure revision number 1
>> SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
>> 1 0 0 Not_testing
>> 2 0 0 Not_testing
>> 3 0 0 Not_testing
>> 4 0 0 Not_testing
>> 5 0 0 Not_testing
>> Selective self-test flags (0x0):
>> After scanning selected spans, do NOT read-scan remainder of disk.
>> If Selective self-test is pending on power-up, resume after 0 minute delay.
>> ===== /dev/rdsk/c0t5d0s0 :
>> smartctl 6.0 2012-10-10 r3643 [i386-pc-solaris2.11] (local build)
>> Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org
>> === START OF INFORMATION SECTION ===
>> Model Family: Seagate SV35
>> Device Model: ST3000VX000-1ES166
>> Serial Number: Z500S3L8
>> LU WWN Device Id: 5 000c50 079e3757b
>> Firmware Version: CV26
>> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>> Rotation Rate: 7200 rpm
>> Device is: In smartctl database [for details use: -P show]
>> ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
>> SATA Version is: SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
>> Local Time is: Thu Oct 22 18:45:28 2015 CEST
>> SMART support is: Available - device has SMART capability.
>> SMART support is: Enabled
>> === START OF READ SMART DATA SECTION ===
>> SMART overall-health self-assessment test result: PASSED
>> General SMART Values:
>> Offline data collection status:  (0x00) Offline data collection activity
>>                                         was never started.
>>                                         Auto Offline Data Collection: Disabled.
>> Self-test execution status:      ( 249) Self-test routine in progress...
>>                                         90% of test remaining.
>> Total time to complete Offline
>> data collection:                (   80) seconds.
>> Offline data collection
>> capabilities:                    (0x73) SMART execute Offline immediate.
>>                                         Auto Offline data collection on/off support.
>>                                         Suspend Offline collection upon new command.
>>                                         No Offline surface scan supported.
>>                                         Self-test supported.
>>                                         Conveyance Self-test supported.
>>                                         Selective Self-test supported.
>> SMART capabilities:            (0x0003) Saves SMART data before entering
>>                                         power-saving mode.
>>                                         Supports SMART auto save timer.
>> Error logging capability:        (0x01) Error logging supported.
>>                                         General Purpose Logging supported.
>> Short self-test routine
>> recommended polling time:        (   1) minutes.
>> Extended self-test routine
>> recommended polling time:        ( 325) minutes.
>> Conveyance self-test routine
>> recommended polling time:        (   2) minutes.
>> SCT capabilities:              (0x10b9) SCT Status supported.
>>                                         SCT Error Recovery Control supported.
>>                                         SCT Feature Control supported.
>>                                         SCT Data Table supported.
>> SMART Attributes Data Structure revision number: 10
>> Vendor Specific SMART Attributes with Thresholds:
>> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>>   1 Raw_Read_Error_Rate     0x000f   105   099   006    Pre-fail  Always       -       8600880
>>   3 Spin_Up_Time            0x0003   096   094   000    Pre-fail  Always       -       0
>>   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       19
>>   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
>>   7 Seek_Error_Rate         0x000f   085   060   030    Pre-fail  Always       -       342685681
>>   9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       4214
>>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
>>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       19
>> 184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
>> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
>> 188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
>> 189 High_Fly_Writes         0x003a   028   028   000    Old_age   Always       -       72
>> 190 Airflow_Temperature_Cel 0x0022   069   065   045    Old_age   Always       -       31 (Min/Max 29/32)
>> 191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
>> 192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       19
>> 193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       28
>> 194 Temperature_Celsius     0x0022   031   040   000    Old_age   Always       -       31 (0 20 0 0 0)
>> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
>> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
>> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
>> SMART Error Log Version: 1
>> No Errors Logged
>> SMART Self-test log structure revision number 1
>> Num  Test_Description    Status                         Remaining  LifeTime(hours)  LBA_of_first_error
>> # 1  Extended offline    Self-test routine in progress        90%             4214  -
>> # 2  Short offline       Completed without error              00%             4214  -
>> SMART Selective self-test log data structure revision number 1
>> SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
>> 1 0 0 Not_testing
>> 2 0 0 Not_testing
>> 3 0 0 Not_testing
>> 4 0 0 Not_testing
>> 5 0 0 Not_testing
>> Selective self-test flags (0x0):
>> After scanning selected spans, do NOT read-scan remainder of disk.
>> If Selective self-test is pending on power-up, resume after 0 minute delay.
>> ===== /dev/rdsk/c4t5000CCA02A1292DDd0s0 :
>> smartctl 6.0 2012-10-10 r3643 [i386-pc-solaris2.11] (local build)
>> Copyright (C) 2002-12, Bruce Allen, Christian Franke,
>www.smartmontools.org
>> Vendor: HITACHI
>> Product: HUS156030VLS600
>> Revision: HPH1
>> User Capacity: 300,000,000,000 bytes [300 GB]
>> Logical block size: 512 bytes
>> Logical Unit id: 0x5000cca02a1292dc
>> Serial number: LVVA6NHS
>> Device type: disk
>> Transport protocol: SAS
>> Local Time is: Thu Oct 22 18:45:29 2015 CEST
>> Device supports SMART and is Enabled
>> Temperature Warning Enabled
>> SMART Health Status: OK
>> Current Drive Temperature: 45 C
>> Drive Trip Temperature: 70 C
>> Manufactured in week 14 of year 2012
>> Specified cycle count over device lifetime: 50000
>> Accumulated start-stop cycles: 80
>> Elements in grown defect list: 0
>> Vendor (Seagate) cache information
>> Blocks sent to initiator = 2340336504406016
>> Error counter log:
>>            Errors Corrected by           Total   Correction     Gigabytes    Total
>>                ECC          rereads/     errors  algorithm      processed    uncorrected
>>            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
>> read:          0   888890         0     888890            0      29326.957        0
>> write:         0   961315         0     961315            0       6277.560        0
>> Non-medium error count: 283
>> SMART Self-test log
>> Num  Test              Status                     segment  LifeTime  LBA_first_err  [SK ASC ASQ]
>>      Description                                   number  (hours)
>> # 1  Background long   Self test in progress ...        -      NOW              -  [-   -    -]
>> # 2  Background long   Aborted (device reset ?)         -    14354              -  [-   -    -]
>> # 3  Background short  Completed                        -    14354              -  [-   -    -]
>> # 4  Background long   Aborted (device reset ?)         -    14354              -  [-   -    -]
>> # 5  Background long   Aborted (device reset ?)         -    14354              -  [-   -    -]
>> Long (extended) Self Test duration: 2506 seconds [41.8 minutes]
>
>>
>> The zpool scrub results and general layout:
>>
>> # zpool status -v
>> pool: pool
>> state: ONLINE
>> scan: scrub repaired 0 in 164h13m with 0 errors on Thu Oct 22 18:13:33 2015
>> config:
>> NAME                         STATE     READ WRITE CKSUM
>> pool                         ONLINE       0     0     0
>>   mirror-0                   ONLINE       0     0     0
>>     c0t3d0                   ONLINE       0     0     0
>>     c0t5d0                   ONLINE       0     0     0
>> logs
>>   c4t5000CCA02A1292DDd0p2    ONLINE       0     0     0
>> cache
>>   c4t5000CCA02A1292DDd0p3    ONLINE       0     0     0
>> errors: No known data errors
>> pool: rpool
>> state: ONLINE
>> status: Some supported features are not enabled on the pool. The pool can
>>         still be used, but some features are unavailable.
>> action: Enable all features using 'zpool upgrade'. Once this is done,
>>         the pool may no longer be accessible by software that does not
>>         support the features. See zpool-features(5) for details.
>> scan: scrub repaired 0 in 3h3m with 0 errors on Thu Oct  8 04:12:35 2015
>> config:
>> NAME                       STATE     READ WRITE CKSUM
>> rpool                      ONLINE       0     0     0
>>   c4t5000CCA02A1292DDd0s0  ONLINE       0     0     0
>> errors: No known data errors
>
>> # zpool list -v
>> NAME                        SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
>> pool                       2.72T  2.52T   207G         -    68%    92%  1.36x  ONLINE  /
>>   mirror                   2.72T  2.52T   207G         -    68%    92%
>>     c0t3d0                     -      -      -         -      -      -
>>     c0t5d0                     -      -      -         -      -      -
>> log                            -      -      -         -      -      -
>>   c4t5000CCA02A1292DDd0p2     8G   148K  8.00G         -     0%     0%
>> cache                          -      -      -         -      -      -
>>   c4t5000CCA02A1292DDd0p3   120G  1.80G   118G         -     0%     1%
>> rpool                       151G  33.0G   118G         -    76%    21%  1.00x  ONLINE  -
>>   c4t5000CCA02A1292DDd0s0   151G  33.0G   118G         -    76%    21%
>
>> Note that the long scrub time may include the downtime while the
>> system was frozen, until it was rebooted.
>>
>> Thanks in advance for the fresh pairs of eyeballs,
>> Jim Klimov
>> _______________________________________________
>> OmniOS-discuss mailing list
>> OmniOS-discuss at lists.omniti.com
>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
Mail apparently got bounced, reposting...
--
Typos courtesy of K-9 Mail on my Samsung Android