[OmniOS-discuss] Fault detection

steve at linuxsuite.org
Fri Feb 1 10:26:35 EST 2013


  Howdy!

        I am trying to test the fault management system, so I yanked out
one of the disks in the mirrored rpool. The system detects the failure,
resilvers onto the hot spare, and syslogs the condition, but the fault
management system doesn't seem to notice (as per "fmadm faulty -a", see
below) and I don't get a notification.

Shouldn't the fm facility see this "error"? If not, how can I test failed
disk detection/notification? Removing one of the disks in a mirrored volume
should be enough, shouldn't it?

root@live-dfs-2:~# zpool status rpool
  pool: rpool
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: resilvered 13.9G in 0h2m with 0 errors on Thu Jan 31 11:54:40 2013
config:

        NAME            STATE     READ WRITE CKSUM
        rpool           DEGRADED     0     0     0
          mirror-0      DEGRADED     0     0     0
            c1t0d0s0    ONLINE       0     0     0
            spare-1     DEGRADED     0     0     0
              c1t1d0s0  REMOVED      0     0     0
              c1t2d0s0  ONLINE       0     0     0
        spares
          c1t2d0s0      INUSE     currently in use

errors: No known data errors
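
Until the notification question is sorted out, the pool state itself can be checked from a script. Below is a minimal sketch, assuming Python is available on the box and that the `zpool status` layout matches the output above. The function name `unhealthy_devices` and the set of states treated as healthy are my own choices for illustration, not anything provided by ZFS or fmd.

```python
# Hypothetical helper: scan the "config:" section of `zpool status` text
# and report every entry whose STATE column is not a healthy value.
HEALTHY_STATES = ("ONLINE", "INUSE", "AVAIL")  # assumption: treat these as fine

SAMPLE = """\
  pool: rpool
 state: DEGRADED
config:

        NAME            STATE     READ WRITE CKSUM
        rpool           DEGRADED     0     0     0
          mirror-0      DEGRADED     0     0     0
            c1t0d0s0    ONLINE       0     0     0
            spare-1     DEGRADED     0     0     0
              c1t1d0s0  REMOVED      0     0     0
              c1t2d0s0  ONLINE       0     0     0
        spares
          c1t2d0s0      INUSE     currently in use

errors: No known data errors
"""


def unhealthy_devices(status_text):
    """Return (name, state) pairs for config entries that are not healthy."""
    results = []
    in_config = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("config:"):
            in_config = True
            continue
        if not in_config:
            continue
        if stripped.startswith("errors:"):
            break
        fields = stripped.split()
        # Skip blank lines, the "spares" heading, and the column header.
        if len(fields) < 2 or fields[0] == "NAME":
            continue
        name, state = fields[0], fields[1]
        if state not in HEALTHY_STATES:
            results.append((name, state))
    return results


if __name__ == "__main__":
    for name, state in unhealthy_devices(SAMPLE):
        print(name, state)
```

In practice the text would come from running `zpool status rpool` (for example via `subprocess`) rather than from the embedded sample, and a non-empty result could trigger a mail or syslog message from cron.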


  fmadm sees only a historical rsyncd service fault, for which I did get a
notification:


root@live-dfs-2:~# fmadm faulty -a
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Jan 28 11:52:22 2d48eac3-ce73-e0f0-9487-b2aee3fbeb74  SMF-8000-YX    major

Host        : dfs2
Platform    : PowerEdge-R710    Chassis_id  : 93FCWL1
Product_sn  :

Fault class : defect.sunos.smf.svc.maintenance
Affects     : svc:///network/rsyncd:default
                  ok and in service
Problem in  : svc:///network/rsyncd:default
                  repair attempted

Description : A service failed - a method is failing in a retryable manner but
              too often.
              Refer to http://illumos.org/msg/SMF-8000-YX for more information.

Response    : The service has been placed into the maintenance state.

Impact      : svc:/network/rsyncd:default is unavailable.

Action      : Run 'svcs -xv svc:/network/rsyncd:default' to determine the
              generic reason why the service failed, the location of any
              logfiles, and a list of other services impacted.
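
To confirm from a script that no disk-related fault has been diagnosed, the "Fault class" lines can be pulled out of the `fmadm faulty -a` text. This is only a rough sketch: the helper name `fault_classes` is made up for illustration, and the `fault.fs.zfs` prefix used in the check is an assumption about how ZFS faults would be named if fmd had diagnosed one.

```python
import re

# Trimmed copy of the fmadm output above, used only as sample input.
SAMPLE_FMADM = """\
Fault class : defect.sunos.smf.svc.maintenance
Affects     : svc:///network/rsyncd:default
                  ok and in service
Problem in  : svc:///network/rsyncd:default
                  repair attempted
"""


def fault_classes(fmadm_text):
    """Return the value of every 'Fault class :' line in the given text."""
    return re.findall(r"^Fault class\s*:\s*(\S+)", fmadm_text, flags=re.MULTILINE)


def has_zfs_fault(fmadm_text, prefix="fault.fs.zfs"):
    """True if any reported fault class starts with the (assumed) ZFS prefix."""
    return any(cls.startswith(prefix) for cls in fault_classes(fmadm_text))


if __name__ == "__main__":
    print(fault_classes(SAMPLE_FMADM))
    print(has_zfs_fault(SAMPLE_FMADM))
```

Run against the output above, this reports only the SMF defect and no ZFS fault, which matches what I am seeing by eye.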





