SPARC Enterprise M3000 Server Service Manual • January 2009
Predictive self-healing is an architecture and methodology for automatically
diagnosing, reporting, and handling software and hardware error conditions. This
new technology reduces the time required to debug a hardware or software problem
and provides the administrator and service engineer with detailed data about each
error.
3.4.1 Predictive Self-Healing Tools
In the Solaris OS, Solaris Fault Manager runs in the background. When an error
occurs, the system software recognizes the error and attempts to determine the
faulty hardware component. The system software also takes steps to prevent the
faulty component from being used until it has been replaced. The system software
performs the following activities:
■ Receives telemetry information about errors detected by the system software.
■ Diagnoses the errors.
■ Initiates predictive self-healing activities. For example, Solaris Fault Manager can
disable faulty components.
■ When possible, causes the faulty FRU to provide an LED indication of the error in
addition to populating system console messages with more details.
TABLE 3-3 shows typical messages generated when an error occurs. Messages are
displayed on your console and are recorded in the /var/adm/messages file.
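As a rough illustration of working with these recorded messages, the following Python sketch pulls event IDs out of log text. It assumes that each fault message carries an EVENT-ID: field followed by a UUID, as in the sample output in this section. The function name find_event_ids and the sample string are hypothetical and are not part of the Solaris OS.

```python
import re

# Assumption: fault messages carry an "EVENT-ID:" field followed by a UUID,
# as in the sample output shown in this section.
EVENT_ID_RE = re.compile(
    r"EVENT-ID:\s*([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})",
    re.IGNORECASE,
)


def find_event_ids(log_text):
    """Return the event IDs found in log_text, in order, without duplicates."""
    seen = []
    for match in EVENT_ID_RE.finditer(log_text):
        event_id = match.group(1).lower()
        if event_id not in seen:
            seen.append(event_id)
    return seen


# Hypothetical sample text modeled on the messages in this section.
sample = (
    "Nov 1 16:30:20 dt88-292 SOURCE:eft, REV: 1.13\n"
    "Nov 1 16:30:20 dt88-292 EVENT-ID:afc7e660-d609-4b2f-86b8-ae7c6b8d50c4\n"
)
print(find_event_ids(sample))
```

To scan the real log, the text of /var/adm/messages could be read with open() and passed to the same function. Reading that file may require appropriate privileges.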
A message in TABLE 3-3 indicates that the fault has already been diagnosed. Any
corrective action that the system could take has already been taken. If your server
is still running, the corrective action continues to be applied.
TABLE 3-3 Predictive Self-Healing Messages

Output displayed | Description
Nov 1 16:30:20 dt88-292 EVENT-TIME:Tue Nov 1 16:30:20 PST 2005 | EVENT-TIME: The time stamp of the diagnosis
Nov 1 16:30:20 dt88-292 PLATFORM:SUNW,A70, CSN:-, HOSTNAME:dt88-292 | PLATFORM: A description of the server encountering the error
Nov 1 16:30:20 dt88-292 SOURCE:eft, REV: 1.13 | SOURCE: Information on the Diagnosis Engine used to determine the error
Nov 1 16:30:20 dt88-292 EVENT-ID:afc7e660-d609-4b2f-86b8-ae7c6b8d50c4 | EVENT-ID: The Universally Unique event ID for this error
Nov 1 16:30:20 dt88-292 DESC: | DESC: A basic description of the error
Nov 1 16:30:20 dt88-292 A problem was detected in the PCI Express subsystem |
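The fields in TABLE 3-3 can be separated programmatically. The following Python sketch is one possible way to do this, not a Solaris tool. It assumes that every line begins with a syslog-style prefix (month, day, time, host name) and that a line without a field label continues the previous field, as the DESC text does in the table. The names parse_fault_message, PREFIX_RE, and FIELD_RE are hypothetical.

```python
import re

# Assumption: each line starts with "<Mon> <day> <hh:mm:ss> <hostname> ",
# matching the sample output in TABLE 3-3.
PREFIX_RE = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\S+\s?")
# Only the field labels that appear in TABLE 3-3 are recognized here.
FIELD_RE = re.compile(r"^(EVENT-TIME|PLATFORM|SOURCE|EVENT-ID|DESC):\s*(.*)$")


def parse_fault_message(lines):
    """Map each field label to its text for one fault message."""
    fields = {}
    current = None
    for line in lines:
        # Drop the syslog-style prefix, then look for a field label.
        body = PREFIX_RE.sub("", line.rstrip("\n"), count=1).strip()
        match = FIELD_RE.match(body)
        if match:
            current = match.group(1)
            fields[current] = match.group(2).strip()
        elif current is not None and body:
            # An unlabeled line is treated as a continuation of the last field.
            fields[current] = (fields[current] + " " + body).strip()
    return fields


# Sample lines copied from TABLE 3-3.
sample_lines = [
    "Nov 1 16:30:20 dt88-292 EVENT-TIME:Tue Nov 1 16:30:20 PST 2005",
    "Nov 1 16:30:20 dt88-292 PLATFORM:SUNW,A70, CSN:-, HOSTNAME:dt88-292",
    "Nov 1 16:30:20 dt88-292 SOURCE:eft, REV: 1.13",
    "Nov 1 16:30:20 dt88-292 EVENT-ID:afc7e660-d609-4b2f-86b8-ae7c6b8d50c4",
    "Nov 1 16:30:20 dt88-292 DESC:",
    "Nov 1 16:30:20 dt88-292 A problem was detected in the PCI Express subsystem",
]
fields = parse_fault_message(sample_lines)
for label, text in fields.items():
    print(label, "=>", text)
```

Treating unlabeled lines as continuations keeps the sketch short, but it would also absorb unrelated log lines that happen to follow a fault message, so real log text would likely need stricter filtering.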