3-2
SPARC Enterprise T1000 Server Service Manual • April 2007
■ ALOM CMT firmware – The system firmware that runs on the system
controller. In addition to providing the interface between the hardware and OS,
ALOM CMT also tracks and reports the health of key server components. ALOM
CMT works closely with POST and Solaris Predictive Self-Healing technology to
keep the system up and running even when there is a faulty component.
■ Power-on self-test (POST) – Performs diagnostics on system components upon
system reset to ensure the integrity of those components. POST is configurable
and works with ALOM CMT to take faulty components offline if needed and
blacklist them in the asr-db.
■ Solaris OS Predictive Self-Healing (PSH) – This technology continuously
monitors the health of the CPU and memory, and works with ALOM CMT to take
a faulty component offline if needed. The Predictive Self-Healing technology
enables systems to accurately predict component failures and mitigate many
serious problems before they occur.
■ Log files and console messages – Provide the standard Solaris OS log files and
investigative commands that can be accessed and displayed on the device of your
choice.
■ SunVTS™ – An application that exercises the system, provides hardware
validation, and discloses possible faulty components with recommendations for
repair.

The LEDs, ALOM CMT, Solaris OS PSH, and many of the log files and console
messages are integrated. For example, when the Solaris PSH software detects a
fault, it displays the fault, logs it, and passes information to ALOM CMT, where
it is also logged. Depending on the fault, one or more LEDs might illuminate.
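
The integration described above can be pictured with a small model. This is only an illustrative sketch, not part of the server software: the class name, field names, fault strings, and the LED label are all hypothetical. It shows a single PSH-detected fault being displayed, written to the Solaris log, passed to the ALOM CMT log, and optionally lighting LEDs.

```python
from dataclasses import dataclass, field


@dataclass
class FaultFlow:
    """Toy model of the fault path described in the text (all names hypothetical)."""
    console: list = field(default_factory=list)      # messages displayed to the operator
    solaris_log: list = field(default_factory=list)  # stands in for the Solaris OS log files
    alom_log: list = field(default_factory=list)     # stands in for the ALOM CMT log
    leds_lit: set = field(default_factory=set)       # LEDs currently illuminated

    def psh_detects(self, fault: str, leds: tuple = ()) -> None:
        """Record one PSH-detected fault in every place the text mentions."""
        self.console.append(fault)      # the fault is displayed
        self.solaris_log.append(fault)  # the fault is logged by Solaris
        self.alom_log.append(fault)     # information is passed to ALOM CMT and logged
        self.leds_lit.update(leds)      # depending on the fault, LEDs might illuminate


flow = FaultFlow()
flow.psh_detects("example memory fault", leds=("example-led",))
flow.psh_detects("example minor fault")  # this fault lights no LED
```

The practical point of the model is that the same fault can be found in several places, so checking any one of them (console, Solaris logs, ALOM CMT log, LEDs) can lead you to the others.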

The flow chart in FIGURE 3-1 describes an approach for using the server
diagnostics to identify a faulty field-replaceable unit (FRU). The diagnostics you use,
and the order in which you use them, depend on the nature of the problem you are
troubleshooting, so you might perform some actions and not others.

The flow chart assumes that you have already performed some troubleshooting such
as verification of proper installation and visual inspection of cables and power, and
possibly performed a reset of the server (refer to the SPARC Enterprise T1000 Server
Installation Guide and SPARC Enterprise T1000 Server Administration Guide for
details).

FIGURE 3-1 is a flow chart of the diagnostics available to troubleshoot faulty
hardware.
has more information about each diagnostic in this chapter.

Note – POST is configured with ALOM CMT configuration variables. If diag_level is
set to max (diag_level=max), POST reports all detected FRUs, including memory
devices with errors correctable by Predictive Self-Healing (PSH). Thus, not all
memory devices detected by POST need to be replaced. See Section 3.4.5,
“Correctable Errors Detected by POST” on page 3-35.
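
The consequence of the note can be sketched as a simple sorting step. The record type, field names, and FRU labels below are hypothetical and do not correspond to real POST output. The sketch assumes you have already determined, by following Section 3.4.5, whether each reported memory error is correctable by PSH.

```python
from dataclasses import dataclass


@dataclass
class PostReport:
    """One FRU reported by POST (hypothetical structure, not real POST output)."""
    fru: str                  # hypothetical FRU label
    is_memory: bool           # True if the FRU is a memory device
    correctable_by_psh: bool  # True if the error is correctable by PSH


def split_reports(reports):
    """Separate PSH-correctable memory reports from all other reports.

    Returns (correctable, others). Reports in `correctable` might not call
    for replacement; reports in `others` need the usual follow-up.
    """
    correctable = [r.fru for r in reports if r.is_memory and r.correctable_by_psh]
    others = [r.fru for r in reports if not (r.is_memory and r.correctable_by_psh)]
    return correctable, others


reports = [
    PostReport("example-dimm-0", is_memory=True, correctable_by_psh=True),
    PostReport("example-dimm-1", is_memory=True, correctable_by_psh=False),
    PostReport("example-board", is_memory=False, correctable_by_psh=False),
]
correctable, others = split_reports(reports)
```

Keeping the two groups separate reflects the note's point: a memory device appearing in POST output at diag_level=max is not, by itself, a reason to replace it.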

FIGURE 3-1 Diagnostic Flow Chart