Sun Netra T5440 Server • September 2015
1.2.1.2 Memory Fault Handling
The server uses an advanced ECC technology, called chipkill, that corrects up to 4 bits in error on nibble boundaries, as long as all of the bits are in the same DRAM. If a DRAM fails, the FB-DIMM continues to function.
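The correctability condition described above can be illustrated with a short Python sketch. This is not the server's ECC implementation. It assumes, purely for illustration, that each DRAM supplies one nibble-aligned group of 4 bits in the protected word, and the function name `chipkill_correctable` is hypothetical.

```python
def chipkill_correctable(error_bits):
    """Return True if the flipped bit positions fit the rule described above.

    Assumption (not from the manual): each DRAM supplies one aligned
    4-bit nibble, so bit position b belongs to DRAM number b // 4.
    """
    error_bits = set(error_bits)
    if not error_bits:
        return True  # nothing to correct
    # All erroneous bits must come from the same DRAM (same aligned nibble).
    # A single nibble holds at most 4 bits, so the 4-bit limit follows.
    drams = {bit // 4 for bit in error_bits}
    return len(drams) == 1


print(chipkill_correctable({8, 9, 10, 11}))  # four bits, all in one nibble
print(chipkill_correctable({3, 4}))          # two bits, but in two nibbles
```

Under that assumption, the first call prints True and the second prints False: the number of bad bits matters less than whether they all sit in the same DRAM.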
The following server features independently manage memory faults:
■ POST – Based on ILOM configuration variables, POST runs when the server is powered on.
For correctable memory errors (CEs), POST forwards the error to the Solaris Predictive Self-Healing (PSH) daemon for error handling. If an uncorrectable memory fault is detected or if a “storm” of CEs is detected, POST displays the fault with the device name of the faulty FB-DIMMs, logs the fault, and disables the faulty FB-DIMMs by placing them in the ASR blacklist. Depending on the memory configuration and the location of the faulty FB-DIMM, POST disables half of physical memory in the system, or half the physical memory and half the processor threads. When this offlining process occurs in normal operation, you must replace the faulty FB-DIMMs based on the fault message. You then must enable the disabled FB-DIMMs with the ALOM CMT CLI enablecomponent command.
■ Solaris Predictive Self-Healing (PSH) technology – A feature of the Solaris OS that uses the fault manager daemon (fmd) to watch for various kinds of faults. When a fault occurs, the fault is assigned a unique fault ID (UUID) and logged. PSH reports the fault and provides a recommended proactive replacement for the FB-DIMMs associated with the fault.
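The POST behavior described in the list above can be sketched as a small decision routine. This is a simplified model, not POST's actual logic: the manual does not say how many CEs constitute a "storm", so `CE_STORM_THRESHOLD` is a hypothetical value, and the names `PostResult`, `handle_memory_errors`, and the `DIMM_A`-style device labels are placeholders invented for this example.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: the manual does not define the size of a CE "storm".
CE_STORM_THRESHOLD = 10


@dataclass
class PostResult:
    forwarded_to_psh: list = field(default_factory=list)  # CEs handed to PSH
    blacklisted: list = field(default_factory=list)       # disabled FB-DIMMs


def handle_memory_errors(errors):
    """Classify (dimm, kind) events, where kind is "CE" or "UE"."""
    result = PostResult()
    ce_counts = {}
    for dimm, kind in errors:
        if kind == "UE":
            disable = True  # uncorrectable fault: disable the FB-DIMM
        else:
            ce_counts[dimm] = ce_counts.get(dimm, 0) + 1
            disable = ce_counts[dimm] >= CE_STORM_THRESHOLD  # CE storm
        if disable:
            if dimm not in result.blacklisted:
                result.blacklisted.append(dimm)  # stands in for the ASR blacklist
        else:
            result.forwarded_to_psh.append((dimm, kind))  # PSH handles the CE
    return result


events = [("DIMM_A", "CE"), ("DIMM_B", "UE")]
events += [("DIMM_C", "CE")] * CE_STORM_THRESHOLD
outcome = handle_memory_errors(events)
print(outcome.blacklisted)
```

In this model an isolated CE is only forwarded, while a UE or a run of CEs on one FB-DIMM puts that device on the blacklist. The real server additionally disables part of physical memory and possibly processor threads, which the sketch does not attempt to represent.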
1.2.1.3 Troubleshooting Memory Faults
If you suspect that the server has a memory problem, follow the flowchart and run the ALOM CMT compatibility CLI (in ILOM) showfaults command (see “… With the Service Processor” on page 24). The showfaults command lists memory faults and the specific FB-DIMMs that are associated with each fault. Once you identify which FB-DIMMs to replace, see the chapter that contains the FB-DIMM replacement instructions. You must perform the instructions in that chapter to clear the faults and enable the replaced FB-DIMMs.
1.3 Using LEDs to Identify the State of Devices
The server provides the following groups of LEDs:
■ “1.3.1 Front and Rear Panel LEDs” on page 17
■ “1.3.2 Hard Drive LEDs” on page 19
■ “1.3.3 Power Supply LEDs” on page 20
Sun Netra T5440 Server Service Manual, Part No. E27132-03, September 2015