Chapter 3
Server Diagnostics
3-7
3.1.1.1 Memory Configuration
The server has eight slots that hold DDR-2 memory DIMMs in the following
DIMM sizes:
■ 512 MB (maximum of 4 GB)
■ 1 GB (maximum of 8 GB)
■ 2 GB (maximum of 16 GB)
■ 4 GB (maximum of 32 GB)
All DIMMs installed must be the same size, and DIMMs must be added four at a
time. In addition, Rank 0 memory must be fully populated for the server to function.
See Section 5.6.2, “Installing DIMMs” on page 5-21, for instructions about adding
memory to the server.
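The population rules above can be expressed as a short check. The following Python sketch is illustrative only: the function and constant names are hypothetical and are not part of any server tool, and it models only DIMM counts and sizes. Because this section does not say which slots make up Rank 0, the sketch cannot verify slot positions; it only rejects an empty configuration.

```python
# Minimal sketch of the DIMM population rules described above.
# All names here are hypothetical, chosen for illustration.

ALLOWED_DIMM_SIZES_MB = (512, 1024, 2048, 4096)
SLOT_COUNT = 8   # eight DIMM slots in the server
GROUP_SIZE = 4   # DIMMs must be added four at a time


def check_dimm_config(sizes_mb):
    """Return a list of rule violations for the installed DIMM sizes (in MB)."""
    problems = []
    count = len(sizes_mb)
    if count == 0:
        problems.append("no DIMMs installed; Rank 0 must be fully populated")
    if count > SLOT_COUNT:
        problems.append("more DIMMs than the eight available slots")
    if count % GROUP_SIZE != 0:
        problems.append("DIMMs must be added four at a time")
    if len(set(sizes_mb)) > 1:
        problems.append("all DIMMs must be the same size")
    for size in sorted(set(sizes_mb)):
        if size not in ALLOWED_DIMM_SIZES_MB:
            problems.append("unsupported DIMM size: %d MB" % size)
    return problems


def total_memory_gb(sizes_mb):
    """Total installed memory in GB."""
    return sum(sizes_mb) / 1024


if __name__ == "__main__":
    full = [2048] * 8
    print(check_dimm_config(full))   # []
    print(total_memory_gb(full))     # 16.0
```

Eight 2 GB DIMMs pass every check and total 16 GB, which matches the maximum listed for that DIMM size.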
3.1.1.2 Memory Fault Handling
The server uses advanced ECC technology, also called chipkill, that corrects up to
4 bits in error on nibble boundaries, as long as the bits are all in the same DRAM.
If a DRAM fails, the DIMM continues to function.
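The nibble-boundary condition can be illustrated with a small sketch. It assumes, hypothetically, that each DRAM supplies one aligned 4-bit group (nibble) of the ECC word, so that bit positions 0 to 3 belong to one DRAM, 4 to 7 to the next, and so on. The actual bit-to-DRAM mapping is not given in this section, and the function name is illustrative only.

```python
# Minimal sketch, assuming each DRAM contributes one aligned nibble
# (4 bits) of the ECC word. This mapping is an assumption made for
# illustration, not the documented hardware layout.

NIBBLE_BITS = 4


def is_correctable(error_bit_positions):
    """True if all erroneous bits fall within a single aligned nibble."""
    nibbles = {pos // NIBBLE_BITS for pos in error_bit_positions}
    return len(nibbles) <= 1


if __name__ == "__main__":
    print(is_correctable([8, 9, 10, 11]))  # True: all four bits in nibble 2
    print(is_correctable([3, 4]))          # False: adjacent bits, two nibbles
```

Under this model, two flipped bits that straddle a nibble boundary are not covered, even though four flipped bits inside one nibble are.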
The following server features independently manage memory faults:
■ POST – Based on ALOM CMT configuration variables, POST runs when the
server is powered on. In normal operation, the default configuration of POST
(diag_level=min) provides a check to ensure the server will boot. Normal
operation applies to any boot of the server not intended to test power-on errors,
hardware upgrades, or repairs. Once the Solaris OS is running, PSH provides
run-time diagnosis of faults.
When a memory fault is detected, POST displays the fault with the device name
of the faulty DIMMs, logs the fault, and disables the faulty DIMMs by placing
them in the ASR blacklist. For a given memory fault, POST disables half of the
physical memory in the system. When this offlining process occurs in normal
operation, you must replace the faulty DIMMs based on the fault message and
then enable the disabled DIMMs with the ALOM CMT enablecomponent command.
In other than normal operation, POST can be configured to run various levels of
testing (see
and
) and can thoroughly test the memory
subsystem based on the purpose of the test. However, with thorough testing
enabled (diag_level=max), POST finds faults and offlines memory devices with
errors that could be correctable with PSH. Thus, not all memory devices detected
and offlined by POST need to be replaced. See Section 3.4.5, “Correctable Errors