OpenPower 720 Technical Overview and Introduction
hardware, system firmware, and system interaction have been designed to allow transparent
recovery of intermittent PCI bus parity errors and graceful transition to the I/O device
available state in the case of a permanent parity error in the PCI bus.
EEH-enabled adapters respond to a special data packet generated from the affected PCI slot
hardware by calling system firmware, which will examine the affected bus, allow the device
driver to reset it, and continue without a system reboot.
Persistent deallocation functions include:
Processor
Memory
Deconfigure or bypass failing I/O adapters
Following a hardware error that has been flagged by the service processor, the subsequent
reboot of the system will invoke extended diagnostics. If a processor or L3 cache has been
marked for deconfiguration by persistent processor deallocation, the boot process will attempt
to proceed to completion with the faulty device automatically deconfigured. Failing I/O
adapters will be deconfigured or bypassed during the boot process.
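The boot-time behavior described above can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the resource names, the set of marked resources, and the function plan_boot are hypothetical and do not correspond to any OpenPower firmware interface.

```python
def plan_boot(resources, marked_for_deconfiguration):
    """Split resources into those kept active and those skipped at boot.

    Hypothetical model: anything previously marked for deconfiguration
    is left out, and the boot continues with the remaining resources.
    """
    active = [r for r in resources if r not in marked_for_deconfiguration]
    skipped = [r for r in resources if r in marked_for_deconfiguration]
    return active, skipped


# Illustrative inventory; the names are made up for this example.
inventory = ["cpu0", "cpu1", "l3cache0", "pci-slot3"]
marked = {"cpu1", "pci-slot3"}
active, skipped = plan_boot(inventory, marked)
```

In this model the boot never stops on a marked resource; it simply proceeds with whatever remains configured, which mirrors the "proceed to completion" behavior described in the text.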
3.1.8 Serviceability
Increasing service productivity means the system is up and running for a longer time.
OpenPower improves service productivity by providing the functions described in the
following subsections:
Error indication and LED indicators
The OpenPower 720 is designed for customer setup of the machine and for the subsequent
addition of most hardware features. The OpenPower 720 also allows customers to replace
service parts (Customer Replaceable Units, or CRUs). To accomplish this, the system provides internal
LED diagnostics that identify parts that require service. The error is indicated
through a series of attention lights, starting with the System Attention LED
on the front of the system and ending with an LED near the
failing Field Replaceable Unit (FRU).
For more information about Customer Replaceable Units, including videos, see:
http://publib.boulder.ibm.com/eserver
System Attention LED
The attention indicator is represented externally by an amber LED on the operator panel and
the back of the system unit. It is used to indicate that the system is in one of the following
states:
Normal state, LED is off.
Fault state, LED is on solid.
Identify state, LED is blinking.
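The three states listed above map naturally onto a small lookup, sketched here in Python. The class and function names are hypothetical and chosen for illustration only; the state-to-LED mapping itself is the one given in the list.

```python
from enum import Enum


class AttentionState(Enum):
    """System Attention LED states, keyed by the observed LED behavior."""
    NORMAL = "off"
    FAULT = "on solid"
    IDENTIFY = "blinking"


def state_from_led(observed):
    """Return the system state for an observed LED behavior.

    Raises ValueError if the behavior is not one of the three listed.
    """
    return AttentionState(observed.strip().lower())
```

For example, state_from_led("blinking") returns AttentionState.IDENTIFY, while an unlisted behavior such as "green" raises ValueError.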
Additional LEDs on I/O components such as PCI-X slots and disk drives provide status
information such as power, hot-swap, and need for service.
Note: The auto-restart (reboot) option, when enabled, can reboot the system automatically
following an unrecoverable software error, software hang, hardware failure, or
environmentally induced failure (such as loss of power supply).