Chapter 10. Hardware/software problem determination
How to use this information
This chapter helps diagnose problems associated with the eServer Cluster 1350.
The Cluster 1350 is an integrated Linux cluster that includes IBM and third-party
hardware and software components, such as server nodes and their associated
service processors, storage and networking subsystems, and the Cluster Systems
Management (CSM) and General Parallel File System (GPFS) software.
Problem determination involves identifying the likely cluster component where the
problem might have occurred, and following the relevant problem determination
steps for that component.
This chapter will aid in the diagnosis of problems down to the component level.
Once a failing component is identified, refer to that component’s product
documentation for further actions. Links to product Web sites and online
product documentation are provided in this chapter as appropriate.
Diagnosing hardware/software problems in a clustered environment requires a
basic understanding of how the components of the eServer Cluster 1350 function
together.
The cluster consists of:
v One or more 19″ racks.
v From 4 to 512 cluster nodes. Each node is an x335 server or a BladeCenter
chassis containing at least four blade servers. The nodes are configured to
run customer applications or to provide other services the customer requires,
such as a file server, network gateway, or storage server.
v One Management Node (an x345) for cluster systems management and
administration.
v A Management Ethernet VLAN used for secure hardware-control traffic. This
VLAN carries management traffic only; it is logically isolated for security
using the VLAN capability of the Cisco Ethernet switches and is accessible
only from the Management Node. The Cluster VLAN and Management VLAN share the
same physical Cisco switches.
v A Cluster VLAN used for other management traffic and for user traffic. The
Cisco switches integrated with the cluster serve both the Management Ethernet
VLAN and the Cluster Ethernet VLAN.
v Service processor networks. All nodes in the cluster are connected through
daisy-chained service processors (x335) and/or Remote Supervisor Adapter
cards. The first node in a daisy chain must have a Remote Supervisor Adapter,
which is Ethernet-connected to the Management Ethernet VLAN.
v A Terminal Server network for remote console, using the MRV In-Reach
terminal server. Optionally, the customer may elect to include an additional
network.
v A high-performance Myrinet 2000 cluster interconnect, or an additional
10/100 Ethernet network.
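Because all of these components are reachable from the Management Node, a first
problem-determination pass usually walks each node and checks its cluster-VLAN
reachability, its service-processor power state, and its serial console. The
sketch below generates those per-node commands as a dry run; the node names and
the CSM utilities shown (rpower, rconsole) are assumptions based on a typical
CSM-managed Cluster 1350 installation, not commands taken from this manual.

```shell
#!/bin/sh
# Hypothetical sketch: print, for each node, the checks an operator might run
# from the Management Node. This only *emits* the commands (a dry run), so it
# can be reviewed before anything is executed against real hardware.
gen_diag() {
  for node in "$@"; do
    printf '%s\n' \
      "ping -c 1 $node          # cluster-VLAN reachability" \
      "rpower -n $node query    # power state via the service processor" \
      "rconsole -t -n $node     # serial console via the terminal server"
  done
}

# Example: generate the checklist for two nodes.
gen_diag node001 node002
```

Keeping the script as a generator rather than running the commands directly
makes it safe to use while the failing component is still unknown.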
© Copyright IBM Corp. 2003