IBM System p5 550 and 550Q Technical Overview and Introduction
3.1 Reliability, availability, and serviceability
Excellent quality and reliability are inherent in all aspects of the IBM System p processor
design and manufacturing. The fundamental objective of the design approach is to minimize
outages. The RAS features help to ensure that the system operates when required, performs
reliably, and efficiently handles any failures that might occur. This is achieved using
capabilities that are provided by both the hardware and the AIX 5L operating system.
The p5-550 and p5-550Q servers enhance the RAS capabilities that are
implemented in POWER4-based systems. The following RAS enhancements are available on
POWER5-based servers:
Most firmware updates allow the system to remain operational.
The ECC has been extended to inter-chip connections for the fabric and processor bus.
Partial L2 cache deallocation is possible.
The number of L3 cache line deletes improved from two to ten for better self-healing
capability.
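The L3 cache line delete budget mentioned above can be pictured with a small model. This is only a sketch under stated assumptions: the class name, the per-line error threshold, and the returned status strings are hypothetical and do not come from the system firmware. Only the limit of ten line deletes is taken from the text.

```python
MAX_LINE_DELETES = 10  # limit stated in the text (previously two)
ERROR_THRESHOLD = 3    # hypothetical: correctable errors tolerated per line


class L3LineDeleteTracker:
    """Toy model of self-healing by taking faulty cache lines out of use."""

    def __init__(self):
        self.error_counts = {}  # cache line index -> correctable errors seen
        self.deleted = set()    # cache lines already removed from use

    def record_correctable_error(self, line: int) -> str:
        """Count one correctable error and report the resulting action."""
        if line in self.deleted:
            return "already-deleted"
        self.error_counts[line] = self.error_counts.get(line, 0) + 1
        if self.error_counts[line] < ERROR_THRESHOLD:
            return "monitoring"
        if len(self.deleted) < MAX_LINE_DELETES:
            self.deleted.add(line)
            return "deleted"
        # Delete budget exhausted: the model can no longer heal itself.
        return "service-required"


tracker = L3LineDeleteTracker()
outcomes = []
for line in range(11):
    for _ in range(ERROR_THRESHOLD):
        outcome = tracker.record_correctable_error(line)
    outcomes.append(outcome)
```

In this model the first ten failing lines are deleted and the eleventh is flagged for service, which illustrates why a larger delete budget extends the time the cache can continue to heal itself.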
The following sections describe the concepts that form the basis of leadership RAS features
of IBM System p5 systems in more detail.
3.1.1 Fault avoidance
System p5 servers are built on a quality-based design that is intended to keep errors from
happening. This design includes the following features:
Reduced power consumption and cooler operating temperatures for increased reliability,
enabled by the use of copper chip circuitry, silicon-on-insulator, and dynamic
clock gating
Mainframe-inspired components and technologies
3.1.2 First-failure data capture
If a problem should occur, the ability to diagnose that problem correctly is a fundamental
requirement upon which improved availability is based. The p5-550 and p5-550Q incorporate
advanced capability in start-up diagnostics and in runtime first-failure data capture (FFDC)
based on strategic error checkers built into the chips.
Any errors detected by the pervasive error checkers are captured into Fault Isolation
Registers (FIRs), which can be interrogated by the service processor. The service processor
can access system components through special-purpose ports or by reading
the error registers. Figure 3-1 on page 79 shows a schematic of a Fault Register
Implementation.
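The capture-then-interrogate flow described above can be sketched as a small model. This is an illustration only: the checker names, the bit assignments, and the class and method names are hypothetical, and real FIR layouts are chip-specific. The sketch shows sticky error bits that are set when a checker fires, with the first failure preserved so that later, secondary errors do not obscure the root cause.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical checker-to-bit assignment; real FIR layouts are chip-specific.
CHECKER_BITS = {"l2_parity": 0, "fabric_ecc": 1, "proc_bus_ecc": 2}


@dataclass
class FaultIsolationRegister:
    """Toy model of a FIR: one sticky bit per error checker."""
    bits: int = 0
    first_failure: Optional[str] = None  # checker that fired first

    def report(self, checker: str) -> None:
        """Called by an error checker at the moment it detects a fault."""
        if self.bits == 0:
            # Remember the first failure; later errors only set their bits.
            self.first_failure = checker
        self.bits |= 1 << CHECKER_BITS[checker]

    def interrogate(self) -> dict:
        """What a service processor might read back after a failure."""
        active = [name for name, bit in CHECKER_BITS.items()
                  if (self.bits >> bit) & 1]
        return {"first_failure": self.first_failure, "active": active}


fir = FaultIsolationRegister()
fir.report("fabric_ecc")  # root cause
fir.report("l2_parity")   # secondary error triggered afterwards
snapshot = fir.interrogate()
```

Because the data is captured when the fault first occurs, the model does not need the failure to be reproduced in order to identify which checker fired first.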