IBM Power 595 Technical Overview and Introduction
4.1 Reliability
Highly reliable systems are built with highly reliable components. IBM POWER6
processor-based systems extend this basic principle with a clear design-for-reliability
architecture and methodology. A concentrated, systematic, architecture-based approach is
intended to improve overall system reliability with each successive generation of system
offerings.
4.1.1 Designed for reliability
Systems that have fewer components and interconnects have fewer opportunities to fail.
Simple design choices such as integrating two processor cores on a single POWER chip can
dramatically reduce the opportunity for system failures. In this case, a 4-core server
includes half as many processor chips (and chip socket interfaces) as a design with a single
CPU core per processor chip. This reduces the total number of system components, and it
also reduces the total amount of heat generated in the design, resulting in an
additional reduction in required power and cooling components.
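The chip-count arithmetic behind this claim can be sketched as follows. This is a minimal illustration rather than any IBM tool: the function name chips_needed is hypothetical, and the core counts are chosen only to match the 4-core example above.

```python
def chips_needed(total_cores: int, cores_per_chip: int) -> int:
    """Return how many processor chips (and sockets) supply total_cores.

    Hypothetical helper for illustration; rounds up so that a partially
    populated chip still counts as one chip.
    """
    if total_cores <= 0 or cores_per_chip <= 0:
        raise ValueError("core counts must be positive")
    # Integer ceiling division without floating point.
    return -(-total_cores // cores_per_chip)


if __name__ == "__main__":
    single_core_design = chips_needed(4, 1)  # one core per chip
    dual_core_design = chips_needed(4, 2)    # two cores per chip
    print(single_core_design, dual_core_design)  # prints: 4 2
```

Each chip removed also removes a socket interface, which is why the dual-core design has fewer points of potential failure.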
Parts selection also plays a critical role in overall system reliability. IBM uses three grades of
components, with Grade 3 defined as industry standard (off-the-shelf). As shown in
Figure 4-1, using stringent design criteria and an extensive testing program, the IBM
manufacturing team can produce Grade 1 components that are expected to be 10 times more
reliable than industry standard. Engineers select Grade 1 parts for the most critical system
components. Newly introduced organic packaging technologies, rated Grade 5, achieve the
same reliability as Grade 1 parts.
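The grade comparison can be expressed as relative failure rates. The sketch below assumes that "10 times more reliable" means one-tenth the failure rate of a Grade 3 part, which is an interpretation and not a figure stated in the text. The table name, the function name, and the baseline failure count are hypothetical.

```python
# Failure rate relative to Grade 3 (industry standard = 1.0).
# Assumption: "10 times more reliable" is read as one-tenth the failure rate.
RELATIVE_FAILURE_RATE = {
    "Grade 3": 1.0,  # industry standard (off-the-shelf)
    "Grade 1": 0.1,  # expected to be 10 times more reliable
    "Grade 5": 0.1,  # organic packaging, same reliability as Grade 1
}


def expected_failures(grade: str, grade3_failures: float) -> float:
    """Scale a Grade 3 failure count to the given component grade."""
    if grade3_failures < 0:
        raise ValueError("failure count must be non-negative")
    return RELATIVE_FAILURE_RATE[grade] * grade3_failures


if __name__ == "__main__":
    baseline = 50.0  # hypothetical Grade 3 failures over some period
    for grade in RELATIVE_FAILURE_RATE:
        print(f"{grade}: {expected_failures(grade, baseline):.1f}")
```

Under this assumption, a population of parts that would see 50 failures at Grade 3 would be expected to see about 5 at Grade 1 or Grade 5.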
Figure 4-1 Component failure rates
(Bar chart: component failure rates on a vertical scale from 0 to 1 for Grade 3, Grade 1, and Grade 5 components)