IBM Eserver xSeries 455 Planning and Installation Guide
Chipkill memory
Chipkill is integrated into the XA-64 chipset and does not require special
Chipkill DIMMs. Chipkill corrects multiple single-bit errors to keep a DIMM
from failing. When Chipkill is combined with Memory ProteXion and Active
Memory, the x455 provides very high reliability in the memory subsystem.
Chipkill memory is approximately 100 times more effective than ECC
technology, providing correction for up to 4 bits per DIMM, whether on a single
chip or multiple chips.
If a memory chip error does occur, Chipkill is designed to automatically take
the inoperative memory chip offline while the server keeps running. The
memory controller provides memory protection similar in concept to disk array
striping with parity, writing the memory bits across multiple memory chips on
the DIMM. The controller is able to reconstruct the “missing” bit from the failed
chip and continue working as usual.
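The parity analogy above can be illustrated with a minimal sketch. It assumes one bit per memory chip plus a single XOR parity bit, which is the simplest scheme that matches the "striping with parity" comparison. The error-correcting code the memory controller actually uses is not described here and is likely more elaborate. The function names and the 8-bit word are hypothetical, chosen only for illustration.

```python
from functools import reduce


def stripe_with_parity(data_bits):
    """Spread data bits across hypothetical chips and append one XOR parity bit."""
    parity = reduce(lambda a, b: a ^ b, data_bits, 0)
    return list(data_bits) + [parity]


def reconstruct(stripe, failed_chip):
    """Recover the bit held by the failed chip by XOR-ing all surviving bits."""
    survivors = [bit for i, bit in enumerate(stripe) if i != failed_chip]
    return reduce(lambda a, b: a ^ b, survivors, 0)


# Illustrative 8-bit word: one bit per chip, with the parity bit on a ninth chip.
word = [1, 0, 1, 1, 0, 0, 1, 0]
stripe = stripe_with_parity(word)

# Pretend chip 2 has failed; its bit is rebuilt from the remaining chips.
recovered = reconstruct(stripe, failed_chip=2)
print(recovered == word[2])
```

Because the XOR of all bits in the stripe is zero, XOR-ing the survivors always yields the missing bit, whichever single chip fails.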
Chipkill support is provided in the memory controller and implemented using
standard RDIMMs, so it is transparent to the operating system.
In addition, to maintain the highest levels of system availability, if a memory error
is detected during POST or memory configuration, the server can automatically
disable the failing DIMM and continue operating with reduced memory capacity.
You can manually re-enable the memory bank after the problem is corrected via
the Setup/Configuration option in the EFI Firmware Boot Manager menu. EFI is the
Extensible Firmware Interface, the replacement for BIOS, as described in
“Extensible Firmware Interface” on page 29.
Memory ProteXion, memory mirroring, and Chipkill provide multiple levels of
redundancy to the memory subsystem. Combining Memory ProteXion with
Chipkill allows the server to tolerate up to two memory chip failures per memory
port (14 DIMMs), so the two memory ports together can sustain up to four memory
chip failures. Memory mirroring provides additional protection by allowing
operations to continue despite memory module failures.
1. The first failure detected by the Chipkill algorithm on each port does not
generate a light path diagnostics error, since Memory ProteXion recovers
from the problem automatically.
2. Each memory port could then sustain a second chip failure without shutting
down.
3. Provided that memory mirroring is enabled, the third chip failure on that port
would send the alert and take the DIMM offline, but keep the system running
out of the redundant memory bank.
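The three steps above can be read as a small per-port decision table. The following sketch models it under stated assumptions: the function name, its parameters, and the returned strings are hypothetical and do not correspond to any real firmware interface. Cases the steps do not cover are reported as such rather than guessed.

```python
def handle_chip_failure(prior_failures, mirroring_enabled):
    """Return the response to the next chip failure on one memory port.

    prior_failures: chip failures already seen on this port (hypothetical counter).
    mirroring_enabled: whether memory mirroring is turned on.
    """
    failure_number = prior_failures + 1
    if failure_number == 1:
        # Step 1: Memory ProteXion recovers, so no light path diagnostics error.
        return "recovered by Memory ProteXion, no light path alert"
    if failure_number == 2:
        # Step 2: the port sustains a second chip failure without shutting down.
        return "second failure sustained, port keeps running"
    if failure_number == 3 and mirroring_enabled:
        # Step 3: alert is sent and the DIMM goes offline; the mirror takes over.
        return "alert sent, DIMM offline, running from redundant memory bank"
    # Anything else is outside the sequence described in the text.
    return "not covered by the steps above"


for seen in range(3):
    print(seen + 1, handle_chip_failure(seen, mirroring_enabled=True))
```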
The combination of these technologies provides the most reliable memory
subsystem available.