
Intel® Server Board SE7520BD2 Technical Product Specification
Error Reporting and Handling
Revision 1.3
Intel Confidential
5.5 Reliability, Availability and Serviceability (RAS) Features
5.5.1 Memory RAS features
The MCH is designed to bring enterprise-level reliability, availability, serviceability, usability, and
manageability to the DP server platform. The MCH supports ACPI power management and
wake-from-LAN to maximize platform stand-by flexibility.
RAS features include:
• Data protection – All internal data buses have some form of data protection
  o FSB Address and Data parity protection
  o Hublink even parity protection
  o Memory interface
• DRAM ECC
• Memory Scrubbing
• DDR II memory mirroring
• Sparing
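
As an illustration of the even parity protection listed above, the following minimal sketch computes and checks an even parity bit over a data value. The function names and the data widths used are hypothetical; the actual FSB and Hublink parity coverage is defined by the hardware and is not described here.

```python
def even_parity_bit(value: int) -> int:
    """Return the parity bit that makes the total number of 1 bits even."""
    return bin(value).count("1") % 2


def check_even_parity(value: int, parity_bit: int) -> bool:
    """Return True when the data bits plus the parity bit hold an even number of 1s."""
    return (bin(value).count("1") + parity_bit) % 2 == 0


# Sender side: attach the parity bit to the data (hypothetical 4-bit payload).
data = 0b1011
parity = even_parity_bit(data)

# Receiver side: a single flipped bit makes the check fail.
corrupted = data ^ 0b0100
link_ok = check_even_parity(data, parity)
link_error = not check_even_parity(corrupted, parity)
```

Parity of this kind detects any odd number of flipped bits but cannot correct them, which is why the memory interface relies on ECC instead.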
5.5.1.1 Memory scrubbing
Periodically, a memory scrubbing unit walks through all DRAM, issuing one read every 32K clocks.
Correctable errors found by a read are corrected, and the good data is then written back to
DRAM. Scrubbing does not cause any noticeable degradation of memory bandwidth,
although it does add latency to the rare read that is delayed by a scrub write cycle.
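
The read, correct, and write-back cycle described above can be sketched in software as follows. This is a toy model only: it uses a Hamming(7,4) code on 4-bit values as a stand-in for the DRAM ECC, which is an assumption made for illustration and not the code the MCH implements. The function names are hypothetical, and the 32K-clock pacing between reads is not modeled.

```python
def hamming74_encode(nibble: int) -> int:
    """Encode a 4-bit value into a 7-bit Hamming codeword (bit i = position i+1)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p4 = d[1] ^ d[2] ^ d[3]
    # Codeword positions 1..7 hold: p1 p2 d0 p4 d1 d2 d3
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(bit << i for i, bit in enumerate(bits))


def syndrome(codeword: int) -> int:
    """Return 0 for a clean codeword, else the 1-based position of a single flipped bit."""
    s = 0
    for pos in range(1, 8):
        if (codeword >> (pos - 1)) & 1:
            s ^= pos
    return s


def scrub(memory: list) -> list:
    """Walk every location, correct single-bit errors in place, return fixed addresses."""
    corrected = []
    for addr, word in enumerate(memory):
        s = syndrome(word)
        if s:
            memory[addr] = word ^ (1 << (s - 1))  # write the good data back
            corrected.append(addr)
    return corrected


# Fill a small "DRAM", inject one correctable error, then run one scrub pass.
dram = [hamming74_encode(n) for n in range(16)]
dram[5] ^= 1 << 3
fixed = scrub(dram)
```

Writing the corrected word back matters because it stops a single-bit error from lingering until a second error in the same word makes it uncorrectable.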
5.5.1.2 Memory sparing
The MCH includes specialized hardware to support fail-over to a spare DIMM device in the
event that a primary DIMM in use exceeds a specified threshold of runtime errors. This
prevents a failing DIMM with increasing error frequency from causing a catastrophic failure.
This feature is an alternative to memory mirroring.
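
The threshold-based fail-over described above can be modeled roughly as follows. This is a sketch under stated assumptions: the class name, the threshold value, and the copy-then-switch sequence are hypothetical and do not describe the MCH register interface or its actual copy mechanism.

```python
class SparedMemory:
    """Toy model of a primary DIMM backed by a spare, with an error threshold."""

    def __init__(self, size: int, threshold: int):
        self.primary = [0] * size
        self.spare = [0] * size
        self.threshold = threshold        # hypothetical value chosen by the caller
        self.correctable_errors = 0
        self.failed_over = False

    def _active(self) -> list:
        return self.spare if self.failed_over else self.primary

    def write(self, addr: int, value: int) -> None:
        self._active()[addr] = value

    def read(self, addr: int) -> int:
        return self._active()[addr]

    def report_correctable_error(self) -> None:
        """Count a runtime error; fail over once the count exceeds the threshold."""
        if self.failed_over:
            return
        self.correctable_errors += 1
        if self.correctable_errors > self.threshold:
            self.spare[:] = self.primary  # copy contents to the spare
            self.failed_over = True       # later accesses use the spare


mem = SparedMemory(size=4, threshold=2)
mem.write(1, 0xAB)
for _ in range(3):
    mem.report_correctable_error()
```

In this model the third reported error exceeds the threshold of two, so the data is copied and subsequent reads are served from the spare while the contents are still correctable.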
5.5.1.3 DDR-1 Memory Mirroring
The mirroring feature is fundamentally a way for hardware to maintain two copies of all data in
the memory subsystem. This feature protects the system from failure, since an uncorrectable
memory error is no longer fatal to the system. When an uncorrectable error occurs during normal
operation, hardware retrieves the mirror copy of the corrupted data. The case in which both the
primary and mirror copies of the same data are corrupted simultaneously is statistically very
unlikely. Mirroring halves the usable memory capacity. No additional hardware is
required to support mirroring.
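
A minimal software model of the mirrored read path described above is given below. The class, the exception, and the sets used to mark a location as uncorrectable are hypothetical test hooks; in the real system the detection and the fallback are performed by hardware.

```python
class UncorrectableError(Exception):
    """Raised when both copies of a location are unreadable."""


class MirroredMemory:
    """Toy model: every write lands in two copies; reads fall back to the mirror."""

    def __init__(self, installed_words: int):
        # Half of the installed capacity holds the mirror copy.
        self.usable_words = installed_words // 2
        self.primary = [0] * self.usable_words
        self.mirror = [0] * self.usable_words
        self.bad_primary = set()  # addresses with an uncorrectable error (test hook)
        self.bad_mirror = set()

    def write(self, addr: int, value: int) -> None:
        self.primary[addr] = value
        self.mirror[addr] = value

    def read(self, addr: int) -> int:
        if addr not in self.bad_primary:
            return self.primary[addr]
        if addr not in self.bad_mirror:
            return self.mirror[addr]  # primary is corrupted, use the mirror copy
        raise UncorrectableError(f"both copies of address {addr} are corrupted")


memory = MirroredMemory(installed_words=8)
memory.write(2, 99)
memory.bad_primary.add(2)      # simulate an uncorrectable error in the primary copy
recovered = memory.read(2)     # served from the mirror
```

The model also makes the capacity cost visible: eight installed words yield four usable words.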
5.5.2 PCI Express
In the PCI Express interface, several mechanisms contribute to the reliability of the data transferred.
The first, referred to as training, establishes the highest common bus width (x1, x4, or
x8) at which the devices on the bus can communicate. Once the devices on the bus can
communicate, software can determine why the devices failed to train at a higher data
width.
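
The outcome of width training can be illustrated with the sketch below, which selects the highest width that both link partners support. It models only the end result, not the training protocol itself; the function names are hypothetical, and real software would read the negotiated width from the devices rather than compute it.

```python
SUPPORTED_WIDTHS = (1, 4, 8)  # the x1, x4, and x8 widths named in the text


def negotiate_width(upstream: set, downstream: set) -> int:
    """Return the highest lane width supported by both link partners."""
    common = set(upstream) & set(downstream)
    if not common:
        raise ValueError("link partners share no common width")
    return max(common)


def trained_below_capability(negotiated: int, slot_max: int) -> bool:
    """Return True when the link trained narrower than the slot allows."""
    return negotiated < slot_max


# An x8 slot populated with a card that supports only x1 and x4.
width = negotiate_width({1, 4, 8}, {1, 4})
degraded = trained_below_capability(width, slot_max=8)
```

A check like `trained_below_capability` is the kind of test software might perform once the link is up, before investigating why a wider link was not established.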