Intel® Server Board SE7520BD2 Technical Product Specification
Error Reporting and Handling
Revision 1.3
Intel Confidential
5.5 Reliability, Availability, and Serviceability (RAS) Features
5.5.1 Memory
The MCH is designed to bring enterprise-level reliability, availability, serviceability, usability, and
manageability to the DP server platform. The MCH supports ACPI power management and
wake-from-LAN to maximize platform standby flexibility.
RAS features include:
Data protection – All internal data buses have some form of data protection
FSB Address and Data parity protection
Hublink even parity protection
Memory Scrubbing
DDR II memory mirroring
Sparing Memory
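As a brief illustration of the even parity scheme named in the list above, the following minimal sketch computes and checks an even parity bit for a data word. The function names and the 8-bit sample word are assumptions made for illustration; they do not describe the MCH's actual bus widths or signals.

```python
def even_parity_bit(word: int) -> int:
    """Return the parity bit that makes the total count of 1 bits even."""
    return bin(word).count("1") % 2


def check_even_parity(word: int, parity_bit: int) -> bool:
    """True when the word plus its parity bit hold an even number of 1 bits."""
    return (bin(word).count("1") + parity_bit) % 2 == 0


# Hypothetical 8-bit sample word with four 1 bits, so the parity bit is 0.
data = 0b1011_0010
p = even_parity_bit(data)
assert check_even_parity(data, p)

# A single bit flipped in transit makes the check fail.
corrupted = data ^ 0b0000_1000
assert not check_even_parity(corrupted, p)
```

Parity of this kind detects any odd number of flipped bits but cannot locate or correct them, which is why it is paired with other mechanisms such as scrubbing.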
Periodically, a memory scrubbing unit walks through all DRAM, issuing a read every 32K clocks.
Correctable errors found by a read are corrected, and the good data is then written back to
DRAM. Scrubbing does not cause any noticeable degradation of memory bandwidth,
although it does add latency to the occasional read that is delayed by a
scrub write cycle.
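The scrubbing walk described above (read, correct, write back) can be sketched as follows. This is a toy model under stated assumptions: a three-copy majority vote stands in for the real ECC logic, the pacing of one read every 32K clocks is not modelled, and `majority` and `scrub_pass` are hypothetical names, not MCH interfaces.

```python
def majority(copies):
    """Toy stand-in for ECC decode: bitwise majority vote over three copies."""
    a, b, c = copies
    return (a & b) | (a & c) | (b & c)


def scrub_pass(dram):
    """Walk every location once and rewrite any location whose copies disagree.

    Returns the list of addresses that were corrected.
    """
    corrected = []
    for addr, copies in enumerate(dram):
        good = majority(copies)            # read and correct
        if any(c != good for c in copies):
            dram[addr] = [good] * 3        # write the good data back
            corrected.append(addr)
    return corrected


# Four hypothetical memory locations, each stored as three identical copies.
dram = [[v, v, v] for v in (0x12, 0x34, 0x56, 0x78)]
dram[2][1] ^= 0x04                         # inject a single-bit soft error
fixed = scrub_pass(dram)                   # address 2 is repaired in place
```

Writing the corrected data back matters because it prevents a single-bit error from lingering until a second error in the same location makes it uncorrectable.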
Memory Sparing
The MCH includes specialized hardware to support fail-over to a spare DIMM device in the
event that a primary DIMM in use exceeds a specified threshold of runtime errors. This
prevents a failing DIMM with increasing error frequency from causing a catastrophic failure.
This feature is an alternative to memory mirroring.
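The threshold-based fail-over described above can be sketched in a few lines. This is an illustrative model only: the class name, the DIMM labels, and the threshold value are assumptions, and the step in which hardware migrates contents to the spare is noted in a comment rather than modelled.

```python
class SpareFailover:
    """Counts runtime errors per DIMM and fails over once to a spare."""

    def __init__(self, primaries, spare, threshold):
        self.counts = {name: 0 for name in primaries}
        self.spare = spare          # None once the spare has been consumed
        self.threshold = threshold
        self.remapped = {}          # failed primary -> spare now serving it

    def record_correctable_error(self, dimm):
        """Log one runtime error; return True if this call triggered fail-over."""
        if dimm in self.remapped:
            return False            # this DIMM has already been replaced
        self.counts[dimm] += 1
        if self.counts[dimm] > self.threshold and self.spare is not None:
            # Assumption: real hardware would move the data across before
            # switching; that step is omitted from this sketch.
            self.remapped[dimm] = self.spare
            self.spare = None
            return True
        return False


# Hypothetical DIMM labels and threshold, chosen only for the example.
ctl = SpareFailover(primaries=["DIMM_1A", "DIMM_1B"], spare="DIMM_SPARE", threshold=3)
events = [ctl.record_correctable_error("DIMM_1A") for _ in range(4)]
# The fourth error exceeds the threshold of 3 and triggers the fail-over.
```

Because there is a single spare in this model, a second DIMM that later crosses the threshold is not replaced.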
DDR-1 Memory Mirroring
The mirroring feature is fundamentally a way for hardware to maintain two copies of all data in
the memory subsystem. This feature protects the system from failure, since an uncorrectable
memory error is no longer fatal to the system. When an uncorrectable error occurs during normal
operation, hardware retrieves the mirror copy of the corrupted data. Simultaneous corruption of
both the primary and mirror copies of the same data is statistically very
unlikely. Mirroring reduces usable memory capacity to half of the installed total. No additional
hardware is required to support mirroring.
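The mirrored read path described above can be sketched as a small model. The class, its method names, and the 8 GB installed size are assumptions for illustration; uncorrectable errors are injected by hand rather than detected by real ECC logic.

```python
class MirroredMemory:
    """Toy model: every write goes to two copies; reads fall back to the mirror."""

    def __init__(self, installed_bytes):
        self.usable_bytes = installed_bytes // 2   # mirroring halves capacity
        self.primary = {}
        self.mirror = {}
        self.bad = set()    # (copy name, address) pairs with uncorrectable errors

    def write(self, addr, value):
        self.primary[addr] = value
        self.mirror[addr] = value
        # Fresh data replaces whatever was corrupted at this address.
        self.bad.discard(("primary", addr))
        self.bad.discard(("mirror", addr))

    def inject_uncorrectable(self, copy, addr):
        """Mark one copy of an address as holding an uncorrectable error."""
        self.bad.add((copy, addr))

    def read(self, addr):
        if ("primary", addr) not in self.bad:
            return self.primary[addr]
        if ("mirror", addr) not in self.bad:
            return self.mirror[addr]        # retrieve the mirror copy
        raise RuntimeError("both copies corrupted at address %#x" % addr)


mem = MirroredMemory(installed_bytes=8 * 2**30)   # hypothetical 8 GB installed
mem.write(0x1000, 0xCAFE)
mem.inject_uncorrectable("primary", 0x1000)
value = mem.read(0x1000)                          # served from the mirror copy
```

Only when both copies of the same address are marked bad does the read fail, which corresponds to the statistically unlikely double-corruption case.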
5.5.2 PCI
In the PCI Express interface, several mechanisms contribute to the reliability of the data transferred.
The first, referred to as training, establishes the highest common bus width (x1, x4, or
x8) at which the devices on the bus can communicate. Once the devices on the bus can
communicate, it can be determined via software why the devices failed to train at a higher data