Error Reporting and Handling
Intel® Server Board SE7520AF2 TPS
Revision 1.2
Intel order number C77866-003
Syndrome
Memory | 33h | 04h | 0Ch | 08h | 6Fh | 01h | A1h | Data2: DIMM location; Data3: Syndrome | Uncorrectable ECC
POST Error | 31h | 04h | 0Fh | 06h | 6Fh | Data2[0:3] | A0h or Data2 | See POST error code table 7.3.3. Data2: low byte; Data3: high byte | POST Error
System Event | 03h | 04h | 12h | 83h | 6Fh | 05h | 05h | Data2: 00h (1st of pair), 80h (2nd of pair); Data3: 0ffh | Timestamp Clock Sync
System Event | 03h | 04h | 12h | 83h | 6Fh | 01h | 01h | Data2: 0ffh; Data3: 0ffh | Boot event log
Critical Interrupt | 31h | 04h | 13h | EAh | 6Fh | 04h | A4h | Data2: bus num; Data3: dev and fun | PCI SERR
Critical Interrupt | 31h | 04h | 13h | EBh | 6Fh | 05h | A5h | Data2: bus num; Data3: dev and fun | PCI PERR
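As a rough illustration of how the Data2 and Data3 bytes listed in the table might be interpreted, the following Python sketch decodes two of the entries. The function and type names are hypothetical. The POST error decoding follows the table's "Data2: low byte, Data3: high byte" note. For the Critical Interrupt entries the table only says "dev and fun", so packing the device number in bits 7:3 and the function number in bits 2:0 is an assumption borrowed from the conventional PCI device/function byte layout, not something this document states.

```python
from typing import NamedTuple


class PciLocation(NamedTuple):
    """Bus/device/function triple recovered from event data bytes."""
    bus: int
    device: int
    function: int


def _check_byte(value: int, name: str) -> None:
    # Event data fields are single bytes.
    if not 0 <= value <= 0xFF:
        raise ValueError(f"{name} must be in the range 0..255")


def decode_post_error_code(data2: int, data3: int) -> int:
    """Combine Data2 (low byte) and Data3 (high byte) into a 16-bit code."""
    _check_byte(data2, "data2")
    _check_byte(data3, "data3")
    return (data3 << 8) | data2


def decode_pci_location(data2: int, data3: int) -> PciLocation:
    """Split Critical Interrupt event data into bus, device, and function.

    ASSUMPTION: Data3 holds the device number in bits 7:3 and the
    function number in bits 2:0. The table does not specify the packing.
    """
    _check_byte(data2, "data2")
    _check_byte(data3, "data3")
    return PciLocation(bus=data2, device=data3 >> 3, function=data3 & 0x07)
```

For example, under the assumed packing, `decode_pci_location(0x02, 0x19)` yields bus 2, device 3, function 1.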
7.2.4 Single Bit ECC Error Throttling Prevention
The system detects, corrects, and logs correctable errors. As long as these errors occur
infrequently, the system should continue to operate without a problem.
Occasionally, correctable errors are caused by a persistent failure of a single component. For
example, a broken data line on a DIMM would exhibit repeated errors until replaced. Although
these errors are correctable, continual calls to the error logger can throttle the system,
preventing any further useful work.
For this reason, the system counts certain types of correctable errors and disables reporting if
they occur too frequently. Correction remains enabled but calls to the error handler are
disabled. This allows the system to continue running, despite a persistent correctable failure.
The BIOS adds an entry to the event log to indicate that logging for that type of error has been
disabled. Such an entry indicates a serious hardware problem that must be repaired at the
earliest possible time.
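The counting scheme described above could be sketched roughly as follows. This is not the BIOS implementation; it is a minimal Python model with hypothetical class and method names. The default threshold of ten errors within one hour is the figure this section gives. Treating the hour as a sliding window, and disabling reporting at the moment the tenth error arrives, are assumptions made for the sketch.

```python
from collections import deque


class CorrectableErrorThrottle:
    """Model of per-error-type reporting throttling (illustrative only)."""

    def __init__(self, threshold: int = 10, window_seconds: float = 3600.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.reporting_enabled = True
        self.event_log = []          # stands in for the system event log
        self._timestamps = deque()   # times of recently reported errors

    def on_correctable_error(self, now: float, description: str) -> bool:
        """Handle one corrected error; return True if it was logged.

        Correction itself is assumed to happen in hardware regardless;
        only the reporting path is modeled here.
        """
        if not self.reporting_enabled:
            return False

        # ASSUMPTION: sliding window. Forget errors older than the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()

        self._timestamps.append(now)
        self.event_log.append(description)

        if len(self._timestamps) >= self.threshold:
            # Too many errors in the window: stop reporting this type and
            # leave a marker entry so the condition is visible in the log.
            self.reporting_enabled = False
            self.event_log.append("logging disabled for this error type")
        return True
```

One instance would be kept per error type (for example, one for correctable memory errors and one for correctable bus errors), so that disabling reporting for one type does not affect the other.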
The system BIOS implements this feature for two types of errors: correctable memory errors
and correctable bus errors. If ten errors occur in a single wall-clock hour, the corresponding