29 September 1997 – Subject To Change
8 Error Detection and Error Handling
This chapter provides an overview of the error handling strategy of the 21164PC.
Each internal cache (instruction cache [Icache] and data cache [Dcache]) implements
parity protection for tag and data. Longword parity protection is implemented for
memory and backup cache (Bcache) data. Bcache tag and control (valid and dirty
bits) are parity protected. The instruction fetch/decode unit and branch unit (IDU)
implements logic that detects when no progress has been made for a very long time
and forces a machine check trap.
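As an illustration of the longword parity check described above, the following is a minimal sketch in C. It assumes even parity (the stored bit makes the total count of 1 bits even); this chapter does not state the polarity, and the function names are hypothetical, not part of any 21164PC interface.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: even parity. The parity bit equals the XOR of all 32 data
 * bits, so data plus parity always holds an even number of 1 bits.
 * The polarity actually used by the hardware is not given here. */
static unsigned longword_parity(uint32_t lw)
{
    /* XOR-fold the 32 bits down to a single bit. */
    lw ^= lw >> 16;
    lw ^= lw >> 8;
    lw ^= lw >> 4;
    lw ^= lw >> 2;
    lw ^= lw >> 1;
    return lw & 1u;
}

/* Returns 1 if the stored parity bit matches the data, 0 on a parity error. */
static int longword_parity_ok(uint32_t lw, unsigned stored_parity)
{
    assert(stored_parity <= 1u);
    return longword_parity(lw) == stored_parity;
}
```

A single parity bit per longword detects any odd number of flipped bits in that longword; an even number of flipped bits goes undetected.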
PALcode handles all error traps (machine checks and parity error interrupts). Where
possible, the address of affected data is latched in an onchip IPR. Most of the Istream
errors can be retried by the operating system because the machine check occurs
before any part of the instruction causing the error is executed. In some other cases,
the system may be able to recover from an error by terminating all processes that had
access to the affected memory location.
8.1 Error Flows
The following flows describe the events that take place during an error, the
recommended responses necessary to determine the source of the error, and the
suggested actions to resolve it.
8.1.1 Icache Data or Tag Parity Error
• Machine check occurs before the instruction causing the parity error is executed.
• EXC_ADDR contains either the PC of the instruction that caused the parity error
or that of an earlier trapping instruction.
• ICPERR_STAT<TPE> or <DPE> is set.
• Can be retried.
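The flow above might be sketched in C as follows, to show how a machine check handler could recognize this case from saved IPR values. The bit positions chosen for <DPE> and <TPE>, the structure, and the function name are all hypothetical; the actual field positions must be taken from the ICPERR_STAT register description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL bit positions, for illustration only. This section does
 * not give them; use the ICPERR_STAT register description instead. */
#define ICPERR_STAT_DPE (UINT64_C(1) << 11) /* data parity error (assumed bit) */
#define ICPERR_STAT_TPE (UINT64_C(1) << 12) /* tag parity error (assumed bit) */

struct icache_error {
    int data_parity;   /* nonzero if <DPE> was set */
    int tag_parity;    /* nonzero if <TPE> was set */
    uint64_t retry_pc; /* saved EXC_ADDR, the point at which to retry */
};

/* Returns 1 and fills *out if the saved ICPERR_STAT value reports an
 * Icache parity error; otherwise returns 0 and leaves *out untouched. */
static int classify_icache_error(uint64_t icperr_stat, uint64_t exc_addr,
                                 struct icache_error *out)
{
    assert(out != NULL);
    if ((icperr_stat & (ICPERR_STAT_DPE | ICPERR_STAT_TPE)) == 0)
        return 0;
    out->data_parity = (icperr_stat & ICPERR_STAT_DPE) != 0;
    out->tag_parity = (icperr_stat & ICPERR_STAT_TPE) != 0;
    out->retry_pc = exc_addr;
    return 1;
}
```

Because the machine check is taken before the faulting instruction executes, restarting at the saved EXC_ADDR re-executes either that instruction or the earlier trapping one, which is why this case can be retried.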