Product Fault Management
Overview
This section describes how errors are handled by the microcode
and software, how the errors are logged, and how the
symptom-directed diagnosis (SDD) tool, VAXsimPLUS, brings errors
to the attention of the user. This section also provides
the service theory used to interpret error logs to isolate the FRU.
Interpreting error logs is the primary method of diagnosis.
General Exception and Interrupt Handling
This section describes the first step of error notification: the
errors are first handled by the microcode and then are dispatched
to the VMS error handler.
The kernel uses the NVAX core chipset: NVAX CPU, NVAX
memory controller (NMC), and NDAL-to-CDAL adapter (NCA).
Internal errors within the NVAX CPU result in machine check
exceptions through system control block (SCB) vector 004 (hex), or
in soft error interrupts at interrupt priority level (IPL) 1A
through SCB vector 054 (hex).
External errors to the NVAX CPU, which are detected by the
NMC or NCA, usually result in these chips posting an error
condition to the NVAX CPU. The NVAX CPU then generates a
machine check exception through SCB vector 004 (hex), a hard error
interrupt at IPL 1D through SCB vector 060 (hex), or a soft error
interrupt through SCB vector 054 (hex).
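The vector and IPL assignments named above can be summarized as a small
lookup table. The following is a minimal Python sketch for illustration
only; it is not part of the microcode or of VMS. The names ErrorEvent,
ERROR_EVENTS, and describe are hypothetical. The IPL for the machine
check entry is left unset because this section does not state one.

```python
from typing import NamedTuple, Optional


class ErrorEvent(NamedTuple):
    """One error notification path (hypothetical structure)."""
    name: str
    scb_vector: int        # SCB vector, given in hex in the text
    ipl: Optional[int]     # None where this section states no IPL


# Values are the ones quoted in this section; nothing else is assumed.
ERROR_EVENTS = {
    0x004: ErrorEvent("machine check exception", 0x004, None),
    0x054: ErrorEvent("soft error interrupt", 0x054, 0x1A),
    0x060: ErrorEvent("hard error interrupt", 0x060, 0x1D),
}


def describe(scb_vector: int) -> str:
    """Return a one-line description of the event behind an SCB vector."""
    event = ERROR_EVENTS.get(scb_vector)
    if event is None:
        return f"SCB vector {scb_vector:03X}: not an error vector covered in this section"
    ipl = "IPL not stated here" if event.ipl is None else f"IPL {event.ipl:02X}"
    return f"SCB vector {event.scb_vector:03X}: {event.name} ({ipl})"


if __name__ == "__main__":
    for vector in sorted(ERROR_EVENTS):
        print(describe(vector))
```

A table of this kind may be convenient when reading an error log entry:
the SCB vector recorded with the entry indicates whether the error
arrived as a machine check, a hard error interrupt, or a soft error
interrupt.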
External errors to the NMC and NCA, which are detected by
chips on the CDAL busses for transactions that originated from the
NVAX CPU, are typically signaled back to the NCA. The
NCA posts an error signal back to the NVAX CPU, which
generates a machine check or high-level interrupt.
In the case of direct memory access (DMA) transactions where the
NCA or NMC detects the error, the errors are typically signaled
back to the CDAL-Bus device, but not posted to the NVAX CPU.
In these cases, the CDAL-Bus device typically posts a device-level
interrupt to the NVAX CPU by way of the NCA. In almost all
cases, the error state is latched by the NMC and NCA. Although