Table 4–2 Error Handling Flowchart Definitions
Event     Definition
1         Hardware reports error through a high-level interrupt and control is
          transferred to the EHS.
2         The EHS examines system registers to determine the type of failure
          which has occurred.
3         The EHS identifies the FRU that is the source of the error. FRU
          isolation is generally accomplished at the module level. In some
          cases, FRU isolation is to a set of modules. In all cases, the EHS
          isolates the error to an FRU or set of FRUs in one zone.
4         The EHS determines if the error is solid.
5         If the error is solid, the FRU is deconfigured from the system.
6         The EHS has successfully recovered from the error (either solid or
          transient) and execution is continued at IPL 8.
7 and 8   If the error is transient, it is compared to its error rate
          threshold.
9         If the error is below the error rate threshold, an entry is made in
          the error log.
10        If the error is above the error rate threshold, the FRU is
          deconfigured from the system.
11        An entry is made in the error log.
12        The FTSS$SERVER is notified of the error through the ERI.
13        Error handling is complete.
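The decision path in Table 4–2 can be sketched in a few lines of Python. This is an illustrative model only, not the EHS implementation: the names `Fru`, `Ehs`, and `handle` are hypothetical, the error rate threshold is simplified to a per-FRU count of transient errors, an error exactly at the threshold is treated as "below", and every path is assumed to end with an error log entry and a notification. Events 1 through 3 (interrupt, register examination, FRU isolation) are assumed to have already produced the `fru` argument.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Fru:
    """Hypothetical stand-in for a field-replaceable unit."""
    name: str
    transient_count: int = 0
    configured: bool = True


@dataclass
class Ehs:
    """Hypothetical model of events 4 through 13 in Table 4-2."""
    threshold: int  # assumption: threshold expressed as a simple count
    error_log: List[Tuple[str, str, str]] = field(default_factory=list)
    notifications: List[str] = field(default_factory=list)

    def handle(self, fru: Fru, solid: bool) -> str:
        if solid:                                    # event 4
            fru.configured = False                   # event 5
            action = "deconfigured"
        else:
            fru.transient_count += 1                 # events 7 and 8
            if fru.transient_count > self.threshold:
                fru.configured = False               # event 10
                action = "deconfigured"
            else:
                action = "logged"                    # event 9
        kind = "solid" if solid else "transient"
        self.error_log.append((fru.name, kind, action))  # events 9 and 11
        self.notifications.append(fru.name)          # event 12 (stands in for the ERI)
        return action                                # event 13


ehs = Ehs(threshold=2)
module = Fru("module-a")
results = [ehs.handle(module, solid=False) for _ in range(3)]
```

With a threshold of 2, the first two transient errors on the same FRU are only logged, and the third deconfigures it. A solid error deconfigures the FRU on the first occurrence.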
4.2.2 EHS Structure
The EHS is packaged as part of the Fault Tolerant System Services (FTSS)
execlet (loadable image file). The FTSS execlet is loaded and initialized when
FTSS is started after the OpenVMS operating system is booted.
System errors are reported to software through an IPL 29 interrupt. When
this interrupt occurs, the hardware fetches the dispatch vector from the System
Control Block (SCB) and dispatches to the EHS interrupt service routine.
VAXELN errors are reported to the OpenVMS operating system through an IPL
22 interrupt. The interrupts are vectored by a combination of hardware and
software to the EHS interrupt service routine.
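The two reporting paths above both end at the same interrupt service routine, which can be pictured as a table lookup followed by a call. The sketch below is a simplified model under stated assumptions: `SystemControlBlock`, `connect`, `dispatch`, and `ehs_interrupt_service_routine` are hypothetical names, and the table is keyed directly by IPL for brevity, whereas a real SCB is indexed by vector. Only the IPL values 29 and 22 come from the text.

```python
from typing import Callable, Dict, List, Tuple

IPL_SYSTEM_ERROR = 29  # system errors (from the text)
IPL_VAXELN_ERROR = 22  # VAXELN errors (from the text)

Handler = Callable[[int], None]


class SystemControlBlock:
    """Hypothetical dispatch table; keyed by IPL as a simplification."""

    def __init__(self) -> None:
        self._vectors: Dict[int, Handler] = {}

    def connect(self, ipl: int, handler: Handler) -> None:
        # Record which routine services interrupts at this level.
        self._vectors[ipl] = handler

    def dispatch(self, ipl: int) -> None:
        # Fetch the dispatch vector, then transfer control to it.
        handler = self._vectors.get(ipl)
        if handler is None:
            raise LookupError(f"no handler connected for IPL {ipl}")
        handler(ipl)


serviced: List[Tuple[int, str]] = []


def ehs_interrupt_service_routine(ipl: int) -> None:
    # Both error sources arrive here; record which one was serviced.
    source = "system" if ipl == IPL_SYSTEM_ERROR else "VAXELN"
    serviced.append((ipl, source))


scb = SystemControlBlock()
scb.connect(IPL_SYSTEM_ERROR, ehs_interrupt_service_routine)
scb.connect(IPL_VAXELN_ERROR, ehs_interrupt_service_routine)
scb.dispatch(IPL_SYSTEM_ERROR)
scb.dispatch(IPL_VAXELN_ERROR)
```

Routing both levels to one routine mirrors the text, where system and VAXELN errors reach the EHS through different mechanisms but are serviced by the same entry point.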
Figure 4–2 illustrates the position of the EHS relative to the major hardware,
system firmware, and other software components.