Table 4–4 (Cont.) Error Types
Error Type        Definition

Power failures
If a zone loses power in a non-Simplex configuration, hardware generates
an interrupt to report the event to the EHS. In a non-Simplex mode other than
Duplex, software detects this error only when the slave zone loses power. In
this case, the slave zone is removed from the configuration and the system
continues to run in Simplex mode.
In Duplex mode, the error is detected by software when either zone loses
power. Again, the failed zone is removed from the configuration and the
system continues in Simplex mode.
The EHS indicates in the error log that this error is solid and that service
is required, and compares the error to its error rate threshold. If the
threshold is not exceeded, the zone will be resynchronized automatically.
If the threshold is exceeded, no automatic resynchronization will occur
until the zone is repaired and resynchronized manually. The failed zone is
identified as the FRU for all power failures.
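
The power-failure handling described above can be sketched as a small decision function. This is only an illustrative model, not EHS code: the function name, the mode strings, and the `PowerFailureOutcome` record are hypothetical names chosen for this example. Cases the text does not describe (Simplex mode, or loss of the non-slave zone outside Duplex mode) are not given an outcome.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PowerFailureOutcome:
    """Hypothetical record of how a power failure is handled."""
    detected_by_software: bool
    removed_zone: Optional[str]    # zone removed from the configuration
    resulting_mode: Optional[str]  # mode the system continues in
    auto_resync: bool              # True if the zone is resynchronized automatically
    fru: Optional[str]             # the failed zone is the FRU


def handle_power_failure(mode: str, failed_zone: str,
                         slave_zone: Optional[str],
                         threshold_exceeded: bool) -> PowerFailureOutcome:
    """Model the table entry; mode strings are assumptions of this sketch."""
    if mode == "simplex":
        # The text describes the interrupt only for non-Simplex configurations.
        raise ValueError("power-failure handling is described only for non-Simplex modes")
    if mode != "duplex" and failed_zone != slave_zone:
        # Outside Duplex mode, software detects only a slave-zone power loss.
        return PowerFailureOutcome(False, None, None, False, None)
    # The failed zone is removed and the system continues in Simplex mode.
    # Automatic resynchronization happens only if the threshold is not exceeded.
    return PowerFailureOutcome(
        detected_by_software=True,
        removed_zone=failed_zone,
        resulting_mode="simplex",
        auto_resync=not threshold_exceeded,
        fru=failed_zone,
    )
```

For example, `handle_power_failure("duplex", "A", None, False)` models zone A losing power in Duplex mode with the threshold not exceeded: zone A is removed, the system continues in Simplex mode, and the zone is resynchronized automatically.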
Clock phase errors
If the clocks between zones begin to run out of phase, hardware generates
an interrupt to report the event to the EHS. This event can occur only
in non-Simplex modes. The cause of this type of failure can be either the
oscillator or the clock locking logic.
An oscillator failure will prevent the CPU and I/O module clocks in the two
zones from running in synchronization and will result in the termination
of the OpenVMS operating system on that zone.
A failure in the clock lock logic results in the two zones running diverged
if the system was operating in Duplex mode. In this case, EHS will
select one zone to remove, and the other zone will continue to run the
OpenVMS operating system in Simplex mode. (Zone selection is based on
timings within the system and could be either zone.) In Degraded Duplex
mode, the slave zone is removed from the configuration and the OpenVMS
operating system continues in Simplex mode.
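
As a rough illustration of the zone-removal rule for a clock lock logic failure, the sketch below returns the zone that leaves the configuration. The function name and mode strings are hypothetical. Because the text says only that the Duplex selection depends on system timings and could be either zone, the caller supplies that choice as `timing_pick`; the sketch does not try to model the timing itself.

```python
def zone_removed_on_clock_lock_failure(mode: str, slave_zone: str,
                                       timing_pick: str) -> str:
    """Return the zone removed after a clock lock logic failure.

    mode        -- "duplex" or "degraded_duplex" (names assumed for this sketch)
    slave_zone  -- the zone currently acting as slave
    timing_pick -- zone chosen by system timings; may be either zone
    """
    if mode == "duplex":
        # EHS selects one zone based on timings within the system.
        return timing_pick
    if mode == "degraded_duplex":
        # The slave zone is always the one removed.
        return slave_zone
    # Clock phase errors can occur only in non-Simplex modes.
    raise ValueError("clock phase errors occur only in non-Simplex modes")
```

In both cases the remaining zone continues to run the OpenVMS operating system in Simplex mode.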
In all cases of oscillator failure, the ATM in the zone that is removed
is identified as the FRU. If the error is caused by a clock lock logic failure,
software cannot accurately determine in which zone the failure exists.
The EHS compares the error to its error rate threshold. An error log is
generated at the time of the error which identifies the ATM as the FRU. If
the threshold is exceeded, the error log indicates that service is required
for the ATM and the zone will not be resynchronized automatically. If the
threshold is not exceeded and the diagnostic tests complete successfully,
the zone will be resynchronized when it becomes available.
If the threshold is not exceeded and the diagnostics report a failure, the
end action error log will indicate that the ATM module requires service
and the zone will not be resynchronized automatically. If the zone fails to
return for service and the threshold had not been exceeded, an end action
timeout error log is generated which indicates the ATM requires service.
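
The threshold and diagnostic outcomes for clock phase errors can be summarized as a decision function. This is a sketch under stated assumptions: `EndAction` and `clock_phase_end_action` are hypothetical names, and a zone that never returns is represented by `diagnostics_passed=None`. In every case the error log generated at the time of the error identifies the ATM as the FRU.

```python
from enum import Enum
from typing import Optional


class EndAction(Enum):
    """Hypothetical labels for the outcomes described in the text."""
    SERVICE_THRESHOLD = "error log: ATM requires service; no automatic resync"
    RESYNC = "zone resynchronized when it becomes available"
    SERVICE_DIAG_FAILED = "end action error log: ATM requires service; no automatic resync"
    SERVICE_TIMEOUT = "end action timeout error log: ATM requires service"


def clock_phase_end_action(threshold_exceeded: bool,
                           diagnostics_passed: Optional[bool]) -> EndAction:
    """Map threshold and diagnostic results to the logged outcome.

    diagnostics_passed is None when the zone failed to return.
    """
    if threshold_exceeded:
        # Checked first; the text describes the timeout log only for the
        # case where the threshold had not been exceeded.
        return EndAction.SERVICE_THRESHOLD
    if diagnostics_passed is None:
        return EndAction.SERVICE_TIMEOUT
    if diagnostics_passed:
        return EndAction.RESYNC
    return EndAction.SERVICE_DIAG_FAILED
```

Only one of the four outcomes, threshold not exceeded with diagnostics completing successfully, leads to automatic resynchronization.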