Table 4–4 (Cont.) Error Types
Error Type
Definition
Single-Bit memory errors
Single-Bit Errors (SBEs) can be detected either by the JXD, during a DMA
read cycle from main memory, or by the CPU, during a memory read.
Software action varies depending on the system operating mode and on
where the error is detected.
If the SBE is detected by the JXD during a DMA cycle in any system
mode or by the CPU during a CPU cycle in any non-Duplex mode, the
actions of the EHS are the same. The error is always transient, and no
deconfiguration is performed. A pair of memory SIMM rows on a memory
mother board (MMB) is isolated and checked against its error rate
threshold.
In Duplex mode (JXD-detected), when the threshold is exceeded, the CPU
module on which the memory resides is removed from service. In
non-Duplex mode, since there is only one CPU active and since SBEs are
always transient, the CPU is not removed from service when the threshold
is exceeded. The SBE is repaired in memory by hardware if detected by
the JXD, and by the EHS if detected by the CPU.
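
The decision described above can be sketched in a few lines of Python.
This is only an illustration of the text, not EHS code: the names
Detector, SbeOutcome and handle_sbe are hypothetical, and the threshold
comparison is assumed to have been made already and passed in as a flag.

```python
from dataclasses import dataclass
from enum import Enum


class Detector(Enum):
    JXD = "JXD"  # SBE seen during a DMA read cycle
    CPU = "CPU"  # SBE seen during a CPU memory read


@dataclass
class SbeOutcome:
    remove_cpu_module: bool  # is the CPU module taken out of service?
    repaired_by: str         # "hardware" or "EHS"


def handle_sbe(detector: Detector, duplex: bool, threshold_exceeded: bool) -> SbeOutcome:
    """JXD-detected SBEs in any mode, or CPU-detected SBEs in non-Duplex modes."""
    if detector is Detector.CPU and duplex:
        # That case is handled differently because of hardware constraints.
        raise ValueError("CPU-detected SBE in Duplex mode follows a different path")
    # Only in Duplex mode does exceeding the threshold remove the CPU module.
    remove = duplex and threshold_exceeded
    # Hardware repairs JXD-detected SBEs; the EHS repairs CPU-detected ones.
    repaired_by = "hardware" if detector is Detector.JXD else "EHS"
    return SbeOutcome(remove_cpu_module=remove, repaired_by=repaired_by)
```

For example, handle_sbe(Detector.CPU, duplex=False, threshold_exceeded=True)
leaves the CPU in service and reports the EHS as the repairing agent.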
If the SBE is detected during a CPU cycle while the system is in Duplex
mode, the action differs because of hardware constraints. The CPU that
experiences the SBE is removed from service by hardware at the time of
the error. An error log entry is generated to report the error, but FRU
isolation is done at the time of the end action. The error is then
compared to its error rate threshold by the OpenVMS operating system.
If the threshold is not exceeded, the CPU will be resynchronized
immediately by system software (FTSS$SERVER) at the time of the end
action. The process of resynchronization will repair the SBE in physical
memory since each location is rewritten during the memory copy.
If the failed CPU does not return for resynchronization after being
removed in the CPU-detected Duplex mode case, an end action timeout
event is logged that identifies the failed CPU module as the FRU.
In most cases, a pair of SIMM rows and a memory mother board (MMB)
are identified as the FRU in the error log. However, in some cases, end
action data may not contain all the information needed to isolate the
error to a pair of memory SIMM rows. In this case, the CPU module is
identified as the FRU and is subjected to the same threshold as a memory
SIMM.
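
The CPU-detected Duplex mode case and the FRU fallback can be summarized
in the following Python sketch. It is an illustration under stated
assumptions: the function names and the action strings are hypothetical,
and the text does not say what happens when the threshold is exceeded, so
the sketch adds no further action in that branch.

```python
from typing import List, Optional


def duplex_cpu_sbe_actions(threshold_exceeded: bool, cpu_returned: bool) -> List[str]:
    """Ordered actions for an SBE detected during a CPU cycle in Duplex mode."""
    actions = ["hardware removes the CPU from service", "error log generated"]
    if not cpu_returned:
        # The failed CPU never came back for resynchronization.
        actions.append("end action timeout logged; FRU is the CPU module")
        return actions
    actions.append("FRU isolated at end action")
    if not threshold_exceeded:
        # The memory copy rewrites every location, which repairs the SBE.
        actions.append("FTSS$SERVER resynchronizes the CPU")
    # Threshold exceeded: not described in the text, so nothing is added here.
    return actions


def select_fru(simm_row_pair: Optional[str]) -> str:
    """Choose the FRU from end action data (None means rows could not be isolated)."""
    if simm_row_pair is None:
        return "CPU module"
    return f"SIMM row pair {simm_row_pair} and its MMB"
```

Calling select_fru(None) models the case in which end action data is
incomplete, so the CPU module is reported instead of a pair of SIMM rows.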
Cable failures
All traffic between the two zones of the system is carried across the
cross-link cable. If this cable is detached or broken, the hardware
reports a cable loss event to the EHS. This error can only happen in a
non-Simplex system, and when it occurs, communication between the zones
is lost.
In all cases, the system operating mode must be changed to Simplex. If
the mode before the error was not Duplex, then the slave zone is removed
from service. If the mode was Duplex, then Zone B is removed from
service.
The EHS indicates in the error log that this error is solid and service
is required, and the error is compared to its error rate threshold. If the
threshold is not exceeded, the zone will be resynchronized automatically. If
the threshold is exceeded, no automatic resynchronization will occur until
the cross-link cable is repaired. In all cases, the FRU is the cross-link
cable.
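
The cable loss handling above reduces to a small decision, sketched here
in Python. The names CableLossOutcome and handle_cable_loss are
hypothetical, operating modes are represented as plain strings for
illustration, and the threshold comparison is assumed to be supplied as a
flag.

```python
from dataclasses import dataclass


@dataclass
class CableLossOutcome:
    new_mode: str       # always "Simplex" after a cable loss
    zone_removed: str   # which zone is taken out of service
    auto_resync: bool   # is the zone resynchronized automatically?
    fru: str            # always the cross-link cable


def handle_cable_loss(mode_before: str, threshold_exceeded: bool) -> CableLossOutcome:
    """Model the response to a cable loss event reported to the EHS."""
    if mode_before == "Simplex":
        raise ValueError("a cable loss event can only occur in a non-Simplex system")
    # Duplex: Zone B is removed. Any other non-Simplex mode: the slave zone.
    zone_removed = "Zone B" if mode_before == "Duplex" else "slave zone"
    return CableLossOutcome(
        new_mode="Simplex",
        zone_removed=zone_removed,
        # Above the threshold, no resync occurs until the cable is repaired.
        auto_resync=not threshold_exceeded,
        fru="cross-link cable",
    )
```

For example, handle_cable_loss("Duplex", threshold_exceeded=False) removes
Zone B, switches to Simplex, and allows automatic resynchronization.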
Error Handling and Analysis 4–7