System Troubleshooting and Diagnostics
5.2 Product Fault Management and Symptom-Directed Diagnosis
5.2.6 Interpreting Memory Faults Using ANALYZE/ERROR
If "memory subpacket" or "memory sbe reduction subpacket" is listed in the
third column of the FLAGS register, there is a problem with one or more of the
following: the memory modules, the CPU module, or the backplane.
• The "memory subpacket" message indicates an uncorrectable ECC error.
Refer to Section 5.2.6.1 for instructions on isolating uncorrectable ECC
error problems.
• The "memory sbe reduction subpacket" message indicates correctable ECC
errors. Refer to Section 5.2.6.2 for instructions on isolating correctable ECC
error problems.
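The two cases above amount to a simple lookup from the FLAGS entry to an
error class and the section to consult. The following Python fragment is a
minimal sketch of that lookup, not part of ANALYZE/ERROR; the names
SUBPACKET_GUIDE and classify_flags_entry are hypothetical, and the entry text
is assumed to have been copied by hand from the error log.

```python
from typing import Optional, Tuple

# Hypothetical table built from the two bullets above:
# FLAGS third-column text -> (error class, section to consult).
SUBPACKET_GUIDE = {
    "memory subpacket": ("uncorrectable ECC error", "5.2.6.1"),
    "memory sbe reduction subpacket": ("correctable ECC error", "5.2.6.2"),
}


def classify_flags_entry(entry: str) -> Optional[Tuple[str, str]]:
    """Return (error class, section) for a FLAGS entry, or None if the
    entry is not one of the two memory subpacket messages."""
    # Normalize case and whitespace so hand-copied text still matches.
    key = " ".join(entry.lower().split())
    return SUBPACKET_GUIDE.get(key)


if __name__ == "__main__":
    for text in ("memory subpacket", "memory sbe reduction subpacket"):
        print(text, "->", classify_flags_entry(text))
```

An entry that matches neither message returns None, in which case the memory
fault procedures in this section do not apply.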
Note
The memory fault interpretation procedures work only if the memory
modules have been properly installed and configured. For example,
memory modules should start in backplane slot 4 (next to the processor
module in slot 5) and proceed to slot 1 with no gaps.
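The slot rule in the note above can be checked mechanically. The following
Python fragment is a minimal sketch under stated assumptions: the function
name memory_slots_valid is hypothetical, the list of occupied slots is
assumed to be entered by hand after inspecting the backplane, and an empty
configuration is treated as invalid (the note does not address that case).

```python
from typing import Iterable

# Per the note above: memory modules start in slot 4 and proceed toward
# slot 1 with no gaps.
FIRST_MEMORY_SLOT = 4
LAST_MEMORY_SLOT = 1


def memory_slots_valid(occupied: Iterable[int]) -> bool:
    """Return True if the occupied memory slots are 4, 3, ... with no gaps."""
    slots = set(occupied)
    if not slots:
        return False  # assumption: at least one memory module is required
    allowed = set(range(LAST_MEMORY_SLOT, FIRST_MEMORY_SLOT + 1))
    if not slots <= allowed:
        return False  # a slot number outside 1..4 was given
    # n modules must occupy exactly slots 4, 3, ..., 4 - n + 1.
    expected = set(range(FIRST_MEMORY_SLOT, FIRST_MEMORY_SLOT - len(slots), -1))
    return slots == expected


if __name__ == "__main__":
    print(memory_slots_valid([4, 3]))  # contiguous from slot 4
    print(memory_slots_valid([4, 2]))  # gap at slot 3
```

A False result suggests correcting the module placement before applying the
memory fault interpretation procedures.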
Note
Although the OpenVMS error handler has built-in features to aid
Services in memory repair, the Service Engineer must still exercise
good judgment. It is essential to understand that in many, if not most,
cases, correctable ECC errors are transient in nature. No amount of
repair will fix them, as generally there is nothing to be fixed.
Memory modules can represent a great expense to the Corporation
when they are sent back to Repair with no errors. If you disagree
with the strategy in this section or have questions or suggestions,
please contact Corporate Support.
5.2.6.1 Uncorrectable ECC Errors
Refer to Example 5–3, which provides an abbreviated error log for
uncorrectable ECC errors.
For uncorrectable ECC errors, a memory subpacket is logged, as indicated
by "memory subpacket" listed in the third column of the FLAGS software
register. Also, hardware register bit MESR <11> of the processor
Register Subpacket is set to 1, and MEAR latches the error address.
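When the MESR value is read from the error log as a hexadecimal longword,
testing bit <11> is a shift and a mask. The following Python fragment is a
minimal sketch of that check; the function name mesr_bit_set and the sample
values are hypothetical and are not taken from Example 5–3.

```python
# Bit position named in the text above: MESR <11> is set to 1 for an
# uncorrectable ECC error.
MESR_UNCORRECTABLE_BIT = 11


def mesr_bit_set(mesr: int, bit: int = MESR_UNCORRECTABLE_BIT) -> bool:
    """Return True if the given bit of the MESR value is 1."""
    return (mesr >> bit) & 1 == 1


if __name__ == "__main__":
    # Hypothetical register contents, for illustration only.
    for sample in (0x00000800, 0x000007FF):
        print(f"MESR={sample:08X} bit<11> set: {mesr_bit_set(sample)}")
```

Only the bit test is shown here; the meaning of the other MESR bits and the
format of the address latched in MEAR are not covered by this sketch.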