
Troubleshooting memory
Symptom
Memory errors can be separated into two categories depending on where they originate:
• CPU to memory buffer errors — outlined in yellow below
• Memory buffer to DIMM errors — outlined in green below
Solution 1
Cause
CPU to memory buffer errors
The link between the CPU and the memory buffer is the SMI2 or VMSE link. An SMI2 failure can manifest
as reduced memory size, reduced memory throughput, or machine checks. However, other issues can
result in the same symptoms. CAE will analyze the failure to determine whether SMI2 is at fault.
For errors related to SMI2, suspect the CPU, the memory buffer, or the traces between them. The
memory buffer is permanently attached to the blade, so it cannot be indicted independently. Therefore,
the CPU and/or blade are indicted for an SMI2 error.
If an error occurs on SMI2, replacing DIMMs is unlikely to correct the problem. DIMMs reside on a
separate DDR bus, and changes to the DDR bus do not affect the SMI2 link.
IMPORTANT:
DIMMs should not be moved or replaced for an SMI2_TRAINING_FAILURE event.
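The triage rule for CPU to memory buffer errors can be sketched as a small lookup. This is a minimal illustration, not part of any vendor tool: the function name, the Triage record, and the rule that every SMI2 event name starts with "SMI2" are assumptions made for the example. Only SMI2_TRAINING_FAILURE is named in this guide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption for illustration: events on the CPU-to-memory-buffer link carry
# names that begin with "SMI2". Only SMI2_TRAINING_FAILURE appears in the guide.
SMI2_PREFIX = "SMI2"


@dataclass(frozen=True)
class Triage:
    category: str               # where the error originates
    indict: Tuple[str, ...]     # parts to suspect
    touch_dimms: Optional[bool] # None means this sketch gives no guidance


def triage_memory_event(event_name: str) -> Triage:
    """Map an event name to the parts indicted for it (hypothetical helper)."""
    if event_name.upper().startswith(SMI2_PREFIX):
        # The memory buffer is permanently attached to the blade, so the
        # CPU and/or blade are indicted. DIMMs sit on a separate DDR bus
        # and should not be moved or replaced.
        return Triage("cpu_to_memory_buffer", ("CPU", "blade"), False)
    # Anything else is outside the scope of this sketch.
    return Triage("unclassified", (), None)


if __name__ == "__main__":
    print(triage_memory_event("SMI2_TRAINING_FAILURE"))
```

Returning None for touch_dimms on unrecognized events keeps the sketch from implying guidance that this section does not give.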
Solution 2
Cause
Memory buffer to DIMM errors