Customer Messaging Policy
• Light a diagnostic LED for a memory DIMM error only when the fault is isolated to a specific
DIMM. If there is any uncertainty about which DIMM is at fault, direct the customer to the SEL
for any action and do not light the suspect DIMM CRU LED on the diagnostic panel.
• For configuration-style errors (for example, no memory DIMMs installed in rank 0 of side 0),
follow the Hewlett Packard Enterprise policy of lighting the CRU LEDs on the diagnostic LED
panel for all of the DIMMs that are missing.
• No diagnostic messages are reported for single-byte errors that are corrected in both zx2
caches and memory DIMMs during corrected platform error (CPE) events. Diagnostic
messages are reported for CPE events when thresholds are exceeded for both single-byte
and double-byte errors. All fatal memory subsystem errors cause global MCA events.
• PDT logs for all double-byte errors are permanent; single-byte errors are initially logged
as transient errors. If the server logs two single-byte errors within 24 hours, they are
upgraded to permanent in the PDT.
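The PDT rule in the last bullet can be sketched in a few lines of Python. This is only an illustration of the stated policy, not the firmware's implementation: the `PdtEntry` class, the `record_error` function, and the choice to track errors per entry with an inclusive 24-hour window are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

UPGRADE_WINDOW = timedelta(hours=24)  # window taken from the policy text
UPGRADE_COUNT = 2                     # single-byte errors needed to upgrade


@dataclass
class PdtEntry:
    """Hypothetical PDT entry; the real record layout is not given here."""
    status: str = "none"  # "none", "transient", or "permanent"
    single_byte_times: list = field(default_factory=list)


def record_error(entry: PdtEntry, kind: str, when: datetime) -> str:
    """Apply the policy to one error of kind 'single' or 'double'."""
    if kind == "double":
        # Double-byte errors are always logged as permanent.
        entry.status = "permanent"
        return entry.status
    if kind != "single":
        raise ValueError(f"unknown error kind: {kind!r}")
    if entry.status == "permanent":
        # A permanent entry is never downgraded.
        return entry.status
    # Keep only the single-byte errors still inside the 24-hour window
    # (inclusive boundary is an assumption).
    entry.single_byte_times = [
        t for t in entry.single_byte_times if when - t <= UPGRADE_WINDOW
    ]
    entry.single_byte_times.append(when)
    if len(entry.single_byte_times) >= UPGRADE_COUNT:
        entry.status = "permanent"
    else:
        entry.status = "transient"
    return entry.status
```

In this sketch, a single-byte error followed by a second one 25 hours later stays transient, while a second one within 24 hours upgrades the entry to permanent.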
Table 54 lists the memory subsystem events that light the diagnostic panel LEDs.
Table 54 Memory Subsystem Events That Light Diagnostic Panel LEDs

| Diagnostic LEDs | Sample IPMI Events | Cause | Source | Notes |
|---|---|---|---|---|
| Memory Carriers | Type 02h, 02h:07h:03h VOLTAGE_DEGRADES_TO_NON_RECOVERABLE | Voltage on memory expander is inadequate | BMC | A voltage on the memory expander is out of range (likely too low) |
| DIMMs | Type E0h, 208d:04d MEM_NO_DIMMS_INSTALLED | No memory DIMMs installed (in rank 0 of cell 0) | SFW | Light all DIMM LEDs in rank 0 of cell 0 |
| DIMMs | Type E0h, 172d:04d MEM_DIMM_SPD_CHECKSUM | A DIMM has a serial presence detect (SPD) EEPROM with a bad checksum | SFW | Either EEPROM is misprogrammed or this DIMM is incompatible |
| DIMMs | Type E0h, 4652d:26d WIN_AGT_PREDICT_MEM_FAIL | This memory rank is correcting too many single-bit errors | WIN Agent | Memory rank is about to fail or environmental conditions are causing more errors than usual |
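As an illustration of how a tool might use Table 54, the rows can be held in a small lookup keyed by event name. The event names, sources, and LED targets below come from the table; the `LedRule` type, the `TABLE_54` dictionary, and the `leds_for_event` function are hypothetical names introduced for this sketch.

```python
from typing import NamedTuple, Optional


class LedRule(NamedTuple):
    source: str  # component that reports the event
    leds: str    # diagnostic panel LEDs that are lit


# Rows transcribed from Table 54 (names and values as printed there).
TABLE_54 = {
    "VOLTAGE_DEGRADES_TO_NON_RECOVERABLE": LedRule("BMC", "Memory Carriers"),
    "MEM_NO_DIMMS_INSTALLED": LedRule("SFW", "DIMMs"),
    "MEM_DIMM_SPD_CHECKSUM": LedRule("SFW", "DIMMs"),
    "WIN_AGT_PREDICT_MEM_FAIL": LedRule("WIN Agent", "DIMMs"),
}


def leds_for_event(name: str) -> Optional[str]:
    """Return the LED group for a Table 54 event, or None if not listed."""
    rule = TABLE_54.get(name)
    return rule.leds if rule else None
```

An event that is not in the table returns `None`, which matches the messaging policy of not lighting an LED when the fault cannot be tied to a specific component.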
Table 55 lists the memory subsystem events that may light the diagnostic panel LEDs.
Table 55 Memory Subsystem Events That May Light Diagnostic Panel LEDs

| Diagnostic LEDs | Sample IPMI Events | Cause | Source | Notes |
|---|---|---|---|---|
| Processor Carrier | Type E0h, 189d:26d MEM_ERR_LOG_FAILED_TO_CLEAR | Unable to clear the platform error logs in CEC | SFW | |
| Processor Carrier | Type E0h, 181d:26d MEM_ECC_MBE_SIGNAL_TST_FAILED | Self-test of CEC multi-bit error signaling has failed | SFW | |
| Processor Carrier | Type E0h, 160d:26d MEM_BIB_REG_FAILURE | The CEC failed the register test | SFW | |
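The sample IPMI event strings in these tables mix two number notations. The sketch below splits such a string into integers, assuming that an `h` suffix marks a hexadecimal value and a `d` suffix a decimal one. That reading of the suffixes is an assumption, as are the function names; the meaning of the individual fields is not given here, so the sketch leaves them unnamed.

```python
import re

# One numeric token such as "E0h", "208d", or "04d".
_TOKEN = re.compile(r"^([0-9A-Fa-f]+)([hd])$")
# The whole prefix, for example "Type E0h, 160d:26d".
_EVENT = re.compile(r"^Type\s+(\S+),\s*(\S+)$")


def _parse_number(token: str) -> int:
    """Convert one suffixed token to an int (h = hex, d = decimal, assumed)."""
    match = _TOKEN.match(token.strip())
    if not match:
        raise ValueError(f"unrecognized number token: {token!r}")
    digits, suffix = match.groups()
    return int(digits, 16 if suffix == "h" else 10)


def parse_event(text: str):
    """Return (type_value, [field values]) for a sample event prefix."""
    match = _EVENT.match(text.strip())
    if not match:
        raise ValueError(f"unrecognized event string: {text!r}")
    type_token, fields = match.groups()
    return _parse_number(type_token), [_parse_number(f) for f in fields.split(":")]
```

The event name that follows the numeric prefix in the tables (for example `MEM_BIB_REG_FAILURE`) should be stripped before calling `parse_event`.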