Until memory is configured, early code is fetched from PDH. During normal execution, code is
fetched from main memory.
Local machine check abort (MCA) events cause the physical CPU core and one or both of its
logical CPUs within that IPF processor module to fail, while all other physical CPU cores and
their logical CPUs continue operating. Double-bit data cache errors in any physical CPU core
cause a Global MCA event, which causes all logical and physical CPUs in the server to fail and
the operating system to reboot.
Customer Messaging Policy
• A diagnostic LED lights for physical CPU core errors only when the error is isolated to a
  specific IPF processor module. If there is any uncertainty about which CPU is at fault, the
  customer is pointed to the SEL for any action, and the suspect IPF processor module’s CRU
  LED on the diagnostic panel is not lit.
• For configuration-style errors (for example, when no IPF processor module is installed in
  CPU socket 0), the CRU LEDs on the diagnostic LED panel are lit for all of the IPF processor
  modules that are missing.
• No diagnostic messages are reported for single-bit errors that are corrected in the
  instruction and data caches during corrected machine check (CMC) events on any physical CPU
  core. Diagnostic messages are reported for CMC events when the thresholds for single-bit
  errors are exceeded. Fatal processor errors cause global or local MCA events.
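The messaging policy above can be read as a small decision procedure. The following is a minimal sketch only, not the server firmware's logic: the names `ProcessorEvent`, `Action`, and `messaging_action`, and the three event kinds, are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    """Hypothetical outcomes of the customer messaging policy."""
    LIGHT_MODULE_LED = auto()           # light the CRU LED of the isolated module
    LIGHT_MISSING_MODULE_LEDS = auto()  # light CRU LEDs of all missing modules
    POINT_TO_SEL = auto()               # no LED; customer is pointed to the SEL
    REPORT_DIAGNOSTIC_MESSAGE = auto()  # CMC threshold exceeded
    NO_MESSAGE = auto()                 # corrected single-bit error under threshold


@dataclass
class ProcessorEvent:
    # kind is one of "core_error", "config_error", "cmc" (illustrative labels)
    kind: str
    isolated_to_module: bool = False
    threshold_exceeded: bool = False


def messaging_action(event: ProcessorEvent) -> Action:
    """Map an event to the action described in the policy bullets."""
    if event.kind == "config_error":
        return Action.LIGHT_MISSING_MODULE_LEDS
    if event.kind == "core_error":
        # An LED lights only when the error is isolated to a specific module.
        if event.isolated_to_module:
            return Action.LIGHT_MODULE_LED
        return Action.POINT_TO_SEL
    if event.kind == "cmc":
        # Corrected errors are reported only once a threshold is exceeded.
        if event.threshold_exceeded:
            return Action.REPORT_DIAGNOSTIC_MESSAGE
        return Action.NO_MESSAGE
    raise ValueError(f"unknown event kind: {event.kind!r}")
```

For example, a core error that cannot be isolated to one module yields `Action.POINT_TO_SEL`, which matches the rule that the suspect module's CRU LED stays off when there is any uncertainty.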
Table 52 lists the processor events that light the diagnostic panel LEDs.
Table 52 Processor Events That Light Diagnostic Panel LEDs

| Diagnostic LEDs | Sample IPMI Events | Cause | Source | Notes |
|---|---|---|---|---|
| Processors | Type E0h, 39d:04d BOOT_DECONFIG_CPU | Processor failed and deconfigured | SFW | This event will likely follow other failed processor(s) |
| Processors | Type E0h, 5823d:26d PFM_CACHE_ERR_PROC | Too many cache errors detected by processor | WIN Agent | Threshold exceeded for cache parity errors on processor |
| Processors | Type E0h, 5824d:26d PFM_CORR_ERROR_MEM | Too many corrected errors detected by platform | WIN Agent | Threshold exceeded for cache errors from processor corrected by zx2 |
| Processors | Type 02h, 02h:07h:03h VOLTAGE_DEGRADES_TO_NON_RECOVERABLE | Voltage on FRU is inadequate | BMC | Power Pod voltage is out of range (likely too low) |
| Processor Carrier | Type 02h, 02h:07h:03h VOLTAGE_DEGRADES_TO_NON_RECOVERABLE | Voltage on FRU is inadequate | BMC | A voltage on the processor carrier is out of range (likely too low) |
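When scanning an event log, it can help to hold the rows of Table 52 as a lookup structure. The sketch below is illustrative only: the event names, types, and sources are copied from the table, but the `EVENTS` list and the `leds_for` helper are hypothetical names and do not belong to any HP tool or IPMI library.

```python
# Rows of Table 52 as plain records (values copied from the table).
EVENTS = [
    {"led": "Processors", "type": "E0h", "id": "39d:04d",
     "name": "BOOT_DECONFIG_CPU", "source": "SFW"},
    {"led": "Processors", "type": "E0h", "id": "5823d:26d",
     "name": "PFM_CACHE_ERR_PROC", "source": "WIN Agent"},
    {"led": "Processors", "type": "E0h", "id": "5824d:26d",
     "name": "PFM_CORR_ERROR_MEM", "source": "WIN Agent"},
    {"led": "Processors", "type": "02h", "id": "02h:07h:03h",
     "name": "VOLTAGE_DEGRADES_TO_NON_RECOVERABLE", "source": "BMC"},
    {"led": "Processor Carrier", "type": "02h", "id": "02h:07h:03h",
     "name": "VOLTAGE_DEGRADES_TO_NON_RECOVERABLE", "source": "BMC"},
]


def leds_for(event_name: str) -> list[str]:
    """Return the diagnostic LEDs (sorted, de-duplicated) listed for an event name."""
    return sorted({row["led"] for row in EVENTS if row["name"] == event_name})
```

Note that `VOLTAGE_DEGRADES_TO_NON_RECOVERABLE` appears in two rows, so the event name alone does not identify a single LED; the Notes column distinguishes a Power Pod voltage fault from a processor carrier voltage fault.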
The following table lists the processor events that may light the diagnostic panel LEDs.