Draft Document for Review October 14, 2014 10:19 am
IBM Power Systems E870 and E880 Technical Overview and Introduction
4.6.3 Service processor
In POWER8 processor-based systems that have a dedicated service processor, that
service processor is primarily responsible for fault analysis of processor and memory errors.
The service processor is a microprocessor that is powered separately from the main
instruction processing complex.
In addition to FFDC functions, the service processor performs many serviceability functions:
Several remote power control options
Reset and boot features
Environmental monitoring
The service processor interfaces with the OCC function, which monitors the server’s
built-in temperature sensors and sends instructions to the system fans to increase
rotational speed when the ambient temperature is above the normal operating range. By
using a designed operating system interface, the service processor notifies the operating
system of potential environmentally related problems so that the system administrator can
take appropriate corrective actions before a critical failure threshold is reached. The
service processor can also post a warning and initiate an orderly system shutdown in the
following circumstances:
– The operating temperature exceeds the critical level (for example, failure of air
conditioning or air circulation around the system).
– The system fan speed is out of operational specification (for example, because of
multiple fan failures).
– The server input voltages are out of operational specification.
The service processor can shut down a system in the following circumstances:
• The temperature exceeds the critical level or remains above the warning level for
too long.
• Internal component temperatures reach critical levels.
• Non-redundant fan failures occur.
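The environmental conditions listed above can be summarized as a small decision routine. The following Python sketch is illustrative only and is not IBM firmware code: the names `Action`, `Limits`, `Readings`, and `evaluate` are hypothetical, and every threshold value is a placeholder, because this section does not state the actual limits. The routine reports every condition that holds and does not assume any priority among them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Action(Enum):
    RAISE_FAN_SPEED = "raise fan speed"
    WARN_AND_ORDERLY_SHUTDOWN = "post warning and initiate orderly shutdown"
    SHUTDOWN = "shut down"


@dataclass(frozen=True)
class Limits:
    """Placeholder thresholds for illustration; not IBM specifications."""
    normal_max_c: float = 35.0
    warning_c: float = 40.0
    critical_c: float = 45.0
    max_seconds_above_warning: float = 600.0


@dataclass(frozen=True)
class Readings:
    """Hypothetical snapshot of the monitored environmental state."""
    ambient_c: float
    seconds_above_warning: float = 0.0
    component_temp_critical: bool = False
    fan_speed_in_spec: bool = True
    nonredundant_fan_failed: bool = False
    input_voltage_in_spec: bool = True


def evaluate(r: Readings, lim: Limits = Limits()) -> List[Tuple[Action, str]]:
    """Return every (action, reason) pair whose condition currently holds."""
    found: List[Tuple[Action, str]] = []
    if r.ambient_c > lim.normal_max_c:
        found.append((Action.RAISE_FAN_SPEED, "ambient above normal operating range"))
    # Conditions for a warning followed by an orderly shutdown.
    if r.ambient_c > lim.critical_c:
        found.append((Action.WARN_AND_ORDERLY_SHUTDOWN, "operating temperature above critical level"))
    if not r.fan_speed_in_spec:
        found.append((Action.WARN_AND_ORDERLY_SHUTDOWN, "fan speed out of specification"))
    if not r.input_voltage_in_spec:
        found.append((Action.WARN_AND_ORDERLY_SHUTDOWN, "input voltage out of specification"))
    # Conditions under which the service processor can shut down the system.
    too_long = (r.ambient_c > lim.warning_c
                and r.seconds_above_warning > lim.max_seconds_above_warning)
    if r.ambient_c > lim.critical_c or too_long:
        found.append((Action.SHUTDOWN, "temperature critical or above warning level too long"))
    if r.component_temp_critical:
        found.append((Action.SHUTDOWN, "internal component temperature critical"))
    if r.nonredundant_fan_failed:
        found.append((Action.SHUTDOWN, "non-redundant fan failure"))
    return found
```

For example, `evaluate(Readings(ambient_c=25.0, input_voltage_in_spec=False))` returns a single warning-and-orderly-shutdown entry, while a reading within all placeholder limits returns an empty list.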
POWER Hypervisor (system firmware) and HMC connection surveillance
The service processor monitors the operation of the firmware during the boot process, and
also monitors the hypervisor for termination. The hypervisor monitors the service
processor and can perform a reset and reload if it detects the loss of the service
processor. If the reset/reload operation does not correct the problem with the service
processor, the hypervisor notifies the operating system, which can then take
appropriate action, including calling for service. The FSP also monitors the
connection to the HMC and can report loss of connectivity to the operating system
partitions for system administrator notification.
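The escalation path that the hypervisor follows when it loses contact with the service processor can be sketched as follows. This is a minimal illustration under stated assumptions, not the firmware implementation: the function name and its three callback parameters are hypothetical stand-ins for the real detection, reset/reload, and notification mechanisms, which this section does not describe in detail.

```python
from typing import Callable


def handle_service_processor_loss(
    is_responsive: Callable[[], bool],
    reset_and_reload: Callable[[], None],
    notify_operating_system: Callable[[str], None],
) -> str:
    """Run one surveillance pass from the hypervisor's point of view."""
    if is_responsive():
        return "healthy"
    # Loss of the service processor detected: attempt a reset and reload.
    reset_and_reload()
    if is_responsive():
        return "recovered"
    # Reset/reload did not correct the problem: hand off to the operating
    # system, which can then take action such as calling for service.
    notify_operating_system("service processor not recovered by reset/reload")
    return "escalated"


# Simulated run: the first check fails and the check after reset/reload succeeds.
responses = iter([False, True])
events = []
state = handle_service_processor_loss(
    lambda: next(responses),
    lambda: events.append("reset/reload"),
    events.append,
)
print(state, events)
```

Passing the three behaviors as callables keeps the sketch independent of any particular interface, so only the order of the steps is asserted: detect, reset and reload, check again, then notify the operating system.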
Uncorrectable error recovery
The auto-restart (reboot) option, when enabled, can reboot the system automatically
following an unrecoverable firmware error, firmware hang, hardware failure, or
environmentally induced (AC power) failure.
The auto-restart (reboot) option must be enabled from the ASMI or from the Control
(Operator) Panel.
Concurrent access to the service processor menus of the ASMI
This access makes it possible to nondisruptively change system default parameters,
interrogate service processor progress and error logs, and set and reset service indicators