You can enable or disable CPU Repeat Gard or Memory Repeat Gard using the
Processor Configuration/Deconfiguration Menu, which is a submenu under the System
Information Menu.
Run-Time CPU Deconfiguration (CPU Gard)
L1 instruction cache recoverable errors, L1 data cache correctable errors, and L2 cache
correctable errors are monitored by the processor runtime diagnostics (PRD) code
running in the service processor. When a predefined error threshold is met, an error log
entry with warning severity and threshold exceeded status is returned to AIX. At the
same time, PRD marks the CPU for deconfiguration at the next boot. AIX will attempt to
migrate all resources associated with that processor to another processor and then stop
the defective processor.
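The threshold behavior described above can be sketched as a simple counter. This is an illustration only, not the PRD implementation: the guide does not give the real threshold value, and the names ERROR_THRESHOLD, CpuState, and record_recoverable_error are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical value; the guide only says the threshold is "predefined".
ERROR_THRESHOLD = 3


@dataclass
class CpuState:
    """Per-processor bookkeeping (illustrative, not a real firmware structure)."""
    cpu_id: int
    recoverable_errors: int = 0
    deconfigure_at_next_boot: bool = False


def record_recoverable_error(cpu: CpuState, threshold: int = ERROR_THRESHOLD) -> bool:
    """Count one recoverable cache error.

    Returns True only on the call that first reaches the threshold, which is
    the point where a warning would be logged and the CPU marked for
    deconfiguration at the next boot.
    """
    cpu.recoverable_errors += 1
    if cpu.recoverable_errors >= threshold and not cpu.deconfigure_at_next_boot:
        cpu.deconfigure_at_next_boot = True
        return True
    return False
```

In this sketch the caller would report the warning to the operating system when the function returns True. Later errors on the same CPU do not raise the warning again.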
Service Processor System Monitoring - Surveillance
Surveillance is a function in which the service processor monitors the system, and the
system monitors the service processor. This monitoring is accomplished by periodic
samplings called heartbeats.
Surveillance is available during two phases:
- System firmware bringup (automatic)
- Operating system run time (optional)
System Firmware Surveillance
System firmware surveillance is automatically enabled during system power-on. It
cannot be disabled by the user.
If the service processor detects no heartbeats for 7 minutes during system IPL, it
cycles the system power to attempt a reboot. The maximum number of retries is set
from the service processor menus. If the fail condition persists, the service processor
leaves the machine powered on, logs an error, and displays menus to the user. If
call-out is enabled, the service processor calls to report the failure and displays the
operating system surveillance failure code on the operator panel.
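The retry behavior described above can be sketched as a small supervision loop. This is a simplified model under stated assumptions, not the service processor firmware: supervise_ipl and the ipl_attempt callback are hypothetical names, and only the 7-minute timeout comes from the text.

```python
from typing import Callable, Tuple

# From the text: 7 minutes without a heartbeat during IPL counts as a failure.
HEARTBEAT_TIMEOUT_S = 7 * 60


def supervise_ipl(ipl_attempt: Callable[[float], bool],
                  max_retries: int) -> Tuple[bool, int]:
    """Repeat IPL attempts until heartbeats appear or retries run out.

    ipl_attempt(timeout_s) is a hypothetical callback that returns True if
    heartbeats were seen within timeout_s seconds, and False otherwise.
    max_retries models the retry limit set from the service processor menus.
    Returns (booted, power_cycles).
    """
    power_cycles = 0
    while True:
        if ipl_attempt(HEARTBEAT_TIMEOUT_S):
            return True, power_cycles
        if power_cycles >= max_retries:
            # Failure persists: the machine stays powered on, an error is
            # logged, and the menus are displayed (not modeled here).
            return False, power_cycles
        power_cycles += 1  # cycle system power and attempt another IPL
```

For example, an attempt sequence of fail, fail, succeed with max_retries=3 returns (True, 2): two power cycles were needed before heartbeats were detected.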
Operating System Surveillance
Operating system surveillance provides the service processor with a means to detect
hang conditions, as well as hardware or software failures, while the operating system is
running. It also provides the operating system with a means to detect a service
processor failure, indicated by the lack of a return heartbeat.
Operating system surveillance is not enabled by default, allowing you to run operating
systems that do not support this service processor option.
You can also use the service processor menus and the AIX diagnostic service aids to
enable or disable operating system surveillance.
For operating system surveillance to work correctly, you must set three parameters:
- Surveillance enable/disable