Service Processor System Monitoring - Surveillance
Surveillance is a function in which the service processor monitors the system, and the
system monitors the service processor. This monitoring is accomplished by periodic
samplings called heartbeats.
Surveillance is available during two phases:
- System firmware startup (automatic)
- Operating system run time (optional)
System Firmware Surveillance
System firmware surveillance provides the service processor with a means to detect
boot failures while the system firmware is running.
System firmware surveillance is automatically enabled during system power-on. It
cannot be disabled by the user, and the surveillance interval and surveillance delay
cannot be changed by the user.
If the service processor detects no heartbeats for a set period of time during system
boot, it cycles the system power to attempt a reboot. The maximum number of retries
is set from the service processor menus. If the failure condition repeats, the service
processor leaves the machine powered on, logs an error, and displays menus to the
user. If call-out is enabled, the service processor calls to report the failure and displays
the operating-system surveillance failure code on the operator panel.
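The retry behavior described above can be sketched as a small simulation. This is a minimal illustration only, not the service processor's firmware: the function name `firmware_surveillance`, the `boot_attempt` callback, and the action strings are all hypothetical, and the sketch assumes that the operator panel code is displayed only when call-out is enabled, which is one reading of the sentence above.

```python
from typing import Callable, List


def firmware_surveillance(
    boot_attempt: Callable[[], bool],
    max_retries: int,
    call_out_enabled: bool,
) -> List[str]:
    """Return the sequence of actions taken by a hypothetical monitor.

    boot_attempt() models one boot: it returns True if heartbeats were
    seen within the (fixed, not user-settable) timeout, False otherwise.
    """
    actions: List[str] = []
    # One initial boot plus up to max_retries power-cycled reboots.
    for attempt in range(max_retries + 1):
        if boot_attempt():
            actions.append("boot ok")
            return actions
        if attempt < max_retries:
            actions.append("power cycle")
    # Retries exhausted: stay powered on, log, and present menus.
    actions += ["leave powered on", "log error", "display menus"]
    if call_out_enabled:
        # Assumption: the panel code is tied to call-out being enabled.
        actions += ["call out", "show failure code on operator panel"]
    return actions
```

For example, with `max_retries=2` and a boot that succeeds on the third try, the sketch power-cycles twice and then reports a successful boot.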
Operating System Surveillance
Operating system surveillance provides the service processor with a means to
detect hang conditions, as well as hardware or software failures, while the operating
system is running. It also provides the operating system with a means to detect a
service processor failure, indicated by the lack of a return heartbeat.
Operating system surveillance is not enabled by default, allowing the user to run
operating systems that do not support this service processor option.
You can use the service processor menus or an AIX service aid to enable or disable
operating system surveillance.
For operating system surveillance to work correctly, you must set the following
parameters:
- Surveillance enable/disable
- Surveillance interval
  The maximum time (in minutes) that the service processor waits between heartbeats
  from the operating system before reporting a surveillance failure.
- Surveillance delay
  The maximum time (in minutes) that the service processor waits for the first
  heartbeat from the operating system after the operating system has been started,
  before reporting a surveillance failure.
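How the surveillance delay and surveillance interval interact can be shown with a short sketch. It is an illustration under stated assumptions, not the service processor's actual logic: the function `first_surveillance_failure` and its parameters are hypothetical, times are expressed in minutes since an arbitrary origin, and every recorded heartbeat is assumed to have arrived at or before `now`.

```python
from typing import List, Optional


def first_surveillance_failure(
    os_start: float,
    heartbeats: List[float],
    delay_min: float,
    interval_min: float,
    now: float,
) -> Optional[float]:
    """Return the time at which a failure would be reported, or None.

    The first heartbeat must arrive within delay_min of os_start;
    each later heartbeat must arrive within interval_min of the
    previous one.
    """
    deadline = os_start + delay_min  # deadline for the first heartbeat
    for t in sorted(heartbeats):
        if t > deadline:
            return deadline  # the heartbeat came too late
        deadline = t + interval_min  # deadline for the next heartbeat
    # No late heartbeat so far; fail only if the current deadline has passed.
    return deadline if now > deadline else None
```

With a delay of 5 minutes and an interval of 1 minute, an operating system started at time 0 that sends heartbeats at minutes 2, 3, and 4 is still healthy at minute 4.5, while one that sends no heartbeat at all is reported as failed at minute 5.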