Recovery
Issue 1 May 2002
3-7
555-233-143
Watchdog’s Hardware Timer
The Watchdog’s HiMonitor resets the timer on the hardware Watchdog circuitry
via the Hardware-Sanity device driver. If the Watchdog is unable to perform that
task, then the timer’s value eventually decrements to 0, and the processor is
reset.
Hardware-Sanity Device Driver
The Hardware-Sanity device driver (a loadable module) is a modified Linux driver
for the hardware Watchdog. A Sanity thread periodically writes to the
Hardware-Sanity driver, which resets the timer on the hardware Watchdog. If the
Sanity thread does not write to the Hardware-Sanity driver, the:
■ Driver does not reset the timer on the hardware Watchdog
■ Timer expires
■ Hardware Watchdog reboots Linux
The driver has three capabilities: set the time-out interval to a configurable
value, reset the timer to the time-out interval, and reboot Linux.
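The three capabilities can be pictured with a small simulation. The Python sketch below is illustrative only: the class SimulatedHardwareWatchdog, its method names, and the run helper are hypothetical and do not reflect the actual driver interface. It models a countdown timer that triggers a reboot when it reaches 0 unless a Sanity thread resets it in time.

```python
class SimulatedHardwareWatchdog:
    """Toy model of a countdown watchdog timer (hypothetical, not the real driver)."""

    def __init__(self, timeout_seconds):
        self.set_timeout(timeout_seconds)
        self.rebooted = False

    def set_timeout(self, timeout_seconds):
        # Capability 1: set the time-out interval to a configurable value.
        if timeout_seconds <= 0:
            raise ValueError("time-out interval must be positive")
        self.timeout_seconds = timeout_seconds
        self.remaining = timeout_seconds

    def reset_timer(self):
        # Capability 2: reset the timer to the time-out interval.
        self.remaining = self.timeout_seconds

    def tick(self, seconds=1):
        # Simulated passage of time. When the timer reaches 0, the
        # model records a reboot (capability 3).
        if self.rebooted:
            return
        self.remaining = max(0, self.remaining - seconds)
        if self.remaining == 0:
            self.rebooted = True


def run(watchdog, sanity_writes):
    """Advance one second per entry; True means the Sanity thread wrote that second."""
    for wrote in sanity_writes:
        if wrote:
            watchdog.reset_timer()
        watchdog.tick()
    return watchdog.rebooted
```

In this model, a Sanity thread that writes every second never lets the timer expire, while one that stops writing sees a reboot once the configured interval has elapsed.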
Rolling Reboots
There may be cases where recovering the system using a reboot does not correct
the problem. If this occurs, the server continually reboots. This repeated
rebooting increases the difficulty of diagnosing the problem. The Watchdog
handles this with the “MaxReboots” and “MaxRebootInterval” parameters in the
watchd.conf file. (The default values are currently set to 3 reboots within 60
minutes.) If Watchdog detects that the software is rebooting too quickly, it logs
a message to syslog and does not start any processes. When running in this mode,
Watchdog’s sole purpose is to reset the hardware Watchdog.
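The rolling-reboot check can be sketched as a count of recent reboots inside a time window. The Python example below is a minimal sketch under stated assumptions: the function name rebooting_too_quickly is hypothetical, the defaults mirror the 3 reboots within 60 minutes quoted above, and treating "at least MaxReboots reboots inside the interval" as the trigger is an assumption, since the exact comparison Watchdog uses is not stated here.

```python
from datetime import datetime, timedelta


def rebooting_too_quickly(reboot_times, now,
                          max_reboots=3, max_reboot_interval_minutes=60):
    """Return True if at least max_reboots reboots fall in the window ending at now.

    Hypothetical helper: the threshold semantics (>=) are an assumption.
    """
    window_start = now - timedelta(minutes=max_reboot_interval_minutes)
    recent = [t for t in reboot_times if window_start <= t <= now]
    return len(recent) >= max_reboots


# Usage: three reboots in the last hour trip the check, two do not.
now = datetime(2002, 5, 1, 12, 0)
fast = [now - timedelta(minutes=m) for m in (50, 30, 5)]
slow = [now - timedelta(minutes=m) for m in (90, 30, 5)]
fast_result = rebooting_too_quickly(fast, now)
slow_result = rebooting_too_quickly(slow, now)
```

When such a check trips, the text above says Watchdog starts no processes and only keeps resetting the hardware Watchdog, which leaves the system up long enough to diagnose.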
Restarts
The term “restart” is a traditional Avaya term for a system restart of lesser
severity than a full recreation. Restarts were accomplished by retaining the
memory state of certain processes.
The Watchdog process is not restartable, nor can it invoke restarts in
MultiVantage. In addition, none of the other Watchdog-started applications can
restart. (They are reloaded, as previously described.) If the Watchdog itself dies,
the parent Watchdog process restarts it. If it repeatedly dies (10 times in 2
minutes), init logs a message to syslog for the GMM to process. GMM lowers the
SOH, which causes a server interchange. Eventually, the hardware Watchdog
resets the processor, since Watchdog is no longer resetting the hardware
Watchdog.