Advanced System Diagnostics and Troubleshooting Guide
Software Exception Handling
The system-watchdog feature is enabled by default. The CLI commands related to system-watchdog
operation are:
enable system-watchdog
disable system-watchdog
NOTE
During the reboot cycle, network redundancy protocols will work to recover the network. The impact on
the network depends on the network topology and configuration (for example, OSPF ECMP versus a
large STP network on a single domain).
Also, if the system-watchdog feature is not enabled, error conditions might lead to extensive service
outages. All routing and redundancy protocols use the CPU to calculate proper states. Take OSPF
ECMP and STP networks as general examples. If the CPU becomes trapped in a loop, a system in an
OSPF network cannot process OSPF control messages properly, which corrupts its routing tables. In
an STP network, spanning tree BPDUs are not processed, so all paths are placed in the forwarding
state. The resulting broadcast storms cause not only data loss but also loss of general
connectivity.
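The watchdog principle behind this note can be illustrated with a small simulation. The sketch below is not ExtremeWare code: the class, the function, and the tick-based timeout are hypothetical names and values chosen only for illustration. It assumes a countdown timer that healthy software restarts on every pass through its main loop, so that a CPU trapped in a loop stops restarting the timer and lets it expire.

```python
class Watchdog:
    """Toy countdown watchdog. All names and values here are hypothetical."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks
        self.expired = False

    def kick(self):
        """Called by healthy software to restart the countdown."""
        if not self.expired:
            self.remaining = self.timeout_ticks

    def tick(self):
        """Advance one time unit; return True once the timer has expired."""
        if not self.expired:
            self.remaining -= 1
            if self.remaining <= 0:
                self.expired = True
        return self.expired


def run(ticks, hang_at, timeout_ticks=3):
    """Simulate a CPU that stops kicking the watchdog at tick `hang_at`.

    Pass hang_at=None for a CPU that never hangs. Returns the tick at
    which the watchdog expires, or None if it never expires.
    """
    wd = Watchdog(timeout_ticks)
    for t in range(ticks):
        if hang_at is None or t < hang_at:
            wd.kick()  # normal operation: the main loop restarts the timer
        if wd.tick():
            return t   # expiry: a real system would start recovery here
    return None
```

In this model, a CPU that never hangs never triggers the watchdog, while a hang is detected a bounded number of ticks after the last restart. With the watchdog disabled there is no such bound, which is the outage scenario the note describes.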
System Software Exception Recovery Behavior
ExtremeWare provides commands to configure system recovery behavior when a software exception
occurs.
• Recovery behavior: configure sys-recovery-level command
• Reboot behavior: configure reboot-loop-protection command
• System dump behavior: configure system-dump server command, configure system-dump timeout
command, and upload system-dump command
These commands and their uses are described in these sections:
• “Configuring System Recovery Actions” on page 40
• “Configuring Reboot Loop Protection” on page 43
• “Dumping the “i” Series Switch System Memory” on page 45
Redundant MSM Behavior
A number of events can cause an MSM failover to occur, including:
• Software exception; system watchdog timer expiry
• Diagnostic failure (extended diagnostics, transceiver check/scan, FDB scan failure/remap)
• Hot removal of the master MSM or hard-reset of the master MSM
The MSM failover behavior depends on the following factors:
• Platform type and equipage
• Software configuration settings for the software exception handling options such as system
watchdog, system recovery level, and reboot loop protection. (For more information on the
configuration settings, see Chapter 4, “Software Exception Handling.”)
In normal operation, the master MSM continuously resets the watchdog timer. If the watchdog timer
expires, the slave MSM will either 1) reboot the chassis and take over as the master MSM (when the