configure sys-health-check auto-recovery
ExtremeWare 7.5 Command Reference Guide
On BlackDiamond switches, the auto-recovery option configures the number of times the system
health checker attempts to automatically reset a faulty module and bring it online. If the system health
checker fails more than the configured number of attempts, it sets the module to card-down. This
threshold applies only to BlackDiamond I/O modules.
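The retry behavior described above can be pictured with a small model. This is only an illustrative sketch, not ExtremeWare code: the function name auto_recover, the reset_module callback, and the returned state strings are hypothetical, and the sketch assumes the module is marked card-down once the configured number of reset attempts has been used up without success.

```python
def auto_recover(reset_module, max_attempts):
    """Model of the auto-recovery threshold (hypothetical names).

    reset_module: callable returning True if the reset brought the
                  module back online, False otherwise.
    max_attempts: the configured number of auto-recovery attempts.
    """
    for _ in range(max_attempts):
        if reset_module():
            return "online"
    # Assumption: every configured attempt failed, so the module is
    # set to card-down.
    return "card-down"


# A module that recovers on the third reset stays online with a
# threshold of 3, but would be set to card-down with a threshold of 2.
outcomes = iter([False, False, True])
print(auto_recover(lambda: next(outcomes), 3))
```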
In ExtremeWare 6.2.1 or later, when auto-recovery is configured, the occurrence of three consecutive
checksum errors causes the packet memory (PM) defect detection program to be run against the I/O
module. Checksum errors can include internal and external MAC port parity errors, EDP checksum
errors, and CPU packet or diagnostic packet checksum errors. If defects are detected, the module is
taken off line, the memory defect information is recorded in the module EEPROM, the defective buffer
is mapped out of further use, and the module is returned to operational state. A maximum of 8 defects
can be stored in the EEPROM.
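The trigger condition, three consecutive checksum errors, can be modeled as a simple counter. The following is a hedged sketch rather than the switch's actual implementation: the class name ChecksumMonitor and its method are hypothetical, and it assumes that a clean health-check result resets the consecutive-error count.

```python
class ChecksumMonitor:
    """Counts consecutive checksum errors (hypothetical model)."""

    THRESHOLD = 3  # consecutive errors that trigger the PM defect scan

    def __init__(self):
        self.consecutive = 0

    def record(self, checksum_ok):
        """Record one health-check result.

        Returns True when the packet memory defect detection
        program should be run against the I/O module.
        """
        if checksum_ok:
            # Assumption: a clean result breaks the run of errors.
            self.consecutive = 0
            return False
        self.consecutive += 1
        if self.consecutive >= self.THRESHOLD:
            self.consecutive = 0  # start counting afresh after a scan
            return True
        return False


monitor = ChecksumMonitor()
# Two errors, a clean result, then three errors in a row:
results = [monitor.record(ok) for ok in (False, False, True, False, False, False)]
print(results)
```

In this model only the last result triggers a scan, because the clean result in the middle interrupts the first run of errors.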
After the PM defect detection and mapping process has been run, a module is considered failed and is
taken off line in the following circumstances:
• More than eight defects are detected.
• Three consecutive checksum errors were detected by the health checker, but no new defects were
  found by the memory scanning and mapping process.
• After defects were detected and mapped out, the same checksum errors are again detected by the
  system health checker.
In any of these cases, the auto-recovery repetition value is ignored. Contact Extreme Technical
Support.
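The three failure conditions listed above can be summarized as a decision function. This is a sketch under stated assumptions, not the ExtremeWare implementation: the function name, its parameters, and the returned strings are hypothetical, and the inputs are assumed to be known once the PM defect detection and mapping process has finished.

```python
MAX_STORED_DEFECTS = 8  # EEPROM capacity stated in the text


def module_state_after_scan(total_defects, new_defects, errors_recurred):
    """Classify a module after the PM scan (hypothetical model).

    total_defects:   defects detected on the module so far
    new_defects:     defects found by this scan
    errors_recurred: True if the same checksum errors are detected
                     again after the defects were mapped out
    """
    if total_defects > MAX_STORED_DEFECTS:
        return "failed: more than eight defects"
    if new_defects == 0:
        # The scan was triggered by checksum errors but found nothing new.
        return "failed: checksum errors with no new defects"
    if errors_recurred:
        return "failed: checksum errors recurred after mapping"
    return "operational"


print(module_state_after_scan(total_defects=2, new_defects=2, errors_recurred=False))
print(module_state_after_scan(total_defects=9, new_defects=1, errors_recurred=False))
```

In every "failed" outcome the configured auto-recovery repetition value plays no part, which is why the function does not take it as an argument.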
Auto-recovery mode only affects an MSM if the system has no slave MSM. If the faulty module is the
only MSM in the system, auto-recovery automatically resets the MSM and brings it back on line.
Otherwise, auto-recovery has no effect on an MSM.
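The MSM rule reduces to a single condition. As a minimal sketch (the function name and parameters are hypothetical and only restate the paragraph above):

```python
def auto_recovery_resets_msm(faulty_module_is_msm, system_has_slave_msm):
    """Return True if auto-recovery resets a faulty MSM (hypothetical model).

    Auto-recovery acts on an MSM only when that MSM is the sole MSM
    in the system, that is, when no slave MSM is present.
    """
    return faulty_module_is_msm and not system_has_slave_msm


print(auto_recovery_resets_msm(True, False))  # single-MSM system
print(auto_recovery_resets_msm(True, True))   # slave MSM present
```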
If you specify the online option, the module is kept on line, but the following error messages are
recorded in the log:
<WARN:SYST> card_db.c 832: Although card 2 is back online, contact Tech. Supp. for assistance.
<WARN:SYST> card_db.c 821: Card 2 has nonrecoverable packet memory defect.
To view the status of the system health checker, use the show diagnostics command.
To enable the health checker, use the enable sys-health-check command.
To disable the health checker, use the disable sys-health-check command.
The alarm-level system-down option is especially useful in an ESRP configuration where the entire
system is backed by an identical system. Powering down the faulty system ensures that erratic
ESRP behavior in the faulty system does not affect ESRP performance, and it ensures full system
failover to the redundant system.
If you are using ESRP with ESRP diagnostic tracking enabled in your configuration, a system health
check failure automatically reduces the ESRP priority of the system to the configured failover
priority. This allows the healthy standby system to take over ESRP and become responsible for handling
traffic.
I/O module faults are permanently recorded on the module EEPROM. A module that has failed a
system health check cannot be brought back online.