Fault management overview
The goal of fault management and monitoring is to increase system availability by moving from a reactive
strategy for fault detection, diagnosis, and repair to a proactive one.
The objectives are as follows:
• To detect issues automatically, as close as possible to the time they occur.
• To diagnose issues automatically, at the time of detection.
• To report automatically, in understandable text, a description of the issue, its likely causes, the
  recommended actions to resolve it, and detailed information about it.
• To ensure that tools are available to repair or recover from the fault.
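The reporting objective above names four pieces of information that a fault report should carry. As a minimal sketch only, the following Python record shows one way those four pieces could be grouped and rendered as readable text. The class name FaultReport, its field names, and the sample values are hypothetical and are not part of HP-UX or SysFaultMgmt.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FaultReport:
    """Hypothetical record; the fields mirror the four report items listed above."""
    description: str                 # what happened
    likely_causes: List[str]         # probable reasons for the issue
    recommended_actions: List[str]   # steps to resolve the issue
    details: Dict[str, str] = field(default_factory=dict)  # extra information

    def to_text(self) -> str:
        """Render the report as plain, human-readable text."""
        lines = [f"Description: {self.description}", "Likely causes:"]
        lines += [f"  - {cause}" for cause in self.likely_causes]
        lines.append("Recommended actions:")
        lines += [f"  - {action}" for action in self.recommended_actions]
        lines.append("Details:")
        lines += [f"  {key}: {value}" for key, value in sorted(self.details.items())]
        return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only; not taken from a real system.
    report = FaultReport(
        description="Fan speed below threshold",
        likely_causes=["Failing fan", "Blocked airflow"],
        recommended_actions=["Inspect the fan", "Replace the fan if it is faulty"],
        details={"component": "fan 2"},
    )
    print(report.to_text())
```

Keeping the four items as separate fields, rather than one free-form message, lets a reader or a tool pick out the recommended actions without parsing the description.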
HP-UX fault management
Proactive fault prediction and notification are provided on HP-UX by SysFaultMgmt WBEM indication
providers. WBEM (Web-Based Enterprise Management) provides frameworks for monitoring and reporting events.
SysFaultMgmt WBEM indication providers enable users to monitor the operation of a wide variety of
hardware products, and they alert users immediately if a failure or other unusual event occurs. By using
hardware event monitoring, users can virtually eliminate undetected hardware failures that could interrupt
system operation or cause data loss.