Use the CLI
As an alternative to using the PowerVault Manager, you can run the show system CLI command to view the health of
the system and its components. If any component has a problem, the system health is in a Degraded, Fault, or Unknown
state, and those components are listed as Unhealthy Components. Follow the recommended actions in the component Health
Recommendation field to resolve the problem.
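If you record component health in a script or spreadsheet, the triage described above can be sketched in a few lines of Python. This is only an illustration: the component names and the "OK" label are assumptions, the sample data is hypothetical rather than real command output, and the sketch does not parse the CLI output format.

```python
# Health states that this section names as indicating a problem.
UNHEALTHY_STATES = {"Degraded", "Fault", "Unknown"}


def unhealthy_components(health_by_component):
    """Return the sorted names of components whose health needs attention."""
    return sorted(
        name
        for name, health in health_by_component.items()
        if health in UNHEALTHY_STATES
    )


# Hypothetical values entered by hand; "OK" is an assumed label for a
# healthy component and the component names are placeholders.
sample = {"controller_a": "OK", "psu_1": "Degraded", "disk_0.4": "Fault"}
print(unhealthy_components(sample))
```

For each name returned, follow the action given in that component's Health Recommendation field.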
Monitor event notification
With event notification configured and enabled, you can view event logs to monitor the health of the system and its
components. If a message tells you to check whether an event has been logged, or to view information about an event, use the
PowerVault Manager or the CLI. Using the PowerVault Manager, view the event log and then click the event message to see
detail about that event. Using the CLI, run the show events detail command to see the detail for an event.
View the enclosure LEDs
You can view the LEDs on the hardware to identify component status. If a problem prevents access to the PowerVault Manager
or the CLI, viewing the enclosure LEDs is the only option available. However, monitoring and management are often done at a
management console using storage management interfaces, rather than by relying on line-of-sight to the LEDs of racked hardware
components.
Performing basic steps
You can use any of the available options that are described in the previous sections to perform the basic steps comprising the
fault isolation methodology.
Gather fault information
When a fault occurs, gather as much information as possible. Doing so helps determine the correct action that is needed to
remedy the fault.
Begin by reviewing the reported fault:
● Is the fault related to an internal data path or an external data path?
● Is the fault related to a hardware component such as a disk drive module, controller module, or power supply unit?
By isolating the fault to one of the components within the storage system, you can determine the necessary corrective
action more quickly.
Determine where the fault is occurring
When a fault occurs, the Module Fault LED illuminates. Check the LEDs on the back of the enclosure to narrow the fault to a
CRU, connection, or both. The LEDs also help you identify the location of a CRU reporting a fault.
Use the PowerVault Manager to verify any faults found while viewing the LEDs. If the LEDs cannot be viewed due to the
location of the system, use the PowerVault Manager to determine where the fault is occurring. This web application provides
a visual representation of the system and shows where the fault is occurring. The PowerVault Manager also provides more
detailed information about CRUs, data, and faults.
Review the event logs
The event logs record all system events. Each event has a numeric code that identifies the type of event that occurred, and has
one of the following severities:
● Critical – A failure occurred that may cause a controller to shut down. Correct the problem immediately.
● Error – A failure occurred that may affect data integrity or system stability. Correct the problem as soon as possible.
● Warning – A problem occurred that may affect system stability, but not data integrity. Evaluate the problem and correct it if
necessary.
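When reviewing many logged events at once, it can help to order them by urgency. The following is a minimal sketch, assuming events have been copied out of the log as (code, severity) pairs; the event codes are placeholders, and only the severities listed above are modeled.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severities listed above; a lower value means more urgent."""

    CRITICAL = 0  # correct the problem immediately
    ERROR = 1     # correct the problem as soon as possible
    WARNING = 2   # evaluate the problem and correct it if necessary


def triage(events):
    """Sort (code, severity_name) pairs so the most urgent come first.

    Raises KeyError for a severity name that is not modeled above.
    """
    return sorted(events, key=lambda event: Severity[event[1].upper()])


# Hypothetical event codes, used only to show the ordering.
logged = [(101, "Warning"), (102, "Critical"), (103, "Error")]
for code, severity in triage(logged):
    print(code, severity)
```

Because Python's sort is stable, events that share a severity keep the order in which they were logged.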