Dealing with hardware faults
Make sure that you have a replacement module of the same type before removing any faulty module. See “Module removal and
replacement” in the Dell EMC PowerVault ME4 Series Storage System Owner’s Manual.
NOTE: If the enclosure system is powered up and you remove any module, replace it immediately. If the system is used
with any modules missing for more than a few seconds, the enclosures can overheat, causing power failure and potential
data loss. Such action can invalidate the product warranty.
NOTE: Observe applicable/conventional ESD precautions when handling modules and components, as described in
. Avoid contact with midplane components, module connectors, leads, pins, and exposed circuitry.
Isolating a host-side connection fault
During normal operation, when a controller module host port is connected to a data host, the port host link status/link activity LED is
green. If there is I/O activity, the host activity LED blinks green. If data hosts are having trouble accessing the storage system, but you
cannot locate a specific fault or access the event logs, use the following procedures. These procedures require scheduled downtime.
NOTE: Do not perform more than one step at a time. Changing more than one variable at a time can complicate the
troubleshooting process.
Host-side connection troubleshooting featuring CNC ports
The following procedure applies to controller enclosures with small form factor pluggable (SFP+) transceiver connectors in 8/16 Gb/s FC
or 10 GbE iSCSI host interface ports.
In this procedure, “SFP+ transceiver and host cable” is used to refer to any qualified SFP+ transceiver supporting CNC ports
used for I/O or replication.
NOTE: When experiencing difficulty diagnosing performance problems, consider swapping out one SFP+ transceiver at
a time to see if performance improves.
1. Stop all I/O to the storage system. See “Stopping I/O” in the Dell EMC PowerVault ME4 Series Storage System Owner’s Manual.
2. Check the host link status/link activity LED.
If there is activity, stop all applications that access the storage system.
3. Check the Cache Status LED to verify that the controller cached data is flushed to the disk drives.
• Solid – Cache contains data yet to be written to the disk.
• Blinking – Cache data is being written to CompactFlash in the controller module.
• Flashing at 1/10 second on and 9/10 second off – Cache is being refreshed by the supercapacitor.
• Off – Cache is clean (no unwritten data).
4. Remove the SFP+ transceiver and host cable and inspect for damage.
5. Reseat the SFP+ transceiver and host cable.
Is the host link status/link activity LED on?
• Yes – Monitor the status to ensure that there is no intermittent error present. If the fault occurs again, clean the connections to
ensure that a dirty connector is not interfering with the data path.
• No – Proceed to the next step.
6. Move the SFP+ transceiver and host cable to a port with a known good link status.
This step isolates the problem to the external data path (SFP+ transceiver, host cable, and host-side devices) or to the controller
module port.
Is the host link status/link activity LED on?
• Yes – You now know that the SFP+ transceiver, host cable, and host-side devices are functioning properly. Return the cable to
the original port. If the link status LED remains off, you have isolated the fault to the controller module port. Replace the controller
module.
• No – Proceed to the next step.
7. Swap the SFP+ transceiver with a known good one.
Is the host link status/link activity LED on?
• Yes – You have isolated the fault to the SFP+ transceiver. Replace the SFP+ transceiver.
• No – Proceed to the next step.
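The decision flow of steps 3 and 5 through 7 above can be sketched as a small Python model. This is an illustration only, not a Dell EMC tool or API: the names CacheLed, cache_is_flushed, isolate_fault, and the outcome strings are hypothetical, and the LED observations are supplied by the person performing the procedure. The case where the link comes back after the cable is returned to the original port is not described in the procedure, so the sketch reports it as not covered rather than guessing.

```python
from enum import Enum

# Outcome labels are hypothetical names chosen for this sketch.
MONITOR = "monitor the link; clean the connections if the fault recurs"
REPLACE_CONTROLLER = "replace the controller module"
REPLACE_SFP = "replace the SFP+ transceiver"
NEXT_STEP = "proceed to the next step"
NOT_COVERED = "outcome not described in the procedure"


class CacheLed(Enum):
    """Cache Status LED states from step 3."""
    SOLID = "cache contains data yet to be written to the disk"
    BLINKING = "cache data is being written to CompactFlash"
    SLOW_FLASH = "cache is being refreshed by the supercapacitor"
    OFF = "cache is clean (no unwritten data)"


def cache_is_flushed(led):
    # Only the Off state indicates that no unwritten data remains.
    return led is CacheLed.OFF


def isolate_fault(link_after_reseat, link_on_good_port,
                  link_back_on_original_port, link_with_good_sfp):
    """Map the observed link LED states (True = on) to a conclusion."""
    # Step 5: reseat the SFP+ transceiver and host cable.
    if link_after_reseat:
        return MONITOR
    # Step 6: move the transceiver and cable to a known good port.
    if link_on_good_port:
        # The external data path works; return the cable to the original port.
        if not link_back_on_original_port:
            return REPLACE_CONTROLLER
        return NOT_COVERED
    # Step 7: swap in a known good SFP+ transceiver.
    if link_with_good_sfp:
        return REPLACE_SFP
    return NEXT_STEP


if __name__ == "__main__":
    print(cache_is_flushed(CacheLed.OFF))
    print(isolate_fault(False, True, False, False))
```

Each argument corresponds to one observation, which matches the note about changing only one variable at a time: a later argument is consulted only when the earlier checks did not resolve the fault.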