After confirming that an ISB has failed, it is highly recommended that you immediately initiate
the remove redistribution process for the failed ISB. Doing so has two benefits:
• First, immediately upon initiation of the remove redistribution, all new writes to the Storage
Group have the full benefit of RAID-6 protection (dual-parity protection).
• Second, upon completion of the remove redistribution process, existing data in the Storage
Group is once again fully protected. Prior to completion, if another ISB were to fail, the
Storage Group would be in an unprotected state (though no data would be lost).
RAID-6 Storage Groups, Dual ISB Failure
A RAID-6 Storage Group with two failed ISBs is considered to be in an “unprotected state.”
In an unprotected state with no additional failures, read operations continue to function
normally, but at a lower bandwidth.
However, in an unprotected state, due to the distributed architecture of the ISIS file system
(optimized for real-time performance), it is possible under certain circumstances that the system
would not be able to correctly update the parity information when writing new data. As a result,
under these circumstances, the file system could return a failure status when writing. While the
percentage of write operations that fail is low, heavy workloads on the system would result in
enough write failures to disrupt operations.
This issue only applies when the Storage Group is in an unprotected state and the remove
redistribution process on the failed ISBs has not been initiated. Therefore, it is highly
recommended that the remove redistribution process be initiated immediately upon confirmation
of any ISB failure. This ensures immediate protection (RAID or mirroring) of new data being
written, and full protection of all stored data at the earliest possible time.
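The rules described above for a RAID-6 Storage Group can be summarized in a small model. The following Python sketch is illustrative only and is not part of Avid ISIS: the names Protection, StorageGroupStatus, existing_data_protection, and write_failures_possible are hypothetical, as is the label REDUCED for the single-failure case. It assumes that failed_isbs counts only failed ISBs whose remove redistribution has not yet completed.

```python
from dataclasses import dataclass
from enum import Enum


class Protection(Enum):
    # Hypothetical labels; only "unprotected" is a term used in this guide.
    FULL = "full"                # no outstanding ISB failures
    REDUCED = "reduced"          # one failed ISB; one more failure loses no data
    UNPROTECTED = "unprotected"  # two failed ISBs in a RAID-6 Storage Group


@dataclass
class StorageGroupStatus:
    # Failed ISBs whose remove redistribution has not yet completed (assumption).
    failed_isbs: int
    # True once the remove redistribution has been initiated for the failed ISBs.
    redistribution_started: bool


def existing_data_protection(status: StorageGroupStatus) -> Protection:
    """Classify protection of data already stored in a RAID-6 Storage Group."""
    if status.failed_isbs == 0:
        return Protection.FULL
    if status.failed_isbs == 1:
        return Protection.REDUCED
    if status.failed_isbs == 2:
        return Protection.UNPROTECTED
    # More than two failures (or a negative count) is outside this sketch.
    raise ValueError("this sketch models 0, 1, or 2 failed ISBs only")


def write_failures_possible(status: StorageGroupStatus) -> bool:
    """Write failures are a risk only when the group is unprotected and
    the remove redistribution has not been initiated."""
    unprotected = existing_data_protection(status) is Protection.UNPROTECTED
    return unprotected and not status.redistribution_started
```

For example, write_failures_possible(StorageGroupStatus(2, False)) evaluates to True, while the same call with redistribution_started set to True evaluates to False, which mirrors the recommendation to initiate the remove redistribution immediately.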
Automatic Redistribution on Disk Failure
Avid ISIS performs an automatic redistribution when a Disk Failure notification is received.
Storage Elements continuously monitor disk status and send a “Disk Failed” notification to the
System Director upon determination that a disk is not usable. The System Director then removes the Storage
Element from its associated Storage Group. The removal of the Storage Element from the
Storage Group initiates redistributions on all workspaces associated with that Storage Group.
The System Director then prevents the Storage Element that reported the disk failure from being
added to a Storage Group.
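The sequence above can be sketched as a simple event handler. This Python example is a conceptual model only and does not represent the actual System Director implementation or any Avid API: the class SystemDirector, its attributes, and the methods add_element and on_disk_failed are hypothetical names chosen for illustration, and the Storage Group, Storage Element, and workspace names in the usage note are made up.

```python
class SystemDirector:
    """Toy model of the automatic redistribution flow (hypothetical names)."""

    def __init__(self):
        self.group_members = {}     # Storage Group -> set of Storage Elements
        self.group_workspaces = {}  # Storage Group -> list of workspaces
        self.redistributing = set() # workspaces with a redistribution in progress
        self.blocked = set()        # Storage Elements that reported a failed disk

    def add_element(self, group, element):
        # An element that reported a disk failure cannot be added to any group.
        if element in self.blocked:
            raise ValueError(f"{element} reported a disk failure and cannot be added")
        self.group_members.setdefault(group, set()).add(element)

    def on_disk_failed(self, element):
        """Handle a "Disk Failed" notification from a Storage Element."""
        affected_group = None
        for group, members in self.group_members.items():
            if element in members:
                # 1. Remove the Storage Element from its Storage Group.
                members.remove(element)
                # 2. The removal initiates redistributions on every workspace
                #    associated with that Storage Group.
                self.redistributing.update(self.group_workspaces.get(group, []))
                affected_group = group
                break
        # 3. Prevent the element from being added to a Storage Group.
        self.blocked.add(element)
        return affected_group
```

As a usage illustration, if Storage Elements "SE-01" and "SE-02" belong to Storage Group "SG1" and "SE-01" reports a failed disk, calling on_disk_failed("SE-01") removes it from "SG1", marks the workspaces of "SG1" as redistributing, and causes any later add_element call for "SE-01" to raise an error.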