23.2 Power Loss or Crash Problems
Your OS, Raid subsystem and disks offer advanced features to ensure data security during regular operation.
But what happens on a sudden power loss or system crash? Are you protected against such problems?
The answer is: it depends. Problems can occur for the file that is currently being written, for the Raid or
filesystem structure, or for the data validity on a single disk or SSD.
File consistency
The first question is what happens if you save a Word document and power fails during the write.
This usually means that the document, or at least its newest state, is corrupt as it is only partly on disk.
One solution is local temp files that allow Word to revert to the last saved state. Another protection is a UPS
for your desktop, switch and storage server. Here you are mainly protected on the application level, for
example by Word. ZFS cannot help in this case.
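
The temp-file idea can be sketched in a few lines of Python. This is only an illustration of the general
"write a new file, then swap" pattern, not the mechanism Word actually uses; the function name atomic_save
is hypothetical, and the atomic swap assumes a POSIX-like filesystem.

```python
import os
import tempfile


def atomic_save(path, data: bytes) -> None:
    """Write data so that a crash leaves either the old or the new file (sketch)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new content to a temporary file in the same directory,
    # so the final rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to put the data on stable storage
        os.replace(tmp_path, path)  # swap old and new file in one step
    except BaseException:
        # On any failure remove the temporary file; the old file is untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If power fails before os.replace, the previous version of the file is still intact. The newest edits are
lost in any case, which is why this is a protection at the application level only.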
Raid and filesystem consistency
But what happens to the Raid or the filesystem on the storage system during such a crash?
The basic problem is that you cannot update data on all disks at the same time. If you write data, for example
to a Raid-1, the first disk is updated, then the next. A power outage can result in a situation where one disk is
updated but not the other, or where data is updated but not the metadata. The same happens with a Raid 5/6
where a Raid stripe is only partly written. The result can be a damaged Raid and/or a damaged filesystem
structure. This problem is called the write hole problem (http://www.raid-recovery-guide.com/raid5-write-hole.aspx).
A conventional Raid or filesystem cannot protect against such problems, as the Raid has no control over or
knowledge of data content, transaction groups or write atomicity to ensure that a data update MUST include
a metadata update or must be done on all disks of a Raid. The solution to this problem is CopyOnWrite, which
means that you do not update data structures in place but write every changed datablock to a new location.
The old state is then available for overwriting unless you preserve it with a snapshot. A write action on ZFS
that affects file structure or Raid consistency is either done completely or discarded completely on a crash.
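
The partly written Raid-5 stripe mentioned above can be illustrated with plain XOR parity. The following
Python sketch is a toy model with made-up block contents and hypothetical names, not the code of any real
Raid implementation. It shows why a stale parity block silently produces wrong data when a disk is rebuilt later.

```python
from functools import reduce


def parity(blocks):
    """XOR parity over equally sized blocks, as used by Raid-5 (toy model)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# A consistent stripe: two data blocks plus their parity block.
d0, d1 = b"AAAA", b"BBBB"
p = parity([d0, d1])

# Power fails after d0 was rewritten but before the parity block was updated.
d0_new = b"CCCC"
stale_p = p

# The stripe no longer matches its parity, but nothing reports this.
stripe_consistent = parity([d0_new, d1]) == stale_p

# Later the disk holding d1 fails and d1 is rebuilt from d0_new and the stale parity.
rebuilt_d1 = parity([d0_new, stale_p])
data_lost = rebuilt_d1 != d1
```

Here stripe_consistent ends up False and data_lost ends up True: the rebuild returns a block that was never
written, although only d0 was touched at the time of the crash.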
Result: The CopyOnWrite mechanism protects ZFS Software Raid and the ZFS filesystem against power loss
problems. This is crash insensitivity by design. Traditional filesystems require an offline chkdsk run (that
can last days on large arrays) after a crash to hopefully regain a consistent metadata structure. This is not
needed with ZFS, as these problems cannot occur. There is no chkdsk on ZFS. All you need is online scrubbing
to repair bitrot problems.
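
The "done completely or discarded completely" behaviour can be shown with a small toy model in Python. The
class CowStore and all its names are hypothetical and do not reflect the ZFS on-disk format; the sketch only
assumes that new blocks are written beside the old ones and that a single pointer switch acts as the commit.

```python
class CowStore:
    """Toy copy-on-write store: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}     # block id -> content, each block written once
        self.next_id = 0
        self.root = {}       # committed state: name -> block id
        self.snapshots = []  # older roots that keep their blocks referenced

    def _write_block(self, content):
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = content
        return block_id

    def update(self, changes, crash_before_commit=False):
        """Apply all changes as one transaction, or none of them."""
        new_root = dict(self.root)  # copy of the metadata
        for name, content in changes.items():
            new_root[name] = self._write_block(content)  # old block stays as it is
        if crash_before_commit:
            return  # simulated crash: new blocks are orphaned, root is untouched
        self.root = new_root  # the single switch is the commit point

    def snapshot(self):
        self.snapshots.append(dict(self.root))

    def read(self, name):
        return self.blocks[self.root[name]]
```

A simulated crash in the middle of an update leaves the last committed state readable:

```python
store = CowStore()
store.update({"data": "v1", "meta": "v1"})
store.update({"data": "v2", "meta": "v2"}, crash_before_commit=True)
state = (store.read("data"), store.read("meta"))  # still the old, consistent pair
```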
Transaction Consistency
ZFS metadata structures are always consistent. But what happens if your application writes transactions that
depend on each other, like financial transactions? Moving money means removing it from one account and
THEN adding it to another. Or what if you use ZFS as storage for a virtualisation environment? The data is
then a guest filesystem like ext4 or ntfs that is not CopyOnWrite, where a valid write must consist of a data
update followed by a metadata update. In both cases you need transaction safety under control of the database
application or the virtualized OS that writes the data. The key is that once the storage unit has confirmed a
commit, the data must be on stable storage and must not be lost in the disk writeback cache or the ZFS write
cache on a crash. One option would be to disable all write caching, but this is a bad idea. Caching is where
performance comes from when you connect a very fast unit (CPU/RAM) with a very slow unit (storage). ZFS
offers unique read and write caching options that you do not want to disable, as you would then fall back to
pure disk performance, which is a fraction of the overall storage performance with caching. You are in a
dilemma now: you need caches for performance, and you need every committed write to be on stable disk,
which means an uncached sync write.
ZFS can do both: process a fast combined write, where all small and slow random writes are cached for a few
seconds and then written as a single fast sequential write, and ensure that every single committed write is on disk.
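
From the application side, a committed write as described above is typically requested with fsync. The
following Python sketch shows the idea under the assumption of a POSIX-like OS; the function name
committed_append is hypothetical and the example is not taken from any specific database.

```python
import os


def committed_append(path, record: bytes) -> None:
    """Append a record and return only after the OS reports it as stable (sketch)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(record)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than requested
            view = view[written:]
        os.fsync(fd)  # block until the storage confirms the data is on stable storage
    finally:
        os.close(fd)
```

A database would call something like this for its transaction log before it confirms a commit to the
client. Whether the confirmation can be trusted then depends on the storage below honouring the sync
request, which is the behaviour discussed in this section.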