Copyright © Acronis, Inc., 2000-2010
Performed on a managed machine during backup. Acronis Backup & Recovery 10 Agent uses the
storage node to determine what data can be deduplicated, and does not transfer the data whose
duplicates are already present in the vault.
Deduplication at target
Performed in the vault after a backup is completed. The storage node analyses the vault's
archives and deduplicates data in the vault.
When creating a backup plan, you have the option to turn off deduplication at source for that plan.
This may lead to faster backups but a greater load on the network and storage node.
Deduplicating vault
A managed centralized vault where deduplication is enabled is called a deduplicating vault. When
you create a managed centralized vault, you can specify whether to enable deduplication in it. A
deduplicating vault cannot be created on a tape device.
Deduplication database
Acronis Backup & Recovery 10 Storage Node that manages a deduplicating vault maintains the
deduplication database, which contains the hash values of all items stored in the vault, except for
those that cannot be deduplicated, such as encrypted files.
The deduplication database is stored in the folder specified by Database path in the Create
centralized vault view when creating the vault. The deduplication database can be created only in
a local folder.
The size of the deduplication database is about one percent of the total size of archives in the vault.
In other words, each terabyte of new (non-duplicate) data adds about 10 GB to the database.
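As a rough illustration of this rule of thumb, the following sketch estimates the database size from the amount of unique data. The function name and constants are hypothetical, not part of the product, and the sketch assumes decimal units (1 TB = 1000 GB), which matches the figure of about 10 GB per terabyte.

```python
# Rule of thumb from the text: the database is about 1% of the unique data size.
DB_SIZE_PERCENT = 1
GB_PER_TB = 1000  # assumption: decimal units, so 1 TB of data -> about 10 GB


def estimate_dedup_db_size_gb(unique_data_tb: float) -> float:
    """Return the approximate deduplication database size in GB.

    unique_data_tb is the amount of new (non-duplicate) data in the vault, in TB.
    """
    if unique_data_tb < 0:
        raise ValueError("data size cannot be negative")
    return unique_data_tb * GB_PER_TB * DB_SIZE_PERCENT / 100


if __name__ == "__main__":
    for tb in (1, 2.5, 8):
        print(f"{tb} TB of unique data -> about {estimate_dedup_db_size_gb(tb):.0f} GB")
```

Such an estimate can help when sizing the local folder chosen for the database. It is only an approximation, since the one percent figure itself is approximate.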
If the database becomes corrupted or the storage node is lost, but the vault still retains its
archives and the service folder containing metadata, the new storage node rescans the vault and
re-creates the database.
2.12.6.2 How deduplication works
Deduplication at source
When performing a backup to a deduplicating vault, Acronis Backup & Recovery 10 Agent reads the
items being backed up (disk blocks for disk backup or files for file backup) and calculates a
fingerprint of each item. Such a fingerprint, often called a hash value, uniquely represents the
item's content within the vault.
Before sending the item to the vault, the agent queries the deduplication database to determine
whether the item's hash value is the same as that of an already stored item.
If so, the agent sends only the item's hash value; otherwise, it sends the item itself.
Some items, such as encrypted files or disk blocks of a non-standard size, cannot be deduplicated,
and the agent always transfers such items to the vault without calculating their hash values. For
more information about restrictions of file-level and disk-level deduplication, see Deduplication
restrictions (p. 67).
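
The source-side decision described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the Vault class, the backup_item function, and the use of SHA-256 are all assumptions made for the example, because the text does not name the hash algorithm or the agent's interfaces.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Vault:
    """Toy stand-in for a deduplicating vault and its deduplication database."""
    stored: dict = field(default_factory=dict)     # hash value -> item content
    raw_items: list = field(default_factory=list)  # items kept without deduplication

    def has_hash(self, digest: str) -> bool:
        """Simulate the agent's query to the deduplication database."""
        return digest in self.stored


def backup_item(vault: Vault, item: bytes, deduplicable: bool = True) -> str:
    """Back up one item and report what was sent: "item" or "hash"."""
    if not deduplicable:
        # For example, an encrypted file: sent as-is, no hash value is calculated.
        vault.raw_items.append(item)
        return "item"

    digest = hashlib.sha256(item).hexdigest()  # assumption: SHA-256 as the fingerprint
    if vault.has_hash(digest):
        # An identical item is already stored, so only the hash value is sent.
        return "hash"

    # New content: send the item itself and record its hash value.
    vault.stored[digest] = item
    return "item"


if __name__ == "__main__":
    vault = Vault()
    print(backup_item(vault, b"disk block A"))  # first copy: the item is sent
    print(backup_item(vault, b"disk block A"))  # duplicate: only the hash is sent
    print(backup_item(vault, b"encrypted file", deduplicable=False))
```

In this sketch the second backup of identical content transfers only the hash value, which is where the network saving of deduplication at source comes from. Items flagged as non-deduplicable bypass hashing entirely and are always transferred.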