Data Reduction Report
Data reduction, whether through compression or deduplication, trades processor, memory, and
disk I/O resources for disk storage space. For systems that support deduplication, the data reduction ratio
combines the space savings achieved through compression and deduplication.
The data reduction ratio is represented as follows:
Data Reduction Ratio = (size of data received by the system from all of its clients' backups) / (space used on the system by these backups)
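As a minimal sketch of the calculation above, the following Python snippet computes a data reduction ratio from two byte counts. The function name and the sample figures are illustrative assumptions, not values or interfaces taken from the product.

```python
def data_reduction_ratio(bytes_received: int, bytes_stored: int) -> float:
    """Data received from clients' backups divided by the space used to store them."""
    if bytes_stored <= 0:
        raise ValueError("bytes_stored must be positive")
    return bytes_received / bytes_stored


# Hypothetical figures: 10 TiB received from clients, 2.5 TiB used on the system.
TIB = 1024 ** 4
example_ratio = data_reduction_ratio(10 * TIB, int(2.5 * TIB))
print(f"Data reduction ratio: {example_ratio:.1f}:1")
```

A ratio of 4:1 in this example would mean that every 4 bytes received from clients occupy 1 byte on the system.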
Prior to Release 5, all data reduction was accomplished using compression. Starting with Release 5 and the introduction of Unitrends' Adaptive Deduplication, for platforms that support deduplication, data reduction is accomplished using both compression and deduplication. The deduplication ratio is a measure of the space saved by deduplication, and is represented by:
Deduplication Ratio = (size of data received by the system from all of its clients' backups that are deduplicated) / (space used on the system by these deduplicated backups)
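The difference between the two ratios is only which backups are counted. The sketch below illustrates this with a hypothetical `Backup` record; the field names, the per-backup attribution of stored space, and the sample numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Backup:
    received: int        # bytes received from the client (hypothetical field)
    stored: int          # bytes used on the system (hypothetical field)
    deduplicated: bool   # True once the backup has been deduplicated


def ratio(backups: Iterable[Backup]) -> Optional[float]:
    """Total bytes received divided by total bytes stored, or None if nothing is stored."""
    items = list(backups)
    stored = sum(b.stored for b in items)
    if stored == 0:
        return None
    return sum(b.received for b in items) / stored


def deduplication_ratio(backups: Iterable[Backup]) -> Optional[float]:
    """Same calculation, restricted to backups that have been deduplicated."""
    return ratio(b for b in backups if b.deduplicated)


# Made-up sample: one deduplicated backup and two that are only compressed.
sample = [
    Backup(received=100, stored=10, deduplicated=True),
    Backup(received=100, stored=50, deduplicated=False),
    Backup(received=20, stored=10, deduplicated=False),
]
print("data reduction:", ratio(sample))
print("deduplication:", deduplication_ratio(sample))
```

In this sample the deduplication ratio is higher than the overall data reduction ratio, because the overall figure also includes backups that have not yet been deduplicated.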
In the first phase of Unitrends' Adaptive Deduplication, file-based backups are eligible for deduplication. In subsequent phases, other backup types will be considered for deduplication.
The chart plots both the system's data reduction ratio and its deduplication ratio. The date on which the ratios were gathered is shown along the x-axis, while the data reduction ratio (the bars) and the deduplication ratio (the green line) are shown on the y-axes.
The data reduction and deduplication ratios vary widely depending on the environment. Contributing factors include:
- The type and amount of data being backed up, including how much data is being backed up with file-based backups and how much with application backups;
- The backup types and schedules for clients being protected (frequency of master and differential backups);
- The level of commonality among clients' operating systems;
- How often and how much the data changes (the lower, the better);
- The retention of the data (the longer, the better);
- The specifics of the data reduction algorithms being used.
Expect day-to-day variations in your system's data reduction and deduplication ratios, depending on the clients' operating systems and filesystem sizes, the backup types and schedules, and the amount of available system device space. Consider a typical schedule of weekend master backups followed by daily differential backups during the week. A client's latest master backup and its differentials are not considered for deduplication until another successful master backup has completed, so the system exhibits higher levels of deduplication after each master completes. In the chart, the deduplication and data reduction ratios rise after a master backup, trend downward as the differentials complete, and rise again once the next master backup has completed, because the prior master and its differentials are then deduplicated. Over time, as retention increases (e.g., more master backups on the system for each client), the deduplication ratio also rises.
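The weekly pattern described above can be reproduced with a toy simulation. This is not the product's algorithm: every constant below (backup sizes, compression ratio, deduplication factor) is a made-up assumption, chosen only to show why the ratio rises after each master backup and drifts downward as differentials accumulate.

```python
# Toy model only: every constant here is an illustrative assumption.
MASTER_BYTES = 100.0   # data received per master backup
DIFF_BYTES = 10.0      # data received per differential backup
COMPRESSION = 0.5      # stored/received for backups not yet deduplicated
DEDUP_FACTOR = 0.2     # further shrink once a backup set is deduplicated


def simulate(days: int) -> list:
    """Return the data reduction ratio at the end of each day.

    Day 0, 7, 14, ... run a master backup; every other day runs a differential.
    """
    received = 0.0
    dedup_stored = 0.0     # space used by older, deduplicated backup sets
    pending_stored = 0.0   # space used by the latest master and its differentials
    ratios = []
    for day in range(days):
        if day % 7 == 0:
            # A new master completes, so the prior set is now deduplicated.
            dedup_stored += pending_stored * DEDUP_FACTOR
            pending_stored = MASTER_BYTES * COMPRESSION
            received += MASTER_BYTES
        else:
            pending_stored += DIFF_BYTES * COMPRESSION
            received += DIFF_BYTES
        ratios.append(received / (dedup_stored + pending_stored))
    return ratios


for day, r in enumerate(simulate(15)):
    kind = "master" if day % 7 == 0 else "diff"
    print(f"day {day:2d} ({kind:6s}) data reduction ratio {r:.2f}:1")
```

In this model the ratio jumps on days 7 and 14, when the previous week's backups are deduplicated, and declines during the week as backups that are not yet deduplicated are added. The peak on day 14 is higher than on day 7, which mirrors the effect of longer retention.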