Data Reduction Report

Data reduction, whether by compression or deduplication, trades processor, memory, and disk I/O for disk storage space. For systems that support deduplication, the data reduction ratio combines the space savings achieved through compression and deduplication. The data reduction ratio is represented as follows:

   
Data Reduction Ratio = (the size of data that has been received by the system from all of its clients' backups) / (the space used on the system by these backups)


Prior to Release 5, all data reduction was accomplished using compression. Starting with Release 5 and the introduction of Unitrends' Adaptive Deduplication, for platforms that support deduplication, data reduction is accomplished using both compression and deduplication. The deduplication ratio is a measure of the space saved by deduplication, and is represented by:


   
Deduplication Ratio = (the size of data that has been received by the system from all of its clients' backups that are deduplicated) / (the space used on the system by these deduplicated backups)
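Both ratios share the same form: bytes received from clients divided by bytes actually stored. The following Python sketch illustrates the arithmetic only. The function name and the sample sizes are hypothetical and are not taken from the product or from any real system.

```python
def reduction_ratio(received_bytes: int, stored_bytes: int) -> float:
    """Return received / stored, the form shared by both ratios."""
    if stored_bytes <= 0:
        raise ValueError("stored_bytes must be positive")
    return received_bytes / stored_bytes


GB = 1024 ** 3

# Hypothetical sample figures, chosen only to make the arithmetic easy to follow.
all_received = 1200 * GB    # data received from all clients' backups
all_stored = 400 * GB       # space those backups occupy on the system
dedup_received = 900 * GB   # data received for the backups that are deduplicated
dedup_stored = 225 * GB     # space those deduplicated backups occupy

data_reduction_ratio = reduction_ratio(all_received, all_stored)
deduplication_ratio = reduction_ratio(dedup_received, dedup_stored)

print(f"Data reduction ratio: {data_reduction_ratio:.1f}:1")
print(f"Deduplication ratio:  {deduplication_ratio:.1f}:1")
```

With these sample figures, the data reduction ratio is 3.0:1 and the deduplication ratio is 4.0:1. The deduplication ratio covers only the backups that have been deduplicated, so it can differ from the overall data reduction ratio.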


In the first phase of Unitrends' Adaptive Deduplication, file-based backups are eligible for deduplication. In subsequent phases, other backup types will be considered for deduplication.

The chart plots the system's data reduction and deduplication ratios over time. The date on which the ratios were gathered is shown along the x-axis, while the data reduction ratio (the bars) and the deduplication ratio (the green line) are shown on the y-axes.

The data reduction and deduplication ratios vary widely based on the environment.


From day to day, expect your system's data reduction and deduplication ratios to vary with the clients' operating systems and filesystem sizes, the backup types and schedules, and the amount of available system device space. A typical schedule of weekend Master backups followed by daily Differential backups during the week causes the system to show higher levels of deduplication after the Masters complete, because a client's latest Master backup and its Differentials are not considered for deduplication until another successful Master backup has completed. In the chart, this appears as deduplication and data reduction ratios that rise after a Master backup, trend downward as the Differentials complete, and then rise again after the next Master backup completes, because the prior Master and its Differentials are then deduplicated. Over time, as retention increases (e.g., more Master backups are kept on the system for each client), the deduplication ratio also rises.