Data deduplication on Windows Server 2022
Double Trouble
Deduplication Process
Technically speaking, the deduplication function analyzes the data blocks on a volume and searches for duplicates. As soon as identical data blocks are found, the system keeps only one copy and replaces every other occurrence with a link to that block. This process is executed by a background service that runs regularly to check for new and modified files.
Deduplication on Windows Server 2022 uses a postprocess approach (i.e., the data is first saved in its original form and then retroactively deduplicated). This approach minimizes the effect on system performance during primary storage operations. For efficient data processing, deduplication relies on a chunking algorithm that breaks the data down into smaller units and then analyzes them individually.
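To make the chunk-and-link idea concrete, the following Python sketch splits files into chunks, stores each unique chunk once, and keeps a list of references per file. It is a simplified illustration, not the Windows Server implementation: the tiny fixed chunk size, the use of SHA-256 digests, and the names deduplicate, rebuild, chunk_store, and recipes are all assumptions made for this example.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed chunk size, chosen only to keep the example readable


def deduplicate(files, chunk_size=CHUNK_SIZE):
    """Return (chunk_store, recipes) for a dict mapping file name -> bytes."""
    chunk_store = {}  # digest -> chunk bytes; each unique chunk is stored once
    recipes = {}      # file name -> ordered list of digests (the "links")
    for name, data in files.items():
        digests = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            chunk_store.setdefault(digest, chunk)  # keep only the first copy
            digests.append(digest)
        recipes[name] = digests
    return chunk_store, recipes


def rebuild(name, chunk_store, recipes):
    """Reassemble a file from its ordered chunk references."""
    return b"".join(chunk_store[d] for d in recipes[name])


# Two hypothetical files that share their first eight bytes.
files = {
    "a.txt": b"AAAABBBBCCCC",
    "b.txt": b"AAAABBBBDDDD",
}
store, recipes = deduplicate(files)
```

In this example the two files reference six chunks in total, but the store holds only the four distinct ones. Because the work happens after the files already exist, the sketch also mirrors the postprocess idea: nothing slows down the original write.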
Data integrity is a key aspect of deduplication. Windows Server 2022 uses various mechanisms, including checksums and integrity checks, to ensure that the deduplicated data is not corrupted. Deduplication relies on metadata to map each file to its deduplicated blocks. Backup and restore operations therefore require additional care, because without intact metadata the original data cannot be reconstructed correctly.
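The role of a checksum can be sketched in a few lines of Python. The article does not say which checksum Windows Server uses, so CRC32 is an assumption here, and store_chunk and read_chunk are hypothetical helper names. The point is only that a checksum saved alongside a chunk lets a later read detect silent corruption.

```python
import zlib


def store_chunk(chunk):
    """Save a chunk together with a CRC32 checksum (assumed algorithm)."""
    return {"data": chunk, "crc": zlib.crc32(chunk)}


def read_chunk(record):
    """Return the chunk data, raising if the checksum no longer matches."""
    if zlib.crc32(record["data"]) != record["crc"]:
        raise ValueError("chunk checksum mismatch: data is corrupted")
    return record["data"]


record = store_chunk(b"shared block")
intact = read_chunk(record)  # passes the check

# Simulate corruption: one character changed, stored checksum unchanged.
damaged = {"data": b"shared blocK", "crc": record["crc"]}
try:
    read_chunk(damaged)
    corruption_detected = False
except ValueError:
    corruption_detected = True
```

Because many files can point to the same chunk, a single damaged chunk would affect all of them, which is why such checks matter more on a deduplicated volume than on an ordinary one.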
Installation Two Ways
You can install data deduplication in Server Manager by adding the Data Deduplication role service under File and Storage Services | File and iSCSI Services (Figure 1). Alternatively, you can run the following command in PowerShell:
Install-WindowsFeature -Name FS-Data-Deduplication
Installing deduplication does not start the process; it simply adds the required system files. You then need to enable and configure deduplication in Server Manager or PowerShell.
Testing Volumes
When you install the data deduplication role service, the installation wizard also installs the ddpeval.exe command-line tool. You can use it at the command line to search for duplicate data (Figure 2). Doing so will tell you whether deduplication can be meaningfully applied to the individual volumes on the server. You cannot enable data deduplication on boot drives, nor can you use ddpeval to check whether data deduplication on boot drives makes sense.
The tool resides in the \Windows\System32 directory and supports both local drives and network shares. The syntax of the tool is ddpeval <Volume:>, as in:
ddpeval E:\
ddpeval \\nas\data
The ddpeval tool itself does not clean up the files or modify any data; it simply tells you whether data deduplication makes sense for the drive in question and offers a preview of the possible savings. For a more targeted analysis, pass the path of a specific directory instead of a volume:
ddpeval.exe D:\Data\Projects
The output from ddpeval contains the total size of the analyzed data, the estimated size after deduplication, and the savings as a percentage. This information helps you evaluate the potential benefits and decide which volumes or directories are best suited for deduplication. The following command lets you save the results:
DDPEval.exe d:\wsus /v /o:C:\temp\dedup.txt
This syntax gives you a comprehensive report on the potential storage space savings that you can achieve by introducing data deduplication.
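The relationship between the figures in such a report is simple arithmetic. The Python sketch below uses the usual definition of savings as the share of the original size that deduplication removes. The function name savings_percent and the sizes are made up for illustration, and the exact formula and output layout of ddpeval are not shown here.

```python
def savings_percent(original_bytes, deduplicated_bytes):
    """Return the share of the original size saved, as a percentage."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    return (original_bytes - deduplicated_bytes) / original_bytes * 100


# Hypothetical figures: 500GB of data estimated to shrink to 320GB.
GIB = 1024 ** 3
original = 500 * GIB
estimated = 320 * GIB
saved = savings_percent(original, estimated)
print(f"Estimated savings: {saved:.1f}%")
```

Comparing this percentage across volumes or directories gives you a quick way to rank candidates before you enable deduplication anywhere.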