Intelligent deduplication addresses issues that are coming to the forefront now that organizations have mastered the more straightforward dedupe processes. In the coming year, IT leaders should look for deduplication capabilities that can detect and report on data types. Because different data types behave differently, you will need to apply different policy options: inline deduplication, post-process (or concurrent) deduplication, and not deduplicating at all.
The first policy, inline deduplication, makes the most sense for small storage configurations or environments with immediate replication needs. It minimizes storage requirements and can deduplicate and replicate data more quickly. The second policy, post-process deduplication, runs independently of the backup stream and can be scheduled for any point in time, including concurrently with ingest. Postponing deduplication can make transfers to physical tape and frequent restore activities more efficient, and it lets the solution make full use of available processing power while minimizing the impact on the incoming data stream. Concurrent operation is geared toward multi-node clustered solutions, where it can draw on all available computing resources. Finally, some data types simply do not deduplicate effectively and should be excluded from deduplication policies, including image data and pre-compressed or encrypted data.
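To make the policy distinctions concrete, here is a minimal sketch of inline deduplication: each chunk is hashed as it arrives and only unseen chunks are stored, while a simple type check excludes files that rarely deduplicate well. The extension list, function names and dictionary-backed store are all illustrative assumptions, not features of any particular product.

```python
import hashlib

# Hypothetical skip list: pre-compressed formats that rarely dedupe well
COMPRESSED_EXTS = {".jpg", ".png", ".zip", ".gz", ".mp4"}

def should_dedupe(filename):
    """Policy check: exclude data types that do not deduplicate effectively."""
    return not any(filename.lower().endswith(ext) for ext in COMPRESSED_EXTS)

def inline_dedupe(chunks, store):
    """Hash each chunk as it arrives; keep only the first copy of each.

    Returns a manifest of digests so the original stream can be rebuilt.
    """
    manifest = []
    for chunk in chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            store[digest] = chunk   # first occurrence: store the bytes
        manifest.append(digest)     # duplicates cost only a reference
    return manifest

# Three incoming chunks, one of them a duplicate
store = {}
manifest = inline_dedupe([b"block-A", b"block-B", b"block-A"], store)
```

A post-process variant would land all chunks on disk first and run the same hashing pass later on a schedule, trading temporary capacity for zero impact on the incoming stream.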
Beware of the all-in-one approach
It can be tempting for enterprises and even small to midsize businesses to buy all-in-one storage or backup software solutions. Deduplication is just a commodity after all, right? Not so fast. Not all deduplication solutions deliver at the same level, and deduplication is not a solution that can be tacked onto an appliance or patched into software. When enterprises attempt to deploy such patchwork solutions, they often find limitations related to performance, scalability and reliability.
The requirements of individual IT shops still matter when it comes to dedupe. For example, solutions built for large enterprises with massive amounts of data and rich, heterogeneous environments must offer high availability, data protection failover capabilities, scalability and large data repositories. Some of these deployments must accommodate multiple data centers and remote offices, as well as consolidated data protection and disaster recovery resources cobbled together through mergers and acquisitions. In these scenarios, deduplication is about simplifying processes and reducing costs.