November 16, 2009, 4:01 PM - The challenges of information growth are well documented today. In fact, we are at a point where information costs are automatically factored into all areas of a typical IT budget. Copies of all types of information are everywhere. While organizations express the desire to "do more with less" and maximize efficiencies, storage requirements continue to grow across the data center and within remote offices. From primary storage for applications and virtual machines to the storage required for backup and disaster recovery, fragmented islands of information develop, often each with its own separate tools, operations and service requirements.
Needless to say, this is creating a fundamental shift in the way organizations manage information. As organizations realize that yesterday’s tape-based approach is proving too cumbersome in today’s world, more companies are turning to disk and deduplication technologies in order to facilitate faster backups, reduce primary storage, and replace tape shipping for better disaster recovery.
In the real world, the adoption of disk and deduplication has been evolutionary, but not necessarily revolutionary. From virtual tape libraries and disk staging to target-based dedupe appliances, many organizations have learned that adding disk and deduplication to the current tape backup environment can greatly improve the overall performance for the top 10 percent of their tier-one workloads. But how should organizations handle the bigger challenges of information management across the vast amounts of structured and unstructured data that exist across their IT infrastructure?
When deduplication is applied only at the tail end of the process, multiple copies of the exact same information will continue to exist and grow on primary storage (e.g., email and file servers). This storage problem causes serious capacity management challenges anywhere large amounts of information exist. The bottom line is, if it takes up space, it costs money.
Deduplication Everywhere
Instead of applying deduplication in one place, users should consider deduplication everywhere. In order to maximize efficiency, consider the possibility of leveraging deduplication in an integrated fashion across backup and archiving applications – as well as storage. By deduplicating data right at the source of creation, organizations can focus on tackling the real problem of data reduction everywhere information resides.
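To make the idea of deduplicating at the source more concrete, here is a minimal sketch in Python. It is not any vendor's implementation: it assumes fixed-size chunks and SHA-256 fingerprints, and the names `CHUNK_SIZE`, `dedupe_chunks` and `restore` are hypothetical. Commercial products often use variable-size chunking and more elaborate indexes.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes (an assumption)


def dedupe_chunks(data: bytes, store: Dict[bytes, bytes],
                  chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Split data into fixed-size chunks and keep each unique chunk once.

    Returns a "recipe": the ordered list of chunk fingerprints needed
    to rebuild the original data from the shared store.
    """
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).digest()
        # Only chunks not seen before consume space (or network bandwidth).
        if fingerprint not in store:
            store[fingerprint] = chunk
        recipe.append(fingerprint)
    return recipe


def restore(recipe: List[bytes], store: Dict[bytes, bytes]) -> bytes:
    """Rebuild the original data by looking up each fingerprint in order."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


# Two files that share most of their content, as copies on a file server might.
store: Dict[bytes, bytes] = {}
file_a = b"A" * 8192 + b"B" * 4096
file_b = b"A" * 8192 + b"C" * 4096
recipe_a = dedupe_chunks(file_a, store)
recipe_b = dedupe_chunks(file_b, store)
stored_bytes = sum(len(chunk) for chunk in store.values())
```

In this toy run the two files total 24,576 bytes, but only three unique 4,096-byte chunks are stored. Because the fingerprint check happens before anything is written or sent, the same logic applied at a remote server would avoid moving the repeated chunks at all.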
For example, one commercial construction company faced the growing challenge of how to manage business-critical information on 180 servers located at 45 remote sites around the United States. Backups conducted in remote locations were typically 100GB to 200GB per server, and it took nearly 12 hours for each server to be replicated back to a central data center. By moving to disk-based deduplication across its remote offices, the company saw replicated data fall to 10GB to 15GB and the backup window shrank, as it was able to reduce the amount of redundant data being moved by over 90 percent.
The question is no longer whether or not to dedupe, but how deduplication should be deployed and what is best for the IT environment.
In reality, deduplication is not about a single technology; it's about selecting the right approach. This is why a number of storage vendors are working together in order to give organizations much greater flexibility in their dedupe options. By providing an interface between archiving and backup software and advanced disk-based storage appliances, organizations can now leverage an integrated platform that not only supports both source and target-based dedupe, but is also easy to manage.
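The arithmetic behind that reduction can be checked with a few lines of Python. This is a rough sketch using only the figures quoted above; the function name `reduction_percent` is hypothetical, and pairing the low end with the low end (and high with high) is an assumption, since the per-server numbers are not given.

```python
def reduction_percent(before_gb: float, after_gb: float) -> float:
    """Percentage of data no longer replicated after deduplication."""
    return 100 * (before_gb - after_gb) / before_gb


# Assumed pairings of the quoted ranges (100GB-200GB before, 10GB-15GB after).
low_end = reduction_percent(100, 10)    # smaller servers
high_end = reduction_percent(200, 15)   # larger servers

print(f"100GB -> 10GB: {low_end:.1f}% less data replicated")
print(f"200GB -> 15GB: {high_end:.1f}% less data replicated")
```

Under those assumed pairings, the reduction works out to 90 percent and 92.5 percent, which is consistent with the figure of roughly 90 percent or more.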
Integrated deduplication within backup and archiving applications can provide unified data protection from the remote office or branch office to the data center, across disk and tape and across physical and virtual environments. These next-generation software solutions offer comprehensive protection and a single console for managing all backup and recovery operations. Centralized management provides integrated data archiving, migration, and retention capabilities that address regulations for governance and compliance. Additionally, advanced reporting on backup and recovery operations enables service-level management of all protected data in the enterprise.
Organizations often choose this approach because deduplication is integrated into their existing backup and recovery application, giving them the ability to build a more customized solution.













