Deduplication is great for the backup tier. Whether you implement it in your backup software or in an appliance such as a virtual tape library, you can potentially keep months of backups in a near-line state ready to restore at a moment's notice. That's a better deal than having to dig for tape every time you have a restore that's more than a day or two old.
Like most great ideas, however, deduplication has its drawbacks. Chief among these is that deduplication is computationally expensive. It should come as no surprise that NetApp, one of the few major SAN vendors to offer deduplication on primary storage, is also one of the few to offer controller hardware performance upgrades through its Performance Acceleration Modules. Identifying and consolidating duplicated blocks on storage requires a lot of controller resources. In other words, saving capacity comes at a performance price.
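To see where that controller work comes from, consider a minimal sketch of fixed-block deduplication. It is an illustration only, not how NetApp or any particular product implements it: the 4KB block size, the use of SHA-256, and the `dedupe` and `restore` function names are all assumptions made for the example. The point is that every block written has to be hashed and looked up before the system knows whether it is a duplicate.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, chosen only for illustration


def dedupe(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep one copy of each unique block.

    Returns (store, recipe): store maps a block's SHA-256 digest to its
    contents, and recipe lists the digests in original order.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Every block is hashed, duplicate or not; this is the CPU cost.
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # store only the first copy
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the unique blocks and the recipe."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    # Three identical blocks followed by one different block.
    data = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
    store, recipe = dedupe(data)
    print(len(recipe), "blocks referenced,", len(store), "stored")
    # prints: 4 blocks referenced, 2 stored
```

Here four blocks of input shrink to two stored blocks, but all four had to be read and hashed to get there. Real implementations add variable-length chunking, on-disk indexes, and collision handling, all of which add to the load.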
Performance tip No. 10: Accelerate your backups
Backups are almost always slower than you'd like them to be, and troubleshooting backup performance problems is often more art than science. But there is one common problem that nearly every backup administrator faces at some point or another.
If you are backing up direct to tape, it's likely you're underfeeding your tape drives. The current generation of LTO4 tape drives (soon to be supplanted by LTO5) is theoretically capable of more than 120MBps of data write throughput, but few ever see that in real life. Mostly this is because very few backup sources can sustain read rates that match the tape drive's write performance. For example, a backup source consisting of a pair of SAS disks in a RAID1 array may be capable of raw throughput well beyond 120MBps in a lab environment, but for standard Windows-based file copies over a network, you'll rarely see rates greater than 60MBps. Many tape drives become significantly less efficient when their buffers run empty, because the drive has to stop, reposition the tape, and start again each time it runs out of data. This is the root cause of most backup performance problems.
In other words, the problem isn't your tape drive; it's the storage in the servers you're backing up. There may not be a great deal you can do about this without investing heavily in a large, high-performance intermediate disk-to-disk backup solution, but you have more options if you have a SAN. Depending on the kind of SAN you have and what backup software you run, using host backups -- which read directly from the SAN rather than over the network -- can be a great solution to this particularly vexing problem.
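To put the 120MBps and 60MBps figures in perspective, a rough back-of-the-envelope calculation helps. The sketch below assumes a hypothetical 1TB backup set, treats 1GB as 1,000MB, and ignores compression and the extra penalty a drive pays when its buffer runs dry; the `backup_hours` function name is invented for the example.

```python
def backup_hours(data_gb: float, rate_mbps: float) -> float:
    """Hours needed to move data_gb gigabytes at rate_mbps megabytes per second.

    Assumes 1 GB = 1,000 MB and a perfectly sustained transfer rate.
    """
    seconds = data_gb * 1000 / rate_mbps
    return seconds / 3600


if __name__ == "__main__":
    backup_set_gb = 1000  # hypothetical 1TB backup set
    for label, rate in (("tape drive rated speed", 120),
                        ("typical network file copy", 60)):
        hours = backup_hours(backup_set_gb, rate)
        print(f"{label}: {rate}MBps -> {hours:.1f} hours")
```

Under those assumptions the job takes roughly 2.3 hours at the drive's rated speed and about 4.6 hours at 60MBps, so a source that delivers half the rate doubles the backup window before any repositioning delays are counted.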
Matt Prigge is contributing editor to the InfoWorld Test Center, and the systems and network architect for the SymQuest Group, a consultancy based in Burlington, Vt.