WHEN DISASTER STRIKES: After a hack: The process of restoring once-lost data
The new cloud recovery model, however, brings new challenges. One of the biggest is bandwidth, especially for customers that have highly transactional apps, Dines says. DRaaS involves placing copies of applications and virtual machine images in the provider's cloud. When a disaster strikes, those apps and VMs are automatically brought out of storage and spun up. This lets users avoid paying for reserved VM instances or for dedicated infrastructure in a managed service model. The problem is that those apps and VM images must be constantly updated. "You need to keep your DR site up to date, and that could mean moving large amounts of data daily," Dines says.
Apps with high rates of change, anything above 20% daily, may be uploading a lot of data to the cloud each day, she says, which could strain bandwidth. Because of this, Dines says the most common apps supported by DRaaS tend to be less complex ones that can be easily booted cold from a VM, and especially ones that already run on virtual machines. CRM, ERP and HCM apps may fit the bill, but highly transactional databases may not.
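To make the bandwidth concern concrete, the arithmetic can be sketched as follows. This is a rough back-of-the-envelope illustration, not a method described by Dines or any provider: the function names, the 2,000 GB dataset size and the 24-hour upload window are all assumptions chosen for the example, and it ignores compression, deduplication and protocol overhead.

```python
def daily_upload_gb(dataset_gb: float, daily_change_rate: float) -> float:
    """Data that must reach the DR site each day, in gigabytes.

    daily_change_rate is a fraction, e.g. 0.20 for the 20% threshold
    mentioned in the article.
    """
    return dataset_gb * daily_change_rate


def required_mbps(upload_gb: float, window_hours: float) -> float:
    """Sustained uplink needed to move upload_gb within window_hours.

    Uses decimal units (1 GB = 10**9 bytes, 1 Mbps = 10**6 bits/s).
    """
    bits = upload_gb * 8 * 10**9
    seconds = window_hours * 3600
    return bits / seconds / 10**6


# Hypothetical workload: 2,000 GB of protected data changing 20% per day.
upload = daily_upload_gb(2000, 0.20)   # 400 GB per day
uplink = required_mbps(upload, 24)     # spread evenly over 24 hours
print(f"{upload:.0f} GB/day needs about {uplink:.1f} Mbps sustained")
```

Under these assumed numbers the app would need roughly 37 Mbps of uplink around the clock just for replication, which shows why high-change workloads are a harder fit than apps whose images rarely change.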
To deal with the bandwidth issue, many providers partner with a WAN optimization firm or use forms of caching to send only selective updates to the cloud, Dines says.
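One generic way to send only selective updates is block-level change detection: hash fixed-size blocks of an image and upload only the blocks whose hashes differ from the last copy. The sketch below illustrates that general idea only; the article does not say which technique any provider uses, and the function names and the tiny 4-byte block size are assumptions made for readability.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, chosen so the example is easy to read


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return the SHA-256 hex digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Indexes of blocks in `new` that differ from `old` or did not exist in it.

    Only these blocks would need to be uploaded. Truncation (new shorter
    than old) is not handled in this sketch.
    """
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [
        idx for idx, digest in enumerate(new_h)
        if idx >= len(old_h) or old_h[idx] != digest
    ]


# Hypothetical image contents before and after a day of changes.
yesterday = b"AAAABBBBCCCC"
today = b"AAAAXXXXCCCCDDDD"
print(changed_blocks(yesterday, today))  # [1, 3]
```

Here only two of the four blocks would cross the WAN, which is the kind of saving that makes daily synchronization more practical for larger images.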
Richard Cocchiara is CTO for business continuity and resiliency services (BCRS) at IBM, one of the legacy vendors that is pivoting to offer cloud-based DR services. He says the company works with customers to estimate how much data upload and download will be needed as part of the vetting and customer acquisition process.
Another concern in the cloud is providers oversubscribing their facilities. In a DRaaS model, providers typically support customers in a multi-tenant environment. While a provider may be able to accommodate a disaster that befalls one customer, can it adequately support multiple customers if the disaster is regional? Morency says DRaaS providers just haven't proven at scale that they can survive a major 9/11- or Katrina-like disaster. It's one thing to restore 15 to 20 VMs during a test; it's another to have hundreds of customers all declaring a disaster at the same time and expecting a two-hour recovery window, he says. "That can be tight for a single organization, let alone a provider serving hundreds of customers at once," Morency says.
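The scaling worry can be illustrated with a simple capacity calculation. This is a deliberately crude sketch: the restore rate of 50 VMs per hour and the count of 200 customers are hypothetical figures, not numbers from Morency or any provider, and the model assumes restores are limited only by a fixed shared throughput.

```python
def recovery_hours(customers: int, vms_per_customer: int,
                   vms_restored_per_hour: float) -> float:
    """Hours to restore every VM if all customers declare at once.

    Assumes a single shared restore pipeline with a fixed throughput.
    """
    total_vms = customers * vms_per_customer
    return total_vms / vms_restored_per_hour


def meets_window(hours_needed: float, window_hours: float = 2.0) -> bool:
    """True if the restore finishes inside the promised recovery window."""
    return hours_needed <= window_hours


RATE = 50.0  # hypothetical provider throughput, VMs restored per hour

single = recovery_hours(1, 20, RATE)      # one customer, 20 VMs
regional = recovery_hours(200, 20, RATE)  # 200 customers declare together

print(f"single customer: {single:.1f} h, within window: {meets_window(single)}")
print(f"regional event:  {regional:.1f} h, within window: {meets_window(regional)}")
```

With these assumed figures a single 20-VM test finishes in well under an hour, while a regional event would take 80 hours against a two-hour target, which is the gap between a successful test and proven performance at scale.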