Data center disaster recovery is top of mind for astute IT and data center managers. But a recent survey of data center managers by AFCOM, an association of data center management professionals with more than 4,500 members worldwide, turned up some sobering stats.
The findings prompted a paper, authored by the Data Center Institute, designed to help organizations improve their disaster recovery and business continuity initiatives.
First, the sobering findings. According to results from the AFCOM survey, more than 15 percent of data centers have no plan for business continuity or disaster recovery. Equally disturbing, 50 percent of data centers have no formal plan for replacing damaged equipment after a disaster. And two-thirds of all data centers have no plan or procedures to deal with cybercrime.
I don’t have details on who the survey respondents were, so I find the results hard to put in context. But they are surprising, nonetheless.
The subsequent report, "How to Stay in Business: A Data Center Institute Report on Disaster Recovery & Business Continuity," was released during the 2011 Fall Data Center World conference, held in Orlando, Fla., earlier this month. The paper outlines the key aspects that a Business Continuity Plan (BCP) and Disaster Recovery Plan (DRP) need to address in order to cover the range of risks and events that could affect the services and infrastructure for which data center managers are responsible.
Regardless of the event – a natural disaster, cyber-attack, or something else – data center managers must be sure that the IT organization can deliver and maintain uninterrupted service and support core business functions. The paper discusses disaster recovery, and highlights the need for such plans to describe the “steps, processes and disciplines to recover from a disaster.”
The paper also breaks out the key characteristics of a sound BCP. The plan should cover the core electrical plant, including backup diesel engines, bulk fuel storage, UPS, batteries, transformers, switchboards, and more; it should include the core mechanical plant such as chillers, condensers, and pumps. The BCP should also cover the technical area electrical systems and mechanical systems, security systems, and even fire suppression and detection.
There’s also a section discussing the different approaches to creating and maintaining redundancy. The most expensive, and most reliable, is of course having two separate sites fully duplicated (all processing and data). The lowest cost, according to the paper, is operating a single data center with local backup and recovery systems. Risks can be mitigated in this scenario with various capabilities built into the data center.
The report also addresses cloud computing and its role in disaster recovery. For example, network latency is a critical factor to consider if leveraging public cloud for data backup. And among the key steps to take if using the cloud for disaster recovery: prioritize your applications and route network traffic.
Application prioritization typically is done to determine which apps have priority to available CPU cycles – a step that may seem unnecessary in a cloud environment with on-demand resources. But in this context, it is recommended that organizations consider how they will prioritize application deployment to ensure the most critical systems are up and running first. And as for network routing and capacity, these are considerably more complex in a cloud environment, but they are just as important. Organizations need to regularly monitor network usage and performance, and look for tools that can do so in all cloud environments: private, public and hybrid.
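To make the deployment-priority idea concrete, here is a minimal Python sketch of how a team might write down the order in which applications get restored. It is not taken from the report: the `App` class, the numeric `tier` scale, and the sample inventory are all hypothetical, and a real plan would also have to account for dependencies between systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    tier: int  # hypothetical criticality tier: 1 = restore first


def recovery_order(apps):
    """Return app names sorted by tier, with ties broken alphabetically."""
    return [a.name for a in sorted(apps, key=lambda a: (a.tier, a.name))]


# Made-up inventory, for illustration only.
inventory = [
    App("reporting", 3),
    App("order-processing", 1),
    App("email", 2),
    App("customer-database", 1),
]

print(recovery_order(inventory))
```

The point of keeping the list explicit is that the restore order is decided ahead of time, rather than worked out in the middle of an outage.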