From: www.itworld.com

Averting disaster

by Robert L. Scheier

March 19, 2001

 

So your organization has gone global, with mission-critical applications spanning time zones and national borders. You're more extended -- and more vulnerable -- relying not only on the glass house down the hall but also on an Internet service provider in Guatemala or a telecommunications company in Kazakhstan to get your fancy Web-enabled applications to customers and suppliers.

How do you protect these far-flung systems against natural or man-made disasters? With a mix of centrally developed recovery processes, enough flexibility to account for local differences in culture and infrastructure, and the clout of upper management to ensure that it all gets done, say IT managers and disaster planning professionals.

Multinational companies have been running global applications for decades, of course. But in the past, those applications were often hosted on tightly controlled internal computer systems and accessed over expensive but reliable private networks, and they could tolerate an occasional 24-hour outage. Today's global applications are often a mishmash of custom and off-the-shelf software running across the less-reliable Web, and because they're important, they must be brought back up within hours or even minutes -- not days -- after a crash.

Global systems often involve not only multiple locations or divisions of a company but also systems controlled by suppliers or customers.

"We have more than 300 [e-commerce] initiatives in our organization," says Julia Graham, group risk manager at London-based Royal & Sun Alliance Insurance Group PLC. "With a Web-based business, you could have many joint venture partners and suppliers, and the plan becomes a matrix of different recovery needs based on the potential scenarios that might arise."

Different regions of the world differ widely in the quality of physical recovery sites and the quality of staff at those sites, say IT managers. And because these applications support vital business functions, they must often be brought back up immediately.

"It's not fun," says Jay Leader, director of application development at Nypro Inc. in Clinton, Mass., a plastics molding company that operates 75 servers and has 4,000 users around the world. "It's hard enough . . . to do domestically, when everyone speaks the same language and is in the same time zone," he says, but it's even harder "to try to coordinate an [IT] vendor in Singapore and a vendor from China."

The first step should be for business managers -- ideally at the local business units, to ensure their buy-in -- to decide which applications are most in need of protection and how much protection they're worth. This is the point at which the critical but touchy issue of who will pay for this "application insurance" should be tackled but often isn't, says Gerard Minnich, a global business continuity program manager at Electronic Data Systems Corp. in Plano, Texas.

"Typically, where programs fail is at the [funding] level," he says, especially at a local business unit. Along with a corporate edict to provide disaster recovery, says Minnich, management must also provide a clear process for determining backup priorities and how to fund them.

"If you don't have guidelines and you don't have criteria, you won't have funding," Minnich says.

A Range of Price Tags

Business recovery costs vary widely. A basic assessment of a company's recovery needs might cost $50,000 to $100,000, while a large company might spend $1 million per month for high-level disaster protection, says Todd Gordon, general manager of IBM's Business Continuity and Recovery Services division. In general, he says, companies should expect to spend between 7% and 15% of their overall IT budgets on disaster recovery.
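
To put Gordon's rule of thumb in concrete terms, here is a minimal sketch of the arithmetic. The function name and the $20 million IT budget are hypothetical, chosen only for illustration; the 7% and 15% bounds are the figures quoted above.

```python
def recovery_budget_range(it_budget, low_pct=7, high_pct=15):
    """Return the (low, high) disaster recovery spend implied by the rule of thumb.

    it_budget is the overall annual IT budget; low_pct and high_pct default
    to the 7% and 15% bounds quoted in the article.
    """
    return it_budget * low_pct / 100, it_budget * high_pct / 100


# Hypothetical company with a $20 million annual IT budget (an assumption).
low, high = recovery_budget_range(20_000_000)
print(f"${low:,.0f} to ${high:,.0f} per year")
```

For that hypothetical company, the guideline works out to between $1.4 million and $3 million a year.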

Agreeing on how to bring a failed system back up is both more important and trickier in a multinational environment. People in different parts of the world work according to different schedules and cultural rules -- not to mention the fact that they speak different languages and live in different time zones.

"Synchronization of the recovery is real key," says Bill DiMartini, vice president of consulting operations at SunGard Planning Solutions, part of SunGard Data Systems Inc. in Wayne, Pa.

Say, for example, an outage that hits an enterprise resource planning system at midnight in Germany stops data flowing to and from a factory in Singapore. The factory will keep using parts and shipping products. But when the system in Germany is brought back up, it is restored from its last backup, so anything it recorded between that backup and the outage is gone. The staffs in Singapore and Germany must therefore synchronize the two databases not to the point when the German system went down but to the last backup on the German system.
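
The Germany-Singapore scenario can be sketched in a few lines of code. This is an illustration under stated assumptions, not a description of how any ERP product works: the Transaction record, the function names and the single-part inventory are all hypothetical. The point it shows is that the replay cutoff is the time of the German system's last backup, not the time of the outage.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Transaction:
    """One inventory change logged locally at the factory (hypothetical record)."""
    timestamp: datetime
    part: str
    quantity_change: int


def transactions_to_replay(local_log, last_backup_time):
    """Return local transactions made after the central system's last backup.

    The cutoff is the backup time, not the outage time: a restore from backup
    loses everything the central system recorded after that backup.
    """
    return sorted(
        (t for t in local_log if t.timestamp > last_backup_time),
        key=lambda t: t.timestamp,
    )


def resynchronize(restored_inventory, local_log, last_backup_time):
    """Apply the missed local transactions to a copy of the restored inventory."""
    inventory = dict(restored_inventory)
    for t in transactions_to_replay(local_log, last_backup_time):
        inventory[t.part] = inventory.get(t.part, 0) + t.quantity_change
    return inventory
```

Suppose the last backup was taken at 6 p.m., the outage hit at midnight, and the factory used 10 parts at 8 p.m. and 7 more at 2 a.m. Replaying only from midnight would miss the 8 p.m. transaction; replaying from the backup picks up both.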

Since synchronization is also required in day-to-day operations, some companies link disaster recovery planning to regular IT operations. That means linking the change management and version control done in the corporate data center to that done at a backup site, says Marshall McGraw, manager of IT business services at Phillips Petroleum Co. in Bartlesville, Okla.

"Let's say we do an upgrade internally to SAP [R/3] that affects the data that needs to be recovered, or [we change] the configuration of the hardware" on which R/3 runs, says McGraw. Unless the backup site knows about every such change, he says, "you spend a week trying to find all the changes you made [since you last] declared a disaster." Once the procedures are in place to keep the backup site in the loop, the ongoing effort to communicate those changes is minimal, he says.

Think Globally, Act Locally

Given the obstacles, few, if any, multinational firms are doing real-time recovery of global applications. They instead recover applications at local sites and then reconcile the changes around the world later, says Gordon.

But one global recovery practice won't serve everyone's needs. "Some of our operations are fairly small, and some of our operations are fairly significant," says Leader. One plan might be overkill at a small location but grossly inadequate at a large facility.

Many multinational companies issue centrally mandated guidelines for business recovery, leaving local units substantial flexibility in how they reach the goal. Some keep the strictest rein on applications that gather and share information affecting the entire business, giving local units more autonomy on site-specific systems.

Phillips Petroleum, for example, has centralized the operation and backup of its core SAP R/3 and Oracle applications, says McGraw. Every 24 hours, IT staff at headquarters ship backup tapes to a disaster recovery center. The central IT group also arranges for backup network links should the primary Web connections go down.

Remote sites are free to make their own arrangements for hot sites, data backup and backup network links, assuming they follow common recovery procedures, says McGraw.

Graham's colleagues at Royal & Sun are currently working on the third release of the company's worldwide standard for business continuity planning, part of which is based on basic principles of disaster recovery planning and part of which "will be very much influenced by the local business needs, including call centers and those related to the 'e' world," she says. If a business unit can develop a disaster recovery plan without using the central standards, "I'm perfectly happy with that," she says.

Recovery Planning

*Make sure you have senior management backing to ensure compliance.

*Use a consistent planning process and methodology so all business units know the ground rules and how disaster planning will be funded.

*Don