July 19, 2012, 3:05 PM - Amazon Web Services' (AWS) recent storm-related outage, which left Web sites including Netflix, Pinterest, and Instagram inaccessible, is just the latest in a string of costly cloud failures. Since 2007, a total of 568 hours of downtime at 13 major cloud service providers has had an economic impact of $71.7 million, according to the International Working Group on Cloud Computing Resiliency (IWGCR). Average downtime has been 7.5 hours per year, according to IWGCR, which works out to an availability rate of about 99.9 percent, well below the reliability required for mission-critical systems. "Cheap cloud services can be expensive," says Kevin C. Taylor, partner in the business services department of law firm Schnader, Harrison, Segal & Lewis.
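The relationship between annual downtime and an availability percentage is simple arithmetic, and a short sketch makes it easy to check. The Python below is illustrative only: the function names are made up for this example, and it assumes a 365-day year of round-the-clock service. It converts the 7.5-hour IWGCR average into an availability rate, then runs the calculation in reverse to show how much downtime per year a given percentage allows.

```python
# Illustrative arithmetic only; assumes a 365-day year of 24x7 service.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours (leap years ignored)


def availability_percent(downtime_hours_per_year: float) -> float:
    """Availability implied by a given number of downtime hours per year."""
    return 100.0 * (1.0 - downtime_hours_per_year / HOURS_PER_YEAR)


def downtime_budget_hours(availability_pct: float) -> float:
    """Hours of downtime per year permitted by a given availability rate."""
    return HOURS_PER_YEAR * (1.0 - availability_pct / 100.0)


# The 7.5 hours per year cited by IWGCR.
print(f"{availability_percent(7.5):.3f}%")               # 99.914%

# What "three nines" and "five nines" allow per year.
print(f"{downtime_budget_hours(99.9):.2f} hours")         # 8.76 hours
print(f"{downtime_budget_hours(99.999) * 60:.1f} minutes")  # 5.3 minutes
```

On these assumptions, 99.9 percent availability tolerates nearly nine hours of outage a year, while 99.999 percent tolerates only about five minutes.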
While the typical cloud contract contains uptime clauses and credits for missed service levels, it often fails to adequately protect the enterprise customer. Service-level agreement (SLA) credits, typically capped at a proportion of monthly service fees, "do not compensate for business losses associated with the downtime of a production application," explains Taylor. "Even in an extreme case of sustained and severe outages the credit amounts will be derisory--[say,] $20,000--in comparison to the business impact to the customer, which could potentially be in the millions."
But there are questions the intelligent customer can ask to make sure they are sheltered from potential storms in the cloud.
1. Does your baseline uptime SLA meet my business needs? Buyers used to five nines (99.999 percent uptime) will be disappointed by the 99.9 percent uptime SLAs of cloud providers. "It's one of the first terms they should ask their prospective provider about to see if they can do better," says Jim Slaby, research director of sourcing security and risk strategies for outsourcing analyst firm HfS Research. "Buyers should also negotiate well-defined recovery point and recovery time objectives for each service in their contract." Better uptime will cost more, however. The typical active-active service configuration required to deliver 99.999 percent reliability can add as much as 50 percent to monthly service costs, Slaby says.
2. How do you define "uptime" and "downtime"? "Sophisticated customers will clearly spell out exactly what is considered downtime," says Todd A. Fisher, partner with law firm K&L Gates. "Does it mean five percent of the end users are affected? Or 25 percent? Or 50 percent? What if the system is technically working, but is running so slowly that end users can't do their jobs effectively?"
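How much the definition matters can be shown with a small sketch. The Python below is a hypothetical illustration, not any provider's actual SLA logic: the `Sample` record, the latency limit, and the sample data are all assumptions made up for this example. It counts a monitoring interval as downtime when the share of affected end users reaches a contractual threshold, or when response times are slow enough that users cannot work, and then reports the uptime that results under each threshold Fisher mentions.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One monitoring interval (hypothetical record for illustration)."""
    affected_fraction: float  # share of end users unable to work, 0.0 to 1.0
    p95_latency_ms: float     # 95th-percentile response time in milliseconds


def is_downtime(sample: Sample, max_affected: float = 0.05,
                max_latency_ms: float = 2000.0) -> bool:
    """Downtime if enough users are affected OR the service is too slow.

    Both thresholds are assumed values a contract would have to spell out.
    """
    return (sample.affected_fraction >= max_affected
            or sample.p95_latency_ms > max_latency_ms)


def measured_uptime_percent(samples: list[Sample], **thresholds: float) -> float:
    """Percentage of intervals that do not count as downtime."""
    if not samples:
        raise ValueError("need at least one sample")
    down = sum(is_downtime(s, **thresholds) for s in samples)
    return 100.0 * (1 - down / len(samples))


# Made-up data: healthy, 10% of users affected, very slow, 30% affected.
samples = [Sample(0.0, 300), Sample(0.10, 350),
           Sample(0.0, 5000), Sample(0.30, 400)]

for threshold in (0.05, 0.25, 0.50):
    uptime = measured_uptime_percent(samples, max_affected=threshold)
    print(f"threshold {threshold:.0%}: uptime {uptime:.1f}%")
# threshold 5%: uptime 25.0%
# threshold 25%: uptime 50.0%
# threshold 50%: uptime 75.0%
```

The same four intervals yield very different uptime figures depending on where the threshold is set, which is why the definition is worth negotiating before the contract is signed.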