3 tips for making highly available systems in Amazon's cloud

By Brandon Butler, Network World |  Cloud Computing, Amazon Web Services

Amazon Web Services makes a big deal out of its availability zones, recommending that customers use multiple AZs to build clouds that are tolerant of failures like the one the company recently experienced.

But experts say using a second availability zone when you're starting up an AWS instance isn't a panacea to prevent your app from going down during an outage, nor is it as easy to do as simply flipping a switch. And it will cost you more -- sometimes more than double the cost of deploying to a single AZ.

AMAZON TAKES THE BLAME: Amazon reveals cause of outage, refunds customers

MORE CLOUD: Top 8 cities for cloud computing jobs

There are a variety of tools for customers to create highly available, fault tolerant systems within Amazon's cloud. One option is to use AWS's own services, specifically its Elastic Load Balancers (ELBs) that allow workloads to be moved from one availability zone into another. However, AWS acknowledged in a post-mortem report about the October outage that even its ELBs were impacted.

1. Architectural design

There are a variety of vendors, some call them cloud access management tools, that will manage this process of creating highly available cloud systems using AWS. RightScale and enStratus are two of the most popular. RightScale offers customers prepackaged solutions that spread workloads across AZs, or even various providers. But as Gartner IaaS analyst Kyle Hilgendorf says, "It's a cost vs. risk play. It's not easy, and it's not cheap." Highly available cloud systems simply cost more and add complexity.

Fault-tolerant applications have to be built to be horizontally scalable. In an ideal world they would be stateless, meaning they wouldn't be constantly changing with new data being inputted and saved into them. In most cases, it requires a copy of your system to be made somewhere else to ensure fault tolerance during an outage. Even when all of that is taken into account, there can still be problems.


Originally published on Network World |  Click here to read the original story.
Join us:
Facebook

Twitter

Pinterest

Tumblr

LinkedIn

Google+

Answers - Powered by ITworld

ITworld Answers helps you solve problems and share expertise. Ask a question or take a crack at answering the new questions below.

Ask a Question