Tool shows real-time lag, error rates of major cloud providers

The whole idea of cloud computing is that it's supposed to provide the ability to spin up virtual resources quickly and easily. A new tool from startup Ravello Systems shows that can be done ... sometimes.

After more than a month of monitoring three of the leading cloud computing providers (Amazon Web Services, Rackspace and HP), Ravello found inconsistent performance in both the amount of time it takes to provision a virtual machine and the rate of errors in doing so.

Amazon Web Services seems to be the most stable of the three platforms, according to Ravello VP of R&D Gil Hoffer, but every now and then each of these providers has a "bad day," which leads to slower provisioning times and higher error rates. According to data collected since early June, AWS typically launched VMs in less than two minutes, but launches sometimes took more than four minutes. HP generally launched VMs in less than two minutes, but had spikes of up to six minutes. Rackspace fairly consistently took more than five minutes to launch VMs in its Chicago and Dallas data centers.

Ravello makes software that allows customers to deploy applications to public cloud services from Amazon, Rackspace and HP without making changes to the application's code. The founding team previously built the Kernel-based Virtual Machine (KVM) hypervisor, which was sold to Red Hat. In providing this functionality, Ravello engineers deploy a lot of VMs to these three cloud providers. Company officials have aggregated the results of that cloud testing and now display them in a dashboard that's available to anyone, showing how long it takes for virtual machines to be provisioned and the error rate in provisioning them.

Part of the inconsistency is just the nature of IT and the expectation that systems will have issues, whether they are on-premises mainframes or virtualized hosted resources. The key when using cloud resources is to take that into account. "In the cloud, you need to expect and architect for problems at any level, from startup to runtime to teardown," says RedMonk analyst Donnie Berkholz. "Although people may already realize the importance of building with the expectation that existing instances disappear, they may not be building for delays or failures in creation of new instances."

Navin Thadani, SVP of product for Ravello, says that over time, as the cloud computing market matures, these response times will quicken and error rates will drop. "The public cloud is a public resource," he says. There will be days when customers see higher error rates and longer provisioning times. "The fact is, if you're building an application for the cloud, you need to consider all these metrics."

For example, if an application is being built with auto-scaling functions to automatically provision more resources as they are needed, the dashboard shows that VM provisioning may not be instantaneous; it may take minutes. Applications should be architected to retry provisioning requests because errors are likely, he adds.
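As a rough illustration of that advice, the sketch below retries a failed launch with exponential backoff and random jitter. It is not Ravello's method or any provider's API: `provision_with_retry`, `ProvisioningError` and the `launch` callable are hypothetical names introduced here, and `launch` stands in for whatever provider call actually creates the VM.

```python
import random
import time


class ProvisioningError(Exception):
    """Hypothetical error: a launch attempt failed or timed out."""


def provision_with_retry(launch, max_attempts=4, base_delay=2.0,
                         max_delay=60.0, sleep=time.sleep):
    """Call launch() until it succeeds or max_attempts is used up.

    launch is an assumed zero-argument callable that returns a VM handle
    on success and raises ProvisioningError on failure.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return launch()
        except ProvisioningError as err:
            last_error = err
            if attempt == max_attempts:
                break
            # Exponential backoff, capped at max_delay, with full jitter
            # so that many clients do not retry at the same moment.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
    raise ProvisioningError(
        f"gave up after {max_attempts} attempts") from last_error
```

Given the multi-minute launch times reported above, a real auto-scaling loop would probably also start new instances before capacity is exhausted rather than relying on retries alone.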

Network World senior writer Brandon Butler covers cloud computing and social collaboration. He can be reached at BButler@nww.com and found on Twitter at @BButlerNWW.

This story, "Tool shows real-time lag, error rates of major cloud providers" was originally published by Network World.
