How do you measure cloud resiliency?


Christopher Nerney

Many vendors offer products designed to measure your cloud's ability to keep running under catastrophic circumstances. Here's how IBM Developer describes the general process:


"The metrics around resiliency all relate to keeping your cloud running under adverse conditions. Denial-of-service attacks, runaway processes, and failed hardware resources are examples of Security, Isolation, and Resiliency issues, respectively. A cloud should be able to react quickly to issues within the 'hard' or 'soft' aspects of the environment by moving workloads to working areas of the cloud and quickly failing over to another virtual environment. A robust enterprise cloud should also support disaster recovery features, allowing your cloud to be linked to another cloud in an active/passive or active/active setup.

"One can imagine a performance experiment to measure Resiliency being similar to the Elasticity tests. However, instead of the cloud reacting to a breach in SLA, the cloud must now react to a system failure. For example, unplug the 'blade' running the Apache Day Trader workload to simulate a hardware error. Measure how long it takes the cloud to react to this breach and return the response time to 2 seconds or better.

"Similarly, your cloud must support isolation such that if one tenant's virtual system is 'running amok,' another tenant will not be disturbed. To test this scenario, we create a runaway process that continually allocates memory or disk space. While this is happening, we measure the performance of a second tenant to see if we notice any ill effects from its neighboring tenant. Also watch the system's vital signs to confirm that the runaway tenant is 'capped.'

"Your cloud must still perform while under a denial-of-service attack. Hence, another test involves setting up a denial-of-service attack by opening up Port 80 and sending bogus HTTP traffic. The cloud should employ an application firewall that filters Port 80, looks for ill-formed HTTP requests, and denies them access to the cloud's network."
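To make the two measurements in that quote a bit more concrete, here is a minimal Python sketch of how you might score the results once you have collected response-time samples. It is not IBM's method or any vendor's tool. The function names (`recovery_time`, `isolation_impact`), the sample format, and the rule that recovery means three consecutive samples back under the SLA are all my own assumptions; only the 2-second target comes from the quoted example.

```python
from statistics import median

SLA_SECONDS = 2.0  # response-time target taken from the quoted example


def recovery_time(samples, failure_at, sla=SLA_SECONDS, stable_count=3):
    """Seconds from the injected failure until response time is back within the SLA.

    samples: list of (timestamp_seconds, response_time_seconds), sorted by time.
    Assumption: recovery is the first sample after the first SLA breach that
    starts a run of `stable_count` consecutive samples at or below the SLA.
    Returns 0.0 if the SLA is never breached, None if it never recovers.
    """
    post = [(t, rt) for t, rt in samples if t >= failure_at]
    # Index of the first sample that breaches the SLA after the failure.
    breach = next((i for i, (_, rt) in enumerate(post) if rt > sla), None)
    if breach is None:
        return 0.0
    for i in range(breach + 1, len(post) - stable_count + 1):
        if all(rt <= sla for _, rt in post[i:i + stable_count]):
            return post[i][0] - failure_at
    return None


def isolation_impact(baseline, under_stress):
    """Relative change in the second tenant's median response time.

    baseline: response times measured while the neighbor is idle.
    under_stress: response times measured while the neighbor runs away.
    """
    base = median(baseline)
    return (median(under_stress) - base) / base


# Hypothetical data: the blade is unplugged at t=10 seconds.
samples = [(0, 1.0), (5, 1.1), (10, 6.0), (15, 8.0), (20, 1.9),
           (25, 3.0), (30, 1.5), (35, 1.4), (40, 1.2)]
print(recovery_time(samples, failure_at=10))  # prints 20
```

Requiring a few consecutive good samples keeps a single lucky response (like the one at t=20 above) from being counted as a recovery. For the isolation test, a value near zero from `isolation_impact` would suggest the second tenant was not disturbed by its neighbor.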
