May 04, 2009, 9:30 AM — McKinsey, the doyen of strategy consultants, published a report on cloud computing last week featuring a disguised real-world case study. While the report doesn't explicitly state the fact, it seems that the paper is a summary of the results of a strategy project with a financial services firm, which apparently engaged McKinsey to assess whether it would make sense to move all of its systems to Amazon Web Services.
The client appears to be a good-sized company, since one of the pages lists headcount for the IT organization and compares the total as-is vs. post-Amazon migration. Today's IT headcount stands at 1704. In typical companies IT represents about 2.3 percent of total staff; this would indicate a company with around 75,000 employees (however, financial services companies are heavy IT users, so IT might represent a higher portion of total employment, thereby reducing the employee population of the company somewhat). We will return to the topic of total headcount later in this post.
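The back-of-the-envelope estimate above is easy to reproduce. The following is only a sketch: the 2.3 percent ratio is the rule of thumb quoted above, while the 4 percent ratio is an assumed, purely illustrative value for an IT-heavy firm (it does not come from the report), and the function name is hypothetical.

```python
def estimate_total_employees(it_headcount: int, it_share: float) -> int:
    """Estimate total staff from IT headcount and IT's share of total staff."""
    if not 0 < it_share <= 1:
        raise ValueError("it_share must be a fraction between 0 and 1")
    return round(it_headcount / it_share)


IT_HEADCOUNT = 1704  # as-is IT headcount from the case study

# 2.3 percent is the typical-company ratio cited in the text.
typical = estimate_total_employees(IT_HEADCOUNT, 0.023)

# 4 percent is an ASSUMED ratio for an IT-heavy firm, for illustration only.
it_heavy = estimate_total_employees(IT_HEADCOUNT, 0.04)

print(f"Typical ratio:  about {typical:,} employees")
print(f"IT-heavy ratio: about {it_heavy:,} employees")
```

The higher the share of staff that works in IT, the smaller the implied company, which is why the 75,000 figure is best read as an upper estimate.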
In its description of the computing environment, McKinsey notes that the organization is primarily a Windows shop.
One unintentionally funny thing about the report is that, early on, it notes that cloud computing is an ambiguous term, with no set definition. In fact, the report notes, McKinsey found 22 separate definitions of cloud computing. So McKinsey immediately recommends a new definition of cloud computing, seeming to assume that everyone will now adopt it, and the whole indefiniteness about the topic will immediately be settled. McKinsey's definition isn't obviously wrong, but it doesn't necessarily have anything special to recommend it.
I expect we will continue to see multiple definitions for the foreseeable future; more to the point, that's a good thing, evidence that the field is rapidly evolving and new characteristics are springing up all the time. Anyway, we all seem to manage to push along with no set definition for the Internet, so I expect we'll survive this confusion as well.
In terms of the outcome of the report, three of McKinsey's conclusions stand out to me:
1. Cloud is more expensive: Amazon is significantly more expensive than in-house in terms of the cost of running computing capacity. In fact, Amazon is 144 percent more expensive--costing US$366 per month per server vs. $150 internally. Therefore, attention should be focused on internal data centers, because they're more cost-effective.
2. Companies shouldn't focus on internal clouds: The big payoff is leveraging server consolidation via virtualization. While the report doesn't say it in this section, earlier it notes that cloud computing is at the top of the Gartner "hype cycle." The report is tinged with a definite flavor of disdain for the trendiness of cloud computing.
3. Companies can be nearly as efficient as cloud providers: By leveraging server virtualization, internal IT organizations can raise server utilization to 35 percent, just shy of Google's 38 percent.
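For readers who want to check the arithmetic behind the first conclusion, here is a minimal sketch. The two dollar figures are the ones quoted above; the function name is hypothetical.

```python
def percent_premium(cloud_cost: float, internal_cost: float) -> float:
    """How much more the cloud option costs, as a percentage of internal cost."""
    if internal_cost <= 0:
        raise ValueError("internal_cost must be positive")
    return (cloud_cost - internal_cost) / internal_cost * 100


AMAZON_MONTHLY = 366.0    # US$ per server per month, per the report
INTERNAL_MONTHLY = 150.0  # US$ per server per month, per the report

premium = percent_premium(AMAZON_MONTHLY, INTERNAL_MONTHLY)
multiple = AMAZON_MONTHLY / INTERNAL_MONTHLY

print(f"Premium: {premium:.0f} percent more, or {multiple:.2f} times the internal cost")
```

Note that "144 percent more expensive" means Amazon costs 2.44 times the internal figure, not 1.44 times.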
While the report is interesting for a number of reasons, not the least of which is that it demonstrates how big-picture strategy firms view cloud computing, it glosses over a number of issues, with ambiguous calculations and comparisons. Four in particular stand out:
1. A Single Example Does Not Reflect All Possible Scenarios: The McKinsey report's case study is a specific scenario that is not representative of all computing environments: Windows is the least attractive option for Amazon's cloud, particularly for large instances. Windows represents a very small proportion of all Amazon EC2 instances; it's a much more Linux-oriented environment, for a number of reasons. Not all environments resemble the one in the report, and conclusions appropriate for this environment should not be applied in a blanket fashion.
2. The Headcount Numbers Don't Add Up: The headcount savings identified by McKinsey regarding moving to a cloud environment seem very small. For example, McKinsey estimates in this example that with a complete shift of servers out of the data center, the number of IT administrators only falls from 673 to 505.
3. Don't Forget Capital Expense for Facilities and Associated Assets: McKinsey calculates the monthly cost of running an internal server as $43, while the same capacity on Amazon would be $270. However, this is an incomplete comparison. Rather surprisingly, for a sophisticated strategy consulting firm staffed with experienced financial personnel, the analysis assigns no capital expenses to the self-host model beyond those of the server itself.
McKinsey repeats a very common mistake made by people skeptical about cloud computing: confusing the marginal cost of a single server in a company's own data center with the total cost of a server hosted by a cloud provider. In my research, the cost for data center construction runs $600 to $1000 per square foot. Some portion of that amount needs to be assigned to the internal server instance; furthermore, owning a data center is not a one-time expense--there's maintenance as well, which adds to the monthly cost of an internal server. That doesn't even address the capital expense assignment of additional capital assets like network equipment, storage arrays and the like.
As I noted in a previous blog posting, IT typically does a terrible job of accurately assessing the total real cost of a given asset like a server. So this cost comparison is certainly flawed. It would be interesting to look at the numbers with an accurate accounting for actual internal costs; one might bet that the comparison would not be so stark.
And, as an aside, McKinsey notes only a 10 percent labor saving in moving the machine to an external hoster. It's hard to understand how shipping a server completely off-premises reduces the work to manage it by only 10 percent; surely the savings from not having to manage the hardware must be 100 percent? And, if the labor figure is the overall cost of labor (i.e., all types of labor needed to manage the OS, the app, etc.), that skews the comparison: the relevant figure is what it costs to manage the hardware, because that's what's being outsourced to Amazon, not all labor associated with running a server instance.
4. The Issue isn't Utilization Rate, It's Cost per Unit of Computing Capacity: McKinsey does not recommend that companies attempt to mirror the characteristics of a service like Amazon Web Services by creating an internal cloud; instead, it proposes that server consolidation via virtualization be the primary strategy for cost reduction. By aggressively pursuing server consolidation, IT organizations can raise server utilization rates to nearly the 38 percent that Google accomplishes, McKinsey advises.
However, the raw utilization rate is not the point. The main question should be, what does a unit of compute capacity cost me? Google and its cloud brethren run their data centers at around 50 percent of the cost of a typical IT data center, so gaining the same utilization rate as Google still leaves you at twice the cost per compute capacity unit.
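The utilization argument in the fourth point can be made concrete with a small sketch. It assumes a deliberately simple model in which the cost of a unit of useful capacity is the cost of running a server divided by its utilization. Internal cost is normalized to 1.0, the provider's cost is set to 0.5 to reflect the "around 50 percent" figure above, and the 35 and 38 percent utilization rates are the ones cited from the report. The function name is hypothetical.

```python
def cost_per_utilized_unit(cost_per_server: float, utilization: float) -> float:
    """Cost of one unit of capacity that is actually used (simplified model)."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be a fraction between 0 and 1")
    return cost_per_server / utilization


INTERNAL_COST = 1.0  # normalized cost of running one internal server
PROVIDER_COST = 0.5  # about 50 percent of internal cost, per the text

# Case 1: internal IT matches Google's 38 percent utilization exactly.
equal_util_ratio = (
    cost_per_utilized_unit(INTERNAL_COST, 0.38)
    / cost_per_utilized_unit(PROVIDER_COST, 0.38)
)

# Case 2: internal IT reaches 35 percent, as the report suggests is achievable.
ratio = (
    cost_per_utilized_unit(INTERNAL_COST, 0.35)
    / cost_per_utilized_unit(PROVIDER_COST, 0.38)
)

print(f"Same utilization: {equal_util_ratio:.2f}x the provider's unit cost")
print(f"35 vs. 38 percent: {ratio:.2f}x the provider's unit cost")
```

Under these assumptions, matching the provider's utilization rate does nothing to close a gap that comes from the cost of the underlying data center.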
What would have been a better set of conclusions from the research McKinsey performed?
The job of operations is changing dramatically. IT operations is historically rooted in hardware management, with middleware and application management being overlaid as software systems became more complex. Today, with the advent of cloud computing, operations is delaminating into infrastructure management and application management. Amazon-style cloud computing allows you to outsource much or all of the infrastructure management. Therefore, it's important to understand the proportions of those roles in your operations group to understand what potential cloud computing offers. And, when it comes time to make a financial assessment, be sure to compare only the parts of operations germane to infrastructure management.
1. Review your portfolio of applications to understand what cloud computing means to you. Most of the systems in the case study are Windows-based, which is not that attractive a platform for Amazon hosting. Rather than assess what the costs would be to move everything to Amazon, it would have been better to analyze the portfolio of applications to see which could cost-effectively be moved to a cloud provider. Moving only 10 or 20 percent of an IT organization's systems can potentially offer significant savings--so focus on that as an initial cloud initiative.
2. Create a viable financial model for assessing the true costs of internal hosting. I mean, c'mon. Ignoring the capital investment necessary to host one server internally is a rookie mistake. Without a complete financial model, what's the point of the exercise?
3. Evaluate the potential for an internal cloud even if the numbers don't work with an external cloud provider. There are three characteristics of cloud computing that are generally cited (even by McKinsey in its definition); cost is only one of them. The other two relate to limitless capacity and the ability to initiate system use without commitment or delay.
Focusing on only the cost dimension exposes IT organizations to the risk of being viewed as commodity providers, unable to offer agility and scale to users, which motivates the business to go elsewhere, even if those alternatives are more expensive. Getting boxed into a commodity provider position in a market that is seeking additional capability is a losing strategy.
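As a starting point for the financial model called for in the second recommendation, here is a rough sketch of how facility capital expense might be folded into a per-server monthly cost. The $43 marginal cost and the $600 to $1000 per square foot construction range come from the discussion above. Everything else is an assumption chosen only to make the example run: the floor space per server, the straight-line amortization, the 15-year facility life, and the function and parameter names. Network equipment, storage arrays and facility maintenance are left out (maintenance defaults to zero), so the result understates a real fully loaded cost.

```python
def fully_loaded_monthly_cost(
    marginal_monthly_cost: float,
    construction_cost_per_sqft: float,
    sqft_per_server: float,
    facility_life_years: float,
    monthly_facility_upkeep: float = 0.0,
) -> float:
    """Marginal server cost plus amortized facility capex and upkeep."""
    if facility_life_years <= 0:
        raise ValueError("facility_life_years must be positive")
    facility_capex = construction_cost_per_sqft * sqft_per_server
    # Straight-line amortization over the life of the facility (an assumption).
    monthly_amortization = facility_capex / (facility_life_years * 12)
    return marginal_monthly_cost + monthly_amortization + monthly_facility_upkeep


MARGINAL_COST = 43.0     # US$ per month, McKinsey's internal figure
SQFT_PER_SERVER = 2.0    # ASSUMED placeholder, not from the report
FACILITY_LIFE_YEARS = 15 # ASSUMED placeholder, not from the report

low = fully_loaded_monthly_cost(MARGINAL_COST, 600.0, SQFT_PER_SERVER, FACILITY_LIFE_YEARS)
high = fully_loaded_monthly_cost(MARGINAL_COST, 1000.0, SQFT_PER_SERVER, FACILITY_LIFE_YEARS)

print(f"Monthly cost with facility capex: ${low:.2f} to ${high:.2f}")
```

The structure matters more than the output here: any real assessment would replace the placeholder inputs with the organization's own figures and add the other capital assets and maintenance costs mentioned earlier.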
Bernard Golden is CEO of consulting firm HyperStratus, which specializes in virtualization, cloud computing and related issues. He is also the author of "Virtualization for Dummies," the best-selling book on virtualization to date.
Note: HyperStratus has recently launched two one-day workshops on cloud computing, focused on helping organizations get started with their cloud initiatives. No equipment other than a client device and a browser is necessary for either of the workshops.