Resource allocation for virtual machines is like running a gym

It's all good, until it isn't

[Image: Tonezone gym, Dublin, January 2013. Credit: Wikimedia]

A gym is a place I've heard about that other people visit to get fit and be healthy. The days and times that each person goes to the gym can vary greatly, but there is a general trend to the light and heavy use periods of the day. If you own the gym, you need members. You have a finite amount of equipment for your members to use, but you wouldn't stay in business long if you limited the number of members to the number of machines that you have. Instead, knowing that your members' visits will be sporadic, you oversubscribe your memberships to make better use of your resources. This is the same theory behind oversubscribing virtual machines, especially where VDI (virtual desktop infrastructure) is deployed.

Think of treadmills as CPU cores. If you have 24 treadmills, you can only ever serve 24 people simultaneously. If you had only 24 people with gym memberships (1:1), however, nearly all of those treadmills would sit unused the vast majority of the time. Instead you would accept more memberships, let's say 100 (roughly 4:1). Now those treadmills might be 40% occupied on average, which makes for a better return for the gym owner. You are aware that at the peak hours (before and after work) there may be times when a person or two is waiting for a machine, but for the most part you've got ample capacity.

In a virtualized environment things work the same way. With VDI, most users will have pretty low CPU usage throughout the day, but there may be periods of high use at the beginning and end of the work day when employees sign in to and out of their machines (the morning rush is commonly called a boot storm). System administrators may opt to oversubscribe the virtual machine hosts for VDI to get a bigger bang for their hardware buck, allocating 1 physical CPU core to many VMs. This may cause a slowdown in the morning and evening, but performance may be acceptable the rest of the time.
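The ratios in the gym example are just division, and the same arithmetic applies to vCPUs and physical cores. Here is a minimal sketch; the function name `oversubscription_ratio` is my own for illustration and does not come from any hypervisor tool.

```python
def oversubscription_ratio(allocated_vcpus: int, physical_cores: int) -> float:
    """Return allocated vCPUs per physical core (1.0 means no oversubscription)."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return allocated_vcpus / physical_cores


# The gym numbers: 100 memberships sharing 24 treadmills.
gym_ratio = oversubscription_ratio(100, 24)
print(f"{gym_ratio:.2f}:1")  # prints "4.17:1"
```

Swap in the total vCPUs you have handed out and the core count of the host to get the equivalent figure for a VM host.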

At the gym, however, there is the chance that every member shows up at the same time. If that were to happen, the building would be clogged with people just waiting around for a machine to become available so they can work out. Things would come to a grinding halt until every person has had a turn on the treadmill or enough people abandon the gym for a while. It wouldn't be a good situation. Similarly, if you oversubscribe your CPU resources, there is the chance (with higher odds than in the gym scenario) that VMs will be waiting for CPU resources to do work and performance will rapidly degrade.
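To put a rough number on "the chance", here is a sketch under a big simplifying assumption: each member (or VM) independently wants a machine at any given moment with the same probability `p_present`. Real gym visits and real workloads are correlated (that is the whole point of peak hours), so treat this as a lower bound on the trouble, not a forecast. The function name and parameters are hypothetical.

```python
from math import comb


def prob_demand_exceeds_capacity(members: int, capacity: int, p_present: float) -> float:
    """P(more than `capacity` of `members` want a machine at once).

    Assumes independent, identical demand per member (binomial model).
    """
    if not 0.0 <= p_present <= 1.0:
        raise ValueError("p_present must be between 0 and 1")
    if capacity >= members:
        return 0.0  # 1:1 or better: nobody can ever be left waiting
    return sum(
        comb(members, k) * p_present**k * (1 - p_present) ** (members - k)
        for k in range(capacity + 1, members + 1)
    )


# 100 members, 24 treadmills, at a few assumed demand levels.
for p in (0.10, 0.25, 0.40):
    risk = prob_demand_exceeds_capacity(100, 24, p)
    print(f"demand {p:.0%}: chance someone waits = {risk:.4f}")
```

The useful thing to notice is how quickly the risk climbs as per-member demand rises toward the point where expected demand meets capacity.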

This is a real risk with virtualization because of the way CPU work is scheduled by the host. If you tell a VM it has 2 cores, the VM will expect that there are always 2 open cores on the host to process its requests. If other VMs are using all of the host cores, regardless of the overall CPU utilization, your VM will have to get in line with the scheduler for the next available CPU cores. If you oversubscribe your VM host, this will happen. Depending on your workload, however, this may still provide acceptable performance.
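The "get in line" behavior can be shown with a toy model. This is a deliberately simplified sketch of one scheduling tick, assuming a VM only runs when all of its vCPUs can be placed on free cores at once; it is not how any particular hypervisor's scheduler is implemented, and `schedule_tick` and the VM names are made up for illustration.

```python
def schedule_tick(requests, physical_cores):
    """Place VMs on cores for one tick.

    requests: list of (vm_name, vcpus) in queue order.
    Returns (running, waiting) lists of VM names. A VM runs only if all of
    its vCPUs fit on the cores still free (toy first-fit model).
    """
    free = physical_cores
    running, waiting = [], []
    for name, vcpus in requests:
        if vcpus <= free:
            running.append(name)
            free -= vcpus
        else:
            waiting.append(name)  # has to wait for the next available cores
    return running, waiting


# A 4-core host with three 2-vCPU VMs that all want to run right now.
running, waiting = schedule_tick([("vm-a", 2), ("vm-b", 2), ("vm-c", 2)], 4)
print(running)  # ['vm-a', 'vm-b']
print(waiting)  # ['vm-c']
```

With 6 vCPUs allocated against 4 cores, one VM is always left waiting whenever all three are busy, no matter how idle the host was a moment earlier.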

There are those who would say that you should never, ever oversubscribe your VM host resources. I think that's a bit extreme, as I can see how it might make business sense for smaller companies, or for sysadmins who really know their workload and users. I'm not aiming to open up that debate, simply to make an analogy. In my case, where we have a virtualized hosting environment (not for VDI), we do not oversubscribe our resources. We allocate 1 vCPU per physical CPU core to maintain predictable performance. The same goes for RAM, and we pay attention to disk I/O as well.
