May 01, 2014, 7:21 AM
Image credit: flickr/Jim Bahn
Once upon a time, if you ran a data center, you used virtual machine (VM) management programs (i.e., hypervisors). There was no other practical choice. This dates all the way back to the good old IBM 360 mainframe days with CP-67/CMS in 1967. Today, our data centers and clouds run more advanced hypervisors. Amazon Web Services (AWS), for example, is made up of nearly half-a-million Linux servers running the Xen hypervisor, while Microsoft's Azure cloud relies upon its Hyper-V hypervisor.
That's all well and good, but hypervisor support takes up a lot of system resources -- every VM runs not merely a full copy of an operating system, but a virtual copy of all the hardware that the operating system needs to run. That's great for using otherwise unused memory or CPU cycles, but say you're running multiple VMs and your users want more VMs -- more I tell you! -- then the fact that these fat VMs take up a lot of RAM and clock time starts to be troublesome. That's where containers (a different take on virtualization) come in.
Advocates for containers argue that you don't need a full copy of the operating system, never mind virtual representations of the hardware. Isn't just enough of the operating system and system resources for the program itself all you really need for your server applications? It turns out container fans were right. You can use one copy of an operating system to run multiple containers, each with an instance of an application, and this vastly reduces the system resources needed to run them.
So how does this container magic work? While the idea isn't as old as hypervisors, it does go back to the year 2000 and FreeBSD Jails. This was, and is, a way for users, usually the system administrator, to run programs in a sandbox. A jail has access to the operating system kernel, but other than that it can only get to a very limited set of other system resources. For example, a FreeBSD jail typically only has access to a preassigned Internet Protocol (IP) network address.
Since then, container theory and practice have come a long way. Oracle Solaris, for example, has a similar concept called Zones. While virtualization techniques in Linux, such as Xen and the Kernel-based Virtual Machine (KVM), have gotten all the headlines, companies such as Parallels, Google, and Docker have been working on such open-source projects as OpenVZ and LXC (Linux Containers) to make containers work well and securely.
Over time all of these efforts have consolidated into LXC. With LXC, applications can run in their own container. Each container shares a common Linux system kernel, but unlike a VM there's no attempt made to abstract the hardware. Mind you, from the containerized applications' viewpoint the program still has its own file system, storage, CPU, RAM, and access to external devices.
LXC is designed to "create an environment as close as possible as a standard Linux installation but without the need for a separate kernel." To do this, containers use the following Linux kernel features:
* AppArmor and SELinux profiles
* Seccomp policies
* Control groups (cgroups)
* Kernel namespaces (ipc, uts, mount, pid, network and user)
Containers may also be defined through Libvirt, as they are in Red Hat Enterprise Linux (RHEL).
It's those first two items that really made containers good for practical use. Even as recently as 2011, the Libvirt developer Daniel Berrange wrote in a blog post: "LXC is not yet secure. If I want real security I will use KVM." Container security has improved since then, but you should still never permit any application in a container to run as root, and you need to be on the alert for buggy system calls. The last thing you want is a zero-day vulnerability in your containers' base Linux kernel being used to pry open all your containers.
Cgroups and kernel namespaces were essential for the modern-day container. Cgroups provide an easy way to manage and monitor process resource allocation. With them, you can control the maximum amount of system resources, such as memory, CPU cycles, and disk and network throughput, that each container is allowed.
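To make that concrete, here is a minimal sketch of what capping a container's memory and CPU looks like at the cgroup level. It assumes a kernel with the unified (v2) cgroup hierarchy mounted at /sys/fs/cgroup, where the limit files are named memory.max and cpu.max; older v1 hierarchies use different file names. The group name "web1" and the helper function are hypothetical, and the code only computes which files would be written rather than touching the system.

```python
from pathlib import PurePosixPath

# Assumption: cgroup v2 is mounted at the conventional location.
CGROUP_ROOT = PurePosixPath("/sys/fs/cgroup")


def cgroup_v2_writes(name, memory_max_bytes, cpu_quota_us, cpu_period_us=100_000):
    """Return {file path: value} for capping one group's memory and CPU.

    Nothing is written to disk; a real tool would create the group
    directory and write each value into the matching file as root.
    """
    group = CGROUP_ROOT / name
    return {
        # Hard memory ceiling for every process in the group, in bytes.
        str(group / "memory.max"): str(memory_max_bytes),
        # CPU bandwidth: at most cpu_quota_us of CPU time per cpu_period_us.
        str(group / "cpu.max"): f"{cpu_quota_us} {cpu_period_us}",
    }


# Hypothetical container "web1": 512 MiB of RAM and half of one CPU.
for path, value in cgroup_v2_writes("web1", 512 * 1024 * 1024, 50_000).items():
    print(f"{path} <- {value}")
```

Disk and network throughput limits are left out of this sketch because they depend on device-specific settings.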
Namespaces are helpful in isolating process groups from each other. By themselves namespaces don't provide enough security, but they're useful in managing containers' access to network devices and other shared external resources.
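One way to see namespaces at work is to ask the kernel which ones a process belongs to. The sketch below assumes a Linux system that exposes /proc/PID/ns; the helper name is hypothetical. Each entry it reads is a symbolic link whose target identifies a namespace, so two processes in the same container report the same identifiers, while processes in different containers report different ones.

```python
import os

# The kernel's names for the namespace types listed above
# ("mnt" is mount, "net" is network).
NAMESPACE_KINDS = ("ipc", "uts", "mnt", "pid", "net", "user")


def namespace_ids(pid="self"):
    """Map each namespace kind to the identifier the kernel reports for pid.

    Returns an empty dict on systems without /proc/<pid>/ns.
    """
    ids = {}
    for kind in NAMESPACE_KINDS:
        try:
            ids[kind] = os.readlink(f"/proc/{pid}/ns/{kind}")
        except OSError:
            # Not Linux, no such process, or this namespace type is unavailable.
            continue
    return ids


for kind, ident in namespace_ids().items():
    print(kind, ident)
```

Comparing the output for a process on the host with the output for a process inside a container shows which resources the container sees separately.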
Programs such as Docker are built on top of LXC to automate "the deployment of any application as a lightweight, portable, self-sufficient container that will run virtually anywhere." Besides simply making it much easier to deploy a program to a container, James Bottomley, CTO of Server Virtualization at Parallels and a top Linux kernel developer, has observed that programs like Docker allow you "to create a containerized app on your laptop and deploy it to the cloud. Containers gives you instant app. portability. In theory, you can do this with hypervisors, but in reality there's a lot of time spent getting VMs right. If you're an app developer and use containers you can leave worrying about all the crap to others."
The one big thing that hypervisors can do that containers can't is create VMs that use different operating systems and kernels. You can, for example, use VMware vSphere to run instances of Ubuntu Linux 14.04 and Windows Server 2012 simultaneously. In LXC, all containers share the host's Linux kernel, so every container must run Linux on that same kernel. So, you can't mix and match containers the way you can VMs.
So why should you bother with containers? For one simple reason: You can stick a lot more containers on a single server than you can VMs. How many more? Try two to three times more.
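Where does a multiple like that come from? A back-of-the-envelope model helps. Every number below is a made-up assumption chosen for illustration, not a measurement: a 64GB server, 2GB held back for the host, an application that needs 1GB, a guest operating system that costs each VM another 1GB, and about 50MB of per-container overhead. The real ratio depends entirely on how heavy the guest operating system is compared with the application.

```python
def instances_that_fit(host_ram_mib, reserved_mib, app_mib, overhead_mib):
    """How many instances fit in RAM if each needs app_mib plus overhead_mib."""
    usable = max(host_ram_mib - reserved_mib, 0)
    return usable // (app_mib + overhead_mib)


# All figures are hypothetical, for illustration only.
HOST_RAM_MIB = 64 * 1024        # 64GB server
HOST_RESERVED_MIB = 2 * 1024    # kept for the host OS or hypervisor
APP_MIB = 1024                  # what the application itself needs
GUEST_OS_MIB = 1024             # extra cost of a full guest OS per VM
CONTAINER_OVERHEAD_MIB = 50     # extra cost per container

vms = instances_that_fit(HOST_RAM_MIB, HOST_RESERVED_MIB, APP_MIB, GUEST_OS_MIB)
containers = instances_that_fit(
    HOST_RAM_MIB, HOST_RESERVED_MIB, APP_MIB, CONTAINER_OVERHEAD_MIB
)

print(f"VMs: {vms}, containers: {containers}")  # VMs: 31, containers: 59
```

With these assumed figures the server holds nearly twice as many containers as VMs, and the gap widens as the application gets smaller relative to the guest operating system.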
For example, Google uses containers all the time. If anyone knows how to get the most bang from their servers, it's Google. As Bottomley said, "Google invested in containers early on. Anything you do on Google today is done in a container -- whether it's Search, Gmail, Google Docs -- you get a container of your own for each service." Google has open-sourced its own container stack, lmctfy (Let Me Contain That For You), if you'd like to see how Google does it.
Some experts, such as Stéphane Graber, one of the LXC project's leaders, believe that "lmctfy isn't a full fledged container technology, Google rather confusingly uses 'containers' to really mean 'cgroups'. lmctfy is a userspace tool and library to easily create and manage cgroups, nothing else. You won't get a full container, in the non-Google sense of the term, by using it."
And so we finally arrive at why containers are important for businesses: They enable you to make much more effective use of your server hardware. At day's end, containers are not important for their technology; they're important because they can substantially help your bottom line.