September 10, 2012, 4:33 PM — When transitioning workloads to virtual environments, one of the big drawbacks for data center administrators can be a loss of visibility.
When a problem occurs, it can be difficult to get a handle on details like which users are affected and by how much, as well as the causal links between the user layer, the application layer and the underlying infrastructure. This is often because the hypervisor abstracts away data about the underlying hardware.
"Monitoring the dynamic nature of virtualization with tools designed for single-technology silos creates a significant challenge for administrators," says Dave Bartoletti, senior analyst at Forrester Research. "There is a growing need for solutions that provide cross-tier visibility to effectively troubleshoot, monitor and analyze data across silos and deliver real-time business insights and operational intelligence."
Splunk, provider of an engine that collects, indexes and analyzes massive volumes of machine-generated data, thinks big data is the answer. Splunk customer CloudShare (a San Mateo, Calif.-based provider of pre-production cloud for dev and test, demos and POCs) sees a constant stream of data from its network/gateways/firewalls, backend, virtual machines, applications, web servers, databases and storage.
CloudShare's infrastructure as a service (IaaS) platform is designed to grant each customer (including a large number of Fortune 500 firms like HP, SAP, Microsoft and IBM) its own private multi-VM networked environment, including compute resources, networking, IP and preinstalled OS. During peak hours, its system performs about 500 VM resume/suspend operations an hour. Its VMware performance data alone comes in at about 2 million events per hour.
Getting a handle on that data, let alone correlating and analyzing it, is a tricky proposition. In its early days, Elad Gotfrid, CloudShare's director of IT, says the company got by with traditional monitoring tools. But it soon outgrew them.
Scaling Out With Splunk
"In the beginning, we used a traditional monitoring tool, which was good for a small scale," Gotfrid says. "Once you start to grow up, you see the scale doesn't allow you to use a traditional monitoring system anymore. You need higher visibility."