Managing Storage in a Virtual World
Demand for storage has been growing rapidly for some time to keep pace with ever-expanding volumes of data. And it seems that the more common virtualized servers become, the more storage is required. Together, the two trends, data growth and virtualization, are becoming a potent combination for storage growth.
"Storage capacity continues to grow at a rate of nearly 60 percent per year," said Benjamin Woo, an analyst at IDC. "2008 is likely to represent an inflection point in the way applications and storage will be interfaced. And virtual servers will emerge as the killer application for iSCSI."
Are virtual machines (VMs) accelerating storage growth? According to Scott McIntyre, vice president of software and customer marketing at Emulex, VMware is typically given a larger storage allocation than normal. This acts as an extra reserve to supply capacity on demand to various virtual machines as they are created. In fact, VMware actually encourages storage administrators to provision far more storage than is physically present, for example, giving each of 20 VMs a 25 percent share of capacity. It also makes it easy to provision away an awful lot of storage.
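The over-provisioning arithmetic is simple to sketch. A minimal illustration in Python, using the 20-VM, 25-percent figure from the example above (the 2 TB datastore size is a hypothetical assumption, not from the article):

```python
# Thin provisioning: each VM is promised more than its fair share,
# on the bet that most VMs never fill their virtual disks.

def oversubscription(num_vms, share_per_vm):
    """Total promised capacity as a multiple of physical capacity."""
    return num_vms * share_per_vm

# The example from the text: 20 VMs, each given 25% of the datastore.
ratio = oversubscription(num_vms=20, share_per_vm=0.25)
print(ratio)  # 5.0 -- five times more storage promised than exists

# With a hypothetical 2 TB datastore, that is 10 TB of promised capacity.
physical_tb = 2
print(ratio * physical_tb)  # 10.0
```

The bet pays off only while actual consumption stays below the physical capacity; once the VMs collectively approach it, the administrator must add disks or reclaim space.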
In theory, this is supposed to make storage more efficient by improving utilization rates. But could it inadvertently be doing the opposite?
"VMware virtualized environments do not inherently need more storage than their physical counterparts," said Jon Bock, VMware's senior product marketing manager. "An important and relevant point is that customers do often change the way they use and manage storage in VMware environments to leverage the unique capabilities of VMware virtualization, and their storage capacity requirements will reflect that."
What seems to be happening is that companies are adapting their storage needs to take advantage of the capabilities built into virtual environments. For example, the snapshot capability provided by VMware's storage interface, VMFS (virtual machine file system), is used to enable online backups, to generate archive copies of virtual machines, and to provide a known good copy for rollback in cases such as failed patch installs, virus infections, and so on. While you can do a lot with it, it also requires a lot more space.
Solving Management Headaches
Perhaps the bigger problem, however, is the management confusion inherent in the collision of virtual servers and virtual storage.
"The question of coordinating virtualized servers and virtualized storage is a particularly thorny issue," said Mike Karp, an analyst with Enterprise Management Associates. "The movement toward virtualizing enterprise data centers, while it offers enormous opportunities for management and power use efficiencies, also creates a whole new set of challenges for IT managers."
Virtualization, after all, is all about introducing an abstraction layer to simplify management and administration. Storage virtualization, for example, refers to the presentation of a simple file, logical volume or other storage object (such as a disk drive) to an application in such a way that the physical complexity of the storage is hidden from both the storage administrator and the application.
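The idea can be made concrete with a small sketch (a hypothetical model, not any vendor's implementation): a logical volume concatenates several physical disks and translates each logical block address into a disk-and-offset pair, so the application only ever sees one contiguous address space:

```python
class LogicalVolume:
    """Presents several physical disks as one contiguous block device.

    disks: list of (name, size_in_blocks) tuples, concatenated in order.
    """

    def __init__(self, disks):
        self.disks = disks
        self.size = sum(size for _, size in disks)

    def translate(self, logical_block):
        """Map a logical block address to (disk_name, physical_block)."""
        if not 0 <= logical_block < self.size:
            raise ValueError("block address out of range")
        for name, size in self.disks:
            if logical_block < size:
                return name, logical_block
            logical_block -= size

# Three 1000-block disks appear to the application as one 3000-block volume.
vol = LogicalVolume([("disk0", 1000), ("disk1", 1000), ("disk2", 1000)])
print(vol.translate(1500))  # ('disk1', 500)
```

The application addresses block 1500 of "the volume" with no idea that the request lands 500 blocks into the second physical disk; that indirection is exactly the abstraction layer the article describes.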
However, even within a single domain such as servers, this "simple layer" can get pretty darn complicated. Just look at what it does to the traditional art of CPU measurement. Take as an example an IBM micropartition in an AIX simultaneous multithreading (SMT) environment that consists of two virtual CPUs in a shared processor pool. This partition has a single process running that uses, let's say, 45 seconds of a physical CPU in a 60-second interval. Measuring such an environment presents some challenges: the results can differ depending on whether SMT is enabled or disabled, and whether the processor is capped or uncapped.
The CPU statistic %busy represents the percentage of the virtual processor capacity consumed. In this example, it might come out as 37.5 percent. Now take another CPU measurement by LPAR (logical partition), known as %entc. This represents the percentage of the entitled processor capacity consumed, and it comes out as 75 percent. Take another metric, %lpar_pool_busy, the percentage of the processor pool capacity consumed: it comes out at only 18.75 percent. Or %lpar_phys_busy, the percentage of the physical processor capacity consumed, which scores 9.38 percent. And there are other metrics that might show completely different results.
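The divergence comes entirely from the choice of denominator. A small Python sketch reproduces the figures above, assuming an entitlement of one processor, a four-processor shared pool, and eight physical processors in the machine (the pool and system sizes are assumptions chosen to match the published percentages; the article does not state them):

```python
# One process consumes 45 seconds of physical CPU in a 60-second interval.
cpu_seconds, interval = 45.0, 60.0

# Assumed capacities (only the 2 virtual CPUs are stated in the text):
# entitlement of 1.0 processors, a 4-processor shared pool,
# and 8 physical processors in the machine.
virtual_cpus, entitlement, pool_size, physical_cpus = 2, 1.0, 4, 8

def pct(capacity_cpus):
    """Utilization as a percentage of the given capacity."""
    return 100.0 * cpu_seconds / (interval * capacity_cpus)

print(f"%busy           = {pct(virtual_cpus):.2f}")   # 37.50
print(f"%entc           = {pct(entitlement):.2f}")    # 75.00
print(f"%lpar_pool_busy = {pct(pool_size):.2f}")      # 18.75
print(f"%lpar_phys_busy = {pct(physical_cpus):.2f}")  # 9.38
```

Same workload, four "correct" answers: each percentage is accurate relative to its own notion of capacity, which is exactly why two capacity planners can look at the same partition and reach opposite conclusions.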
"A capacity planner might look at one score and think utilization is low, whereas another takes a different view and sees an entirely different picture," said Jim Smith, an enterprise performance specialist at TeamQuest Corp. of Clear Lake, Iowa. "So who's right? It's not an easy question to answer with virtualized processors. Each answer is correct from its own perspective."
Finding the Root Cause
To make things more challenging, there is the ongoing trend of marrying up virtual servers with virtual storage. That means having to manage across two abstraction layers instead of one. Now let's suppose something goes wrong. How do you find out where the problem lies? Is it on the application server, on the storage, on the network or somewhere in between?
"Identifying the root cause of the problem that potentially could be in any one of several technology domains (storage, servers, network) is not a problem for the faint of heart and, in fact, is not a problem that is always solvable given the state of the art of the current generation of monitoring and analysis solutions," said Karp. "Few vendors offer solutions with an appropriate set of cross-domain analytics that allow real root cause analysis of the problem."
EMC, majority owner of VMware, starts to look pretty smart now for its acquisition of Smarts a little while back. It is heading down the road of being able to provide at least some of the vitally needed cross-virtualization management. And NetApp is heading down the same road with its acquisition of Onaro.
"Onaro extends the NetApp Manageability Software family, as SANscreen's VM Insight and Service Insight products help minimize complexity while maximizing return," said Patrick Rogers, vice president of solutions marketing at NetApp. "These capabilities make Onaro a key element in NetApp's strategy to help customers improve their IT infrastructure and processes."
For virtual machine environments, VM Insight provides virtual machine-to-disk performance information to optimize the number of virtual machines per server. For large-scale virtual machine farms, this type of cross-domain analytics assists in maintaining application availability and performance. SANscreen Service Insight makes it easier to map resources used to support an application in a storage virtualization environment. It provides service level visibility from the virtualized environment to the back-end storage systems.
Meanwhile, the management of multiple virtualization technologies is coming together under the banner of enterprise or data center virtualization. This encompasses server virtualization, storage virtualization and fabric virtualization.
"IT managers are increasingly considering the prospect of a fully virtualized data center infrastructure," said Emulex's McIntyre. "One of the characteristics of enterprise data centers is the existence of storage area networks (SANs). There is a high degree of affinity between SANs and server virtualization, because the connectivity offered by a SAN simplifies the deployment and migration of virtual machines."
SAN-based storage can be shared between multiple servers, enabling data consolidation. Conversely, a virtual storage device can be constructed from multiple physical devices in a SAN and be made available to one or more host servers. Not surprisingly then, not only are storage devices being virtualized, but increasingly there is interest in virtualizing the SAN fabric itself in order to consolidate multiple physical SANs into one logical SAN, or segment one physical SAN into multiple logical storage networks.
Emulex, for example, is providing the virtual plumbing to handle some of the connectivity gaps between storage and server silos. Emulex LightPulse Virtual HBA technology virtualizes SAN connections so that each virtual machine has independent access to its own protected storage.
"The end result is greater storage security, enhanced management and migration of virtual machines and the ability to implement SAN best practices such as LUN masking and zoning for individual virtual machines," said McIntyre. "In addition, Virtual HBA Technology allows virtual machines with different I/O workloads to co-exist without impacting each other's I/O performance. This mixed workload performance enhancement is crucial in consolidated, virtual environments where multiple virtual machines and applications are all accessing storage through the same set of physical HBAs."
No doubt, over time, more pieces of virtual plumbing and a whole lot more analytics will have to be added to the mix to make virtualization function adequately in an enterprise-wide setting. Until then, get ready for an awful lot of complexity in the name of simplification.
"It is absolutely necessary to understand the topology, in real time or at the very least, in near real-time in order both to identify problems and to manage the entire environment proactively as a system and preempt problems," said Karp. "In a best-case scenario, a constantly updated topology map would be available for each process being monitored."