Server virtualization poses serious challenges for storage systems and the administrators who maintain them. That fact is probably one of the more significant reasons why only around 50 percent of the servers in data centers around the world have been virtualized, despite the benefits of server virtualization being significant, well-known and real.
At the most basic level it comes down to a matter of cost. Server virtualization can enable some fairly hefty cost savings, but only if the storage systems that support it are up to the job. A 2010 study by William Blair and Company, a Chicago-based investment bank, found that companies involved in server virtualization projects typically spend $2 to $3 on storage for every $1 they spend on server virtualization. Numbers like those can blow the economics of server virtualization out of the water, according to Mark Peters, senior analyst at Enterprise Strategy Group (ESG). “It is certainly conceivable for people who don’t plan ahead sufficiently to lose the economic benefits of server virtualization through storage costs,” he said.
One reason for ballooning storage costs is that just as server virtualization decouples a virtual machine (VM) from the physical hardware on which it runs, it decouples the VM from the underlying storage, which is usually located on a SAN. Server virtualization vendors make a virtue of the fact that it is quick and easy to spin up a new VM, but this can lead to VM sprawl and hundreds of ghost VMs — VMs that are no longer needed or used but still consume storage resources. This is compounded when VMs are spun up from standard images that allocate far more storage resources than are actually needed.
More generally, server virtualization is by its very nature hungry for storage resources. It is particularly hard on storage systems because it turns sequential access patterns into random ones, which is exactly the sort of behavior that storage systems handle worst. “Because it’s easy to spin up VMs, you get an increase in demand for storage capacity, but also as VMs move around a virtualized infrastructure you get more random I/O and you stress the performance as well as the capacity of your storage systems,” said Peters. “This can certainly shock people into slowing their virtualization efforts, as they have to spend far more on storage than they expected.”
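To see why, consider the so-called I/O blender effect. The sketch below is a simplified illustration rather than a model of any particular hypervisor: it interleaves several perfectly sequential per-VM block streams the way a shared datastore would see them. The VM count, region offsets and stream lengths are arbitrary assumptions made for the example.

```python
# Illustrative sketch of the "I/O blender" effect: each VM issues a perfectly
# sequential stream of block addresses, but once the hypervisor multiplexes
# them onto one shared datastore, the combined stream looks nearly random.
import random

def vm_stream(start_block, length):
    """One VM reading its virtual disk sequentially."""
    return [start_block + i for i in range(length)]

def blend(streams):
    """Interleave per-VM requests the way a busy shared queue might."""
    streams = [list(s) for s in streams]
    blended = []
    while any(streams):
        s = random.choice([s for s in streams if s])
        blended.append(s.pop(0))
    return blended

def seq_ratio(stream):
    """Fraction of requests that land on the block right after the previous one."""
    hits = sum(1 for a, b in zip(stream, stream[1:]) if b == a + 1)
    return hits / (len(stream) - 1)

if __name__ == "__main__":
    vms = [vm_stream(start_block=i * 100_000, length=500) for i in range(8)]
    print(f"per-VM sequentiality:  {seq_ratio(vms[0]):.0%}")      # 100%
    print(f"blended sequentiality: {seq_ratio(blend(vms)):.0%}")  # close to 0%
```

Each VM’s own stream is 100 percent sequential, but the blended stream arriving at the array is almost entirely random, and random I/O is precisely the workload that stresses conventional storage the most.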
Peters mentions planning ahead, and one of the pitfalls of expanding storage without a clear strategy is storage sprawl: scaling out with new equipment to meet virtualization's demands for performance and capacity, rather than choosing a storage architecture that can be upgraded or expanded to scale up as needed. A single system that can scale up is easier to manage, administer and maintain; it occupies less valuable data center floor space; and it costs less to power and keep cool. Those savings can be significant when you consider that the acquisition cost of a storage system can be as little as 20 percent of the total cost of running and maintaining it over its lifetime.
Loosening Storage’s Throttle on Virtual Servers
One of the biggest challenges posed by server virtualization has simply been how to cope with the high levels of I/O that multiple VMs running on a single physical host can generate, all going through a single hypervisor running on the host. An increasingly popular solution is to install a virtualization cache — usually several hundred gigabytes of fast solid state memory — using a PCIe bus connection right next to the processor.
When coupled with application-level caching software running in the hypervisor and in guest operating systems, this effectively offloads IOPS from the back-end storage system into the cache, which can relieve pressure on storage systems, reduce latency and accelerate applications significantly. Vendors offering this type of solid state caching hardware and software include Fusion-io (with its ioTurbine software), OCZ (with VXL), and SanDisk (with FlashSoft).
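The principle is straightforward, even if the vendor implementations differ considerably. The following sketch is purely illustrative and not based on any of the products named above: a host-side read cache serves hot blocks from flash and passes only misses through to the back-end array. The cache capacity and the backend_read callable are assumptions made for the example.

```python
# Illustrative sketch of host-side flash read caching (not any vendor's product):
# hot blocks are served from a small, fast cache near the processor, and only
# misses travel to the back-end array, offloading read IOPS from it.
from collections import OrderedDict

class FlashReadCache:
    def __init__(self, backend_read, capacity_blocks):
        self.backend_read = backend_read    # callable: block_id -> bytes
        self.capacity = capacity_blocks
        self.cache = OrderedDict()          # block_id -> data, kept in LRU order
        self.hits = self.misses = 0

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)   # refresh LRU position
            self.hits += 1
            return self.cache[block_id]
        self.misses += 1                       # only misses reach the SAN
        data = self.backend_read(block_id)
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)     # evict least recently used block
        return data

# Toy run: a hot working set that fits in the cache keeps most reads off the array.
cache = FlashReadCache(backend_read=lambda b: f"block-{b}".encode(), capacity_blocks=64)
for _ in range(10):
    for blk in range(10):
        cache.read(blk)
print(cache.hits, cache.misses)   # 90 hits, 10 misses
```

In the toy run, 90 of the 100 reads never touch the back-end storage system at all, which is the whole point of putting flash next to the processor.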
This continues a trend of adapting storage technologies to virtualization, from the dynamically tiered storage arrays supplied by established vendors such as EMC and NetApp to the SSD appliances that sit close to the server, offered by newer vendors such as Tintri, Nimble, Nutanix and StorSimple.
Another solution, which goes a stage further by presenting storage right at the VM level, is the so-called storage hypervisor. One example is the product offered by California-based Virsto. The storage hypervisor is installed as a virtual appliance on each physical virtualization host, and it then intercepts I/O requests that would normally go to the hypervisor. These are written to a log file and then to a pool of shared heterogeneous storage in an optimized manner that can increase performance by a factor of 10. At the same time, the thin provisioning of the underlying virtual disks can reduce storage requirements by as much as 90 percent.
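The log-structured idea at the heart of this approach can be sketched in a few lines of Python. This is an illustration of the general technique described above, not Virsto's implementation; the pool_write callable and the destage policy are assumptions made for the example.

```python
# Illustrative sketch of log-structured write handling (not Virsto's code):
# random guest writes are acknowledged after a fast sequential append to a log,
# then destaged to the shared pool later in coalesced, sorted order.
class LogStructuredWriter:
    def __init__(self, pool_write):
        self.pool_write = pool_write   # callable: (block_id, data) -> None
        self.log = []                  # sequential, append-only staging area

    def write(self, block_id, data):
        self.log.append((block_id, data))   # fast sequential append, then ack

    def destage(self):
        # Coalesce overwrites and sort so the pool sees large, ordered writes
        # instead of the random pattern the VMs originally generated.
        latest = {}
        for block_id, data in self.log:
            latest[block_id] = data
        for block_id in sorted(latest):
            self.pool_write(block_id, latest[block_id])
        self.log.clear()

# Toy run: a random write pattern from a guest arrives at the pool in order.
pool_blocks = []
writer = LogStructuredWriter(pool_write=lambda blk, data: pool_blocks.append(blk))
for blk in (93, 7, 51, 7):
    writer.write(blk, b"data")
writer.destage()
print(pool_blocks)   # [7, 51, 93]: coalesced and ordered
```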
“I am a big fan of using a storage hypervisor,” said ESG’s Peters. “If you have virtualized everything else, why not manage storage in a heterogeneous manner as well by turning storage into one big pool in the way that Virsto does?”
Thin provisioning is a technology that can have huge benefits when used in virtualized infrastructures. According to research carried out by ESG, about half of all companies waste about half of their storage capacity. Virtualization requires vast amounts of storage, and thin provisioning can help ensure less of this vast amount is wasted. “Thin provisioning should be used by everyone,” said Peters.
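Conceptually, thin provisioning is simple: present the guest with a large virtual disk, but allocate physical blocks only when they are first written. The sketch below illustrates that idea in generic terms; the block size and the 100 GB figure are arbitrary assumptions, not drawn from any particular array or hypervisor.

```python
# Illustrative sketch of thin provisioning: the virtual disk advertises its
# full size to the guest, but physical blocks are allocated only on first
# write, so unused headroom consumes no real capacity.
BLOCK_SIZE = 4096

class ThinDisk:
    def __init__(self, virtual_blocks):
        self.virtual_blocks = virtual_blocks   # size promised to the guest
        self.allocated = {}                    # block_id -> data, grows on demand

    def write(self, block_id, data):
        self.allocated[block_id] = data        # allocate only when written

    def read(self, block_id):
        # Unwritten blocks read back as zeros, just like unallocated space.
        return self.allocated.get(block_id, b"\x00" * BLOCK_SIZE)

    def consumed_bytes(self):
        return len(self.allocated) * BLOCK_SIZE

disk = ThinDisk(virtual_blocks=25_000_000)          # roughly 100 GB promised
disk.write(0, b"boot".ljust(BLOCK_SIZE, b"\x00"))
print(disk.consumed_bytes())                        # 4096 bytes actually used
```

The guest believes it owns the full 100 GB, but until it writes data, the array gives up almost nothing, which is why the technology can claw back so much of the capacity that over-generous VM templates would otherwise waste.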
The astounding fact, though, is that only about half of all enterprises are actually using thin provisioning, according to figures provided to Peters by a vendor. “Many organizations simply haven’t switched it on,” he said.
Of course, in some situations thin provisioning may not be a good idea — it’s not suitable for use with applications whose storage requirements can vary significantly and very quickly, for example. But Peters believes that in many cases thin provisioning is not being used simply due to unwarranted conservatism.
There may be another factor at work here as well. Roy Illsley, principal analyst at Ovum, said few companies that virtualize their mission-critical applications are willing to use thin provisioning with those applications. “They will eventually, but for that to happen they will need some sort of sophisticated automatic provisioning system.” Automated provisioning systems do exist today, but Illsley said something far more powerful is needed, one that helps deliver guaranteed service levels.
This ties in with storage tiering: the notion of a storage system that allocates data to different tiers of storage, usually based on how frequently the data is accessed. Illsley suggested that the way forward for automated storage tiering may be greater use of economic modeling, along the lines of the technology VMTurbo offers. Instead of data being allocated to tiers automatically based on access frequency, VMs or applications would be given a budget that they can use to “buy” different tiers of storage from the storage systems. By allocating larger budgets to more important applications, automated systems could ensure that the different tiers of storage are used more effectively.
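To make the idea concrete, here is a minimal sketch of budget-based placement. It is purely illustrative and not based on VMTurbo's product; the tier prices, capacities and application budgets are invented for the example.

```python
# Illustrative sketch of budget-based tier placement: each tier has a capacity
# and a "price", each application has a budget, and applications with larger
# budgets can afford to place their data on faster tiers.
tiers = [                                       # fastest (and priciest) first
    {"name": "ssd",  "price_per_gb": 10, "free_gb": 500},
    {"name": "sas",  "price_per_gb": 3,  "free_gb": 5_000},
    {"name": "sata", "price_per_gb": 1,  "free_gb": 50_000},
]

apps = [                                        # bigger budget = more important
    {"name": "oltp-db", "gb": 400,   "budget": 5_000},
    {"name": "mail",    "gb": 800,   "budget": 3_000},
    {"name": "archive", "gb": 4_000, "budget": 4_000},
]

def place(app):
    """Buy the fastest tier the app's budget and the tier's free space allow."""
    for tier in tiers:
        cost = app["gb"] * tier["price_per_gb"]
        if cost <= app["budget"] and app["gb"] <= tier["free_gb"]:
            tier["free_gb"] -= app["gb"]
            return tier["name"], cost
    return "unplaced", 0

for app in sorted(apps, key=lambda a: a["budget"], reverse=True):
    tier, cost = place(app)
    print(f'{app["name"]:<8} -> {tier:<5} (spent {cost})')
```

A greedy pass like this naturally pushes the applications with the largest budgets onto the fastest tiers, while less important workloads settle onto cheaper capacity, which is the kind of service-level-aware behavior Illsley argues automated provisioning will need.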
Other Server Virtualization Challenges for Storage
There are other ways that server virtualization can challenge storage environments: think backups and disaster recovery, and issues as seemingly mundane as how storage is managed in a virtual environment and by whom. Increasingly, it is the big storage vendors themselves that are seeking to provide solutions. “Your storage platform for a virtualized environment should be very tightly tied to your virtualization platform,” said Phil George, a marketing manager at storage vendor EMC. “In particular, we believe your backup must be integrated with your storage appliance.” Storage vendors are also integrating their products with management systems like VMware’s vCenter, which means that the distinction between a storage administrator and a virtualization administrator is becoming blurred, he added.
Blurred it may be, but one thing is crystal clear: Server virtualization presents significant storage challenges. Unless these challenges are faced head-on with the latest technology, enterprises are likely to see their virtualization efforts stall, missing out on the benefits of server virtualization in the process.