Deduplication, Storage Tiering and VPlex Star at EMC World
BOSTON — The big news on the second day of EMC World was all about new Data Domain software that speeds up deduplication.
EMC's (NYSE: EMC) fast-growing data deduplication product lineup overshadowed some midrange announcements surrounding the company's fully automated storage tiering (FAST) technology, suggesting just how big an opportunity the data storage giant sees in its dedupe and backup business.
As reported last month, Data Domain and Avamar deduplication products more than doubled their sales year-over-year. Frank Slootman, president of EMC's backup and recovery systems division (encompassing Data Domain, Avamar, NetWorker and Disk Library), headed up the press conference.
"Data Domain alone may well surpass $1 billion in sales this year, as the demand is huge," said Slootman.
That growth will be assisted by the latest upgrade to Data Domain. Known as Boost, it speeds up both backup and deduplication. What it does, essentially, is move the bulk of the work from the Data Domain target device back upstream to the backup media server.
"Boost is a software client that transfers the compute-intensive work of segmentation and fingerprinting onto the backup server," said Slootman. "This reduces the load on the backup server and frees up even more bandwidth for the Data Domain appliance."
He reported 20 to 40 percent drops in backup load due to fewer copy requests. Instead of sending all the data to the Data Domain target, the backup server determines what needs to be forwarded and then transmits only that info. That means 80 to 99 percent less backup LAN bandwidth and a 50 percent faster Data Domain box. Boost is currently available for Symantec NetBackup and Backup Exec. In the second half of the year, it will be integrated into EMC NetWorker.
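The flow Slootman describes (segment and fingerprint on the backup server, then forward only what the target lacks) can be sketched in a few lines. This is an illustrative toy under stated assumptions, not EMC's implementation: the fixed 8-byte segment size, the SHA-256 fingerprints, and the `Target`, `backup` and `restore` names are all hypothetical, and a real product may segment and index data quite differently.

```python
import hashlib

SEGMENT_SIZE = 8  # bytes; hypothetical and tiny so the example is easy to trace


def segment(data: bytes, size: int = SEGMENT_SIZE):
    """Split a backup stream into fixed-size segments (a simplification)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(seg: bytes) -> str:
    """Compute a content fingerprint for one segment."""
    return hashlib.sha256(seg).hexdigest()


class Target:
    """Stand-in for the dedupe appliance: keeps each unique segment once."""

    def __init__(self):
        self.store = {}

    def missing(self, fingerprints):
        """Report which fingerprints the target has not stored yet."""
        return {fp for fp in fingerprints if fp not in self.store}

    def put(self, fp, seg):
        self.store[fp] = seg


def backup(data: bytes, target: Target):
    """Fingerprint on the backup-server side and send only new segments.

    Returns (recipe, bytes_sent), where recipe is the ordered list of
    fingerprints needed to rebuild the stream.
    """
    segs = segment(data)
    recipe = [fingerprint(s) for s in segs]
    needed = target.missing(recipe)
    bytes_sent = 0
    for fp, seg in zip(recipe, segs):
        if fp in needed:
            target.put(fp, seg)
            needed.discard(fp)  # do not resend a duplicate within one stream
            bytes_sent += len(seg)
    return recipe, bytes_sent


def restore(recipe, target: Target) -> bytes:
    """Rebuild a stream from its recipe."""
    return b"".join(target.store[fp] for fp in recipe)
```

In this sketch, backing up `b"AAAAAAAABBBBBBBBAAAAAAAA"` to an empty target sends 16 of 24 bytes, and a later backup of `b"AAAAAAAABBBBBBBBCCCCCCCC"` sends only the 8 new bytes. That is the mechanism behind the LAN bandwidth savings claimed above, though the percentages in practice depend on how much of the data is repeated.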
Slootman gave some insight into the guts of the system. Traditional backup servers typically use protocols such as NFS or CIFS. However, these wire protocols were not developed for backup and they perform it inefficiently. Data Domain first began to bypass the old approach when Symantec (NASDAQ: SYMC) released its Open Storage Technology (OST).
"OST already claims 22 percent of the Data Domain customer base," said Slootman. "It dramatically speeds up performance and users love it."
Boost is a generalization of Data Domain's OST-related code so it can operate beyond the constraints of NetBackup, and it has added functionality that makes it faster and more efficient than the OST-only version. Used with a DD880 deduplication device, it raises throughput from 5.5 TB per hour to 8 TB per hour.
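To put the DD880 figures in perspective, the short calculation below works out the relative gain and what it would mean for a backup window. The 44 TB dataset is a hypothetical figure chosen only to make the arithmetic clean; the two throughput numbers are the ones quoted above.

```python
BEFORE_TB_PER_HOUR = 5.5  # DD880 throughput quoted without Boost
AFTER_TB_PER_HOUR = 8.0   # DD880 throughput quoted with Boost


def percent_increase(before: float, after: float) -> float:
    """Relative throughput gain, expressed as a percentage."""
    return (after - before) / before * 100


def hours_to_back_up(dataset_tb: float, tb_per_hour: float) -> float:
    """Time needed to move a dataset at a sustained throughput."""
    return dataset_tb / tb_per_hour


gain = percent_increase(BEFORE_TB_PER_HOUR, AFTER_TB_PER_HOUR)
print(f"Throughput gain: {gain:.1f}%")

dataset_tb = 44  # hypothetical dataset size
print(f"Before: {hours_to_back_up(dataset_tb, BEFORE_TB_PER_HOUR):.1f} hours")
print(f"After:  {hours_to_back_up(dataset_tb, AFTER_TB_PER_HOUR):.1f} hours")
```

On these numbers the gain is roughly 45 percent, and the hypothetical 44 TB backup shrinks from 8 hours to 5.5 hours, assuming the quoted rates are sustained for the whole job.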
In a few months, those installing EMC NetWorker will find Boost already included as a free addition. When they install the backup software, their Data Domain box will recognize it. Meanwhile, existing Data Domain customers can receive the software free of charge as a plug-in for NetBackup and Backup Exec as part of their maintenance contracts.
"EMC is serious about integration," said Slootman. "We have made it possible to replicate directly from the NetWorker console with no need to touch Data Domain."
Clariion, Celerra Get FAST, Management Tools
EMC's midrange products followed the Data Domain opening address, perhaps a sign of where EMC thinks the growth is, and it was a departure from the last few years when the company's Symmetrix, Celerra and Clariion disk arrays dominated the proceedings. There was no announcement of a merged Celerra-Clariion platform, as the company hinted on its earnings call last month, but EMC did have a few other announcements.
Rich Napolitano, president of EMC's Unified Storage Division, announced that fully automated storage tiering (FAST) version two will be shipping in July, with sub-LUN storage tiering for Clariion and Celerra using flash technology.
"Flash can also be used to set up a FAST cache rather than a storage tier," said Napolitano. "That means 2 TB of non-volatile cache in our midrange products."
He also briefly covered Unisphere, a product that brings consolidation to all the individual element managers. This means no more console jumping to manage different types of storage.
Finally, he mentioned a VMware vStorage plug-in for vCenter to simplify the management of Clariion, Symmetrix and Celerra in a virtual environment. For more on EMC's midrange announcements, see "EMC debuts Unisphere, FAST for Clariion, Celerra" at our sister site InfoStor.
EMC also announced Ionix Storage Configuration Advisor 2.0, which performs a variety of tasks including discovery, change tracking and automation.
"Visibility into the SAN has always been challenging and this is accentuated in virtualized environments," said Bob Laliberte, an analyst with Enterprise Strategy Group. "EMC Ionix provides end-to-end visibility, change management and policy control."
Clouds, VPlex and Dedupe: Customers React
After two days of announcements about private clouds, VPlex and dedupe, what do EMC customers think about all this?
Guy Chapman, a senior engineer for storage and virtual infrastructure at SunGard, buys in to all three of these EMC messages. He currently uses Data Domain with OST in a NetBackup environment, with 1,500 VMs distributed across eight sites.
"Data Domain worked out about one-fifth the cost of any other method of backup, whether based on tape or SATA disk," said Chapman. "I also use it for nearline storage of VMs that are no longer needed. This costs me about a tenth of the price of keeping them on primary storage."
While he is keen on the general concept of VPlex because of its real-time continuity benefits, he doesn't think that it will be implemented in his company for about three years because of ongoing data center consolidation efforts.
"Once we reach the point of having three global data centers serving all worldwide operations, I can see us using VPlex to replicate between each site to give us geographical independence," said Chapman. "The early adopters will probably be those putting in new arrays that want to avoid the pain of a lengthy data migration."
Andrew Fuss, manager of technology and engineering at Charter Care Health Partners, is in a similar camp. He likes VPlex for what it offers in the disaster recovery arena, but won't have enough wiggle room in the budget or project timeline for another two to four years. Meanwhile, he is already implementing a cloud model and sees it as playing an important role in healthcare operations.
"I will maintain my existing data center for legacy applications, but anything new will probably be based in the cloud," said Fuss.
Chapman is another who has begun building a private cloud.
"Just about everyone is on the journey to the private cloud," he said.
While EMC customers seem to be largely behind EMC's private cloud and VPlex vision, NetApp (NASDAQ: NTAP) claims that much of what EMC announced is already on the market via its FlexCache technology. Patrick Rogers, vice president of solutions and alliances marketing at NetApp, said FlexCache provides proven long-distance VM migration and is available on all NetApp systems. FlexCache also works for multi-vendor storage environments today, including EMC through the NetApp V-Series open system controller, he said.
"EMC's VPlex is proprietary and available for only a limited set of platforms with no guarantee of compatibility with additional platforms, training and tools," said Rogers. "VPlex Local and Metro functionalities closely resemble many existing NetApp capabilities such as NetApp Data Motion, as well as Long Distance VM Migration (LDVM) with NetApp FlexCache."
But Brian Gallagher, division president for EMC's Symmetrix and Virtualization Product Group, said VPlex spans the top tier to the midtier, whereas NetApp is midtier. As such, NetApp has two controllers, whereas a single VPlex cluster can scale up to eight.
"VPlex is Active-Active, whereas NetApp is Active-Passive," said Gallagher. "In addition, VPlex is far more granular."
StorageIO Group analyst Greg Schulz said he agrees with NetApp that the company offers virtualization tools with similar functionality, but he discounted the vendor lock-in argument.
"Whoever controls the virtualization tools controls the storage and IT gold," Schulz said, "so these arguments are really all about mindshare and perception. Vendor lock-in is only bad if it impedes the business."