Cloud infrastructures and virtual data centers dominated the discussion at last week's EMC Forum in Long Beach, Calif., but EMC (NYSE: EMC) also shed light on its plans for data deduplication and announced a timetable for its Fully Automated Storage Tiering (FAST) technology.
EMC's biggest announcement at the event was its revelation that it will soon unveil a unified virtual data center solution with Cisco (NASDAQ: CSCO) and VMware (NYSE: VMW), which the company has dubbed a "stack in a rack" (see EMC, Cisco and VMware to Release Joint Data Center Product).
The promotional theme for this year's EMC Forum was: Virtualize, Consolidate, Deduplicate, Manage, Protect and Comply. And the session tracks paralleled that: data storage tiering, deduplication, security, virtualization and storage management.
VMware, which is majority-owned by EMC, played a much bigger role than at last year's event. This year, VMware and storage appear better integrated with EMC's core products, and the forum reflected it, with sessions such as backup of virtual environments and using vSphere in a disaster recovery scenario.
EMC's RSA Security business, however, remains a square peg in an otherwise well-rounded event. The MC and early keynote address attempted to lure attendees into the security track, but a lunchtime keynote about the importance of security in the "hyper-extended enterprise" didn't seem to quite resonate with what was largely a data storage audience.
No such trouble for deduplication. Notably, dedupe was granted equal presence with VMware and RSA Security hot on the heels of the company's Data Domain acquisition. Clearly, EMC sees a very bright future for the technology. Despite that, only about 10 percent of the close to 1,000 attendees appeared to be using deduplication. According to Rob Emsley, senior director of product marketing at EMC, that is in keeping with analyst estimates of adoption rates across the industry.
Source, Target and Disk Library Dedupe
Emsley explained the differences between the various dedupe products that now reside within EMC's product portfolio: Avamar, Data Domain and Disk Library. Avamar can be classified as deduplication backup software and is typically prepackaged within EMC hardware as an appliance. It is also available as an optional plug-in for EMC NetWorker backup for added dedupe capability.
Data Domain, on the other hand, is purely a disk-based deduplication storage system without built-in backup. It works with EMC NetWorker and just about any other backup application. EMC Disk Library, based on dedupe technology from Quantum (NYSE: QTM), is more squarely focused on the SAN-based VTL market and contains a deduplication option for certain types of information. It functions with EMC NetWorker or other backup applications.
Avamar takes the approach of source-based deduplication, which eliminates duplicate data on the client side before sending backup data across the network. Data Domain is target-based: it removes the duplicates at the backup device after the data has been transmitted.
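The difference between the two approaches can be illustrated with a toy sketch. It is not how Avamar or Data Domain is actually implemented; the fixed 4-byte chunks, the SHA-256 fingerprints and the function names are all assumptions made for illustration, and the small cost of sending the fingerprints themselves is ignored. It only shows why source-based dedupe puts fewer bytes on the network while both approaches end up storing the same unique chunks.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny fixed-size chunks, chosen only to keep the demo readable


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks (the last chunk may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """Identify a chunk by its SHA-256 digest."""
    return hashlib.sha256(chunk).hexdigest()


def source_side_backup(data: bytes, store: dict[str, bytes]) -> int:
    """Client fingerprints each chunk and sends only chunks the store lacks.

    Returns the number of chunk bytes that crossed the network.
    """
    sent = 0
    for chunk in split_chunks(data):
        key = fingerprint(chunk)
        if key not in store:      # duplicate chunks never leave the client
            store[key] = chunk
            sent += len(chunk)
    return sent


def target_side_backup(data: bytes, store: dict[str, bytes]) -> int:
    """Client sends everything; the backup device removes duplicates on arrival.

    Returns the number of bytes that crossed the network.
    """
    sent = len(data)              # the full stream is transmitted first
    for chunk in split_chunks(data):
        store.setdefault(fingerprint(chunk), chunk)
    return sent


if __name__ == "__main__":
    backup = b"AAAABBBBAAAACCCC"  # the chunk "AAAA" appears twice
    source_store: dict[str, bytes] = {}
    target_store: dict[str, bytes] = {}
    print("source-side bytes sent:", source_side_backup(backup, source_store))
    print("target-side bytes sent:", target_side_backup(backup, target_store))
    print("unique chunks stored:", len(source_store), len(target_store))
```

Running the same backup a second time makes the contrast sharper: the source-side function sends nothing, while the target-side function sends the full stream again and discards it at the device.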
"You have to determine the right solution based on your current challenges, applications, data types and SLAs," said Emsley.
He said that if you are bandwidth-constrained, source-based is best because you transmit far less data across the network. He also recommended source-based for virtualized environments, because virtual systems tend to run at much higher CPU, memory and disk utilization rates, which creates far more potential for contention from backup processes. With Avamar, nothing is transmitted across virtual resources that hasn't been deduplicated.
Target-based, said Emsley, is good when bandwidth isn't an issue. It can also work with any backup software and protocol, so it doesn't require a change of backup platform.
"Source-based only moves new data and tends to provide the highest deduplication rates," said Emsley. "Target is good if you are committed to your current backup software, don't have any network bandwidth issues, or if you are running an environment that is experiencing a large amount of changes."
What about EMC Disk Library? Emsley indicated that Disk Library LAN-based products were not a priority for EMC going forward. Disk Library is used mainly by enterprise users with large-scale VTL requirements (up to 700 TB in some cases), often spanning multiple tape formats.
He seemed to be hinting that Disk Library 1500 and 3000 might not be long for this world, or at least won't be given much emphasis, while the larger Disk Library 4000 will probably maintain its place in the EMC product portfolio because it's such a good fit for larger customers. Further, the company is rolling out management tools that enable Avamar, Data Domain and Disk Library to be managed via NetWorker from a single pane of glass.