EMC Reveals Dedupe, Tiering, Virtualization Plans

Cloud infrastructures and virtual data centers dominated the discussion at last week’s EMC Forum in Long Beach, Calif., but EMC (NYSE: EMC) also shed light on its plans for data deduplication and announced a timetable for its Fully Automated Storage Tiering (FAST) technology.

EMC’s biggest announcement at the event was its revelation that it will soon unveil a unified virtual data center solution with Cisco (NASDAQ: CSCO) and VMware (NYSE: VMW), which the company has dubbed a “stack in a rack” (see EMC, Cisco and VMware to Release Joint Data Center Product).

The promotional theme for this year’s EMC Forum was: Virtualize, Consolidate, Deduplicate, Manage, Protect and Comply. And the session tracks paralleled that: data storage tiering, deduplication, security, virtualization and storage management.

VMware, which is majority-owned by EMC, played a much bigger role than at last year’s event. This year, VMware and storage appear better integrated with EMC’s core products, and the forum reflected it, with sessions such as backup of virtual environments and using vSphere in a disaster recovery scenario.

EMC’s RSA Security business, however, remains a square peg in an otherwise well-rounded event. The MC and early keynote address attempted to lure attendees into the security track, but a lunchtime keynote about the importance of security in the “hyper-extended enterprise” didn’t seem to quite resonate with what was largely a data storage audience.

No such trouble for deduplication. Notably, dedupe was granted equal presence with VMware and RSA Security hot on the heels of the company’s Data Domain acquisition. Clearly, EMC sees a very bright future for the technology. Despite that, only about 10 percent of the close to 1,000 attendees appeared to be using deduplication. According to Rob Emsley, senior director of product marketing at EMC, that is in keeping with analyst estimates of adoption rates across the industry.

Source, Target and Disk Library Dedupe

Emsley explained the differences between the various dedupe products that now reside within EMC’s product portfolio: Avamar, Data Domain and Disk Library. Avamar can be classified as deduplication backup software and is typically prepackaged within EMC hardware as an appliance. It is also available as an optional plug-in for EMC NetWorker backup for added dedupe capability.

Data Domain, on the other hand, is purely a disk-based deduplication storage system without built-in backup. It works with EMC NetWorker and just about any other backup application. EMC Disk Library, based on dedupe technology from Quantum (NYSE: QTM), is more squarely focused on the SAN-based VTL market and contains a deduplication option for certain types of information. It functions with EMC NetWorker or other backup applications.

Avamar takes the approach of source-based deduplication, which eliminates duplicate data on the client side before sending backup data across the network. Data Domain is target-based — it takes out the duplicates at the backup device after data has been transmitted.
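To make the distinction concrete, here is a minimal Python sketch of the two approaches. It is an illustration only, not how Avamar or Data Domain are implemented: the fixed 4-byte chunks, the SHA-256 fingerprints, and the function names are all assumptions chosen to keep the example small, and the sketch ignores the cost of sending fingerprints to the backup target.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, for illustration only (assumption)


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """Identify a chunk by its SHA-256 digest."""
    return hashlib.sha256(chunk).hexdigest()


def source_side_backup(data: bytes, store: dict) -> int:
    """Client fingerprints each chunk and sends only chunks the store lacks.

    Returns the number of payload bytes that cross the network.
    """
    sent = 0
    for chunk in chunks(data):
        key = fingerprint(chunk)
        if key not in store:
            store[key] = chunk
            sent += len(chunk)
    return sent


def target_side_backup(data: bytes, store: dict) -> int:
    """Client sends everything; the backup device drops duplicates on arrival.

    Returns the number of payload bytes that cross the network.
    """
    sent = len(data)
    for chunk in chunks(data):
        store.setdefault(fingerprint(chunk), chunk)
    return sent


if __name__ == "__main__":
    backup = b"AAAABBBBAAAACCCCAAAA"  # 5 chunks, 3 of them unique
    src_store, tgt_store = {}, {}
    print(source_side_backup(backup, src_store))  # 12 bytes sent
    print(target_side_backup(backup, tgt_store))  # 20 bytes sent
    print(len(src_store), len(tgt_store))         # 3 3
```

Both models end up storing the same three unique chunks; the difference in this toy case is only how many bytes travel over the network before the duplicates are removed.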

“You have to determine the right solution based on your current challenges, applications, data types and SLAs,” said Emsley.

He said that if you are bandwidth-constrained, source-based is best as you transmit far less data across the network. He recommended source-based for virtualized environments due to the way virtual systems tend to result in much higher CPU, memory and disk utilization rates, the result of which is far more potential for contention from backup processes. With Avamar, nothing is transmitted across virtual resources that hasn’t been deduplicated.

Target-based, said Emsley, is good when bandwidth isn’t an issue. It can also work with any backup software and protocol, so it doesn’t require a change of backup platform.

“Source-based only moves new data and tends to provide the highest deduplication rates,” said Emsley. “Target is good if you are committed to your current backup software, don’t have any network bandwidth issues, or if you are running an environment that is experiencing a large amount of changes.”

What about EMC Disk Library? Emsley indicated that Disk Library LAN-based products were not a priority for EMC going forward. Disk Library is used mainly by enterprise users with large-scale VTL requirements — up to 700 TB in some cases, often from multiple tape formats.

“We have more of a focus on Avamar and Data Domain, but will continue to fully support Disk Library,” said Emsley.

He seemed to be hinting that Disk Library 1500 and 3000 might not be long for this world, or at least won’t be given much emphasis, while the larger Disk Library 4000 will probably maintain its place in the EMC product portfolio because it’s such a good fit for larger customers. Further, the company is rolling out management tools that enable Avamar, Data Domain and Disk Library to be managed via NetWorker from a single pane of glass.

Information Infrastructure Comes into Focus

In lock step with the messaging at EMC World in May, the “information infrastructure” tag line was again in heavy usage at the EMC Forum. This time, though, the real meaning came more sharply into focus.

EMC sees itself as the backend of choice for the next-generation virtual data center. Linda Connly, EMC’s chief of staff and sales strategy, trotted out the math: 72 percent of current IT time is spent on maintaining applications and the underlying infrastructure. That leaves about 23 percent for application development and implementation and 5 percent for infrastructure deployment — effectively a 70/30 ratio between plumbing and new technology build-out.

“Our vision is to reverse this ratio and have 70 percent of the time for infrastructure/application deployment by reducing the cost of keeping the lights on,” said Connly. “Our alliance with Cisco and VMware is a means of getting rid of all that plumbing by building out the private cloud.”

Far from talking the talk and leaving users to iron out the kinks, EMC is rolling this out internally. Connly said that EMC is already about 50 percent virtualized in terms of its own applications. All of its Web sites, for example, are virtualized. And as the fifth largest global deployment of Oracle (NASDAQ: ORCL), it plans to virtualize that application next along with Microsoft (NASDAQ: MSFT) Exchange.

Eventually, EMC plans to be 100 percent virtualized, which will include the integration of external SaaS and cloud-based applications such as Salesforce.com (NYSE: CRM) and AmEx Travel into a virtual data center composed of a federated cloud, with clouds from external service providers and internal clouds within EMC melded together into a fully controlled and managed private cloud.

“It will be possible to move workloads from external to internal clouds while staying secure,” said Connly.

She stressed that this was much more than just “marketecture,” or so much marketing verbiage. All members of the VMware-Cisco-EMC (VCE) alliance, she said, had aligned their respective visions strategically around achieving this goal. Hence, the integrated “stack in a rack,” joint support functions and more to come.

“Our big bets for the future are on the virtual data center, deduplication, cloud computing and virtualized clients,” said Connly.

FAST Coming in December

Finally, a note on Fully Automated Storage Tiering (FAST): Lou Przystas, a senior advisory consultant at EMC’s corporate executive briefing center, revealed that the much-touted FAST will have its first release in December. This will provide LUN-level movement of data between tiers composed of different drive technologies (SATA/SAS, SSD, Fibre Channel). While this is a slight improvement over what current Symmetrix users can do in terms of automated data movement, the full-fledged FAST with sub-LUN capability isn’t scheduled to come out until next year.

“FAST is like an intelligent policy engine that looks at applications, data usage patterns and moves data among storage tiers automatically based upon SLAs,” said Przystas. “By May of next year, we will release sub-LUN level FAST. Eventually, this will make its way from Symmetrix V-Max into other storage platforms.”
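As a rough illustration of what LUN-level automated tiering means, the Python sketch below moves whole LUNs between tiers based on recent activity. It is not EMC's policy engine: the `Lun` class, the IOPS metric, and the `hot`/`warm` thresholds are hypothetical stand-ins for the application, usage-pattern and SLA inputs Przystas described.

```python
from dataclasses import dataclass


@dataclass
class Lun:
    name: str
    tier: str  # one of "SSD", "Fibre Channel", "SATA"
    iops: int  # recent average I/O per second (hypothetical metric)


def target_tier(iops: int, hot: int = 5000, warm: int = 500) -> str:
    """Pick a tier from activity alone; the thresholds are made-up assumptions."""
    if iops >= hot:
        return "SSD"
    if iops >= warm:
        return "Fibre Channel"
    return "SATA"


def plan_moves(luns):
    """Return (LUN name, current tier, wanted tier) for each misplaced LUN.

    The whole LUN moves as a unit, which is the LUN-level granularity
    described for the first release; sub-LUN tiering would instead
    relocate individual extents inside a LUN.
    """
    moves = []
    for lun in luns:
        wanted = target_tier(lun.iops)
        if wanted != lun.tier:
            moves.append((lun.name, lun.tier, wanted))
    return moves


if __name__ == "__main__":
    luns = [
        Lun("db", "SATA", 9000),          # busy LUN on slow disk: promote
        Lun("archive", "SSD", 10),        # idle LUN on flash: demote
        Lun("mail", "Fibre Channel", 800) # already on a suitable tier
    ]
    for move in plan_moves(luns):
        print(move)
```

The limitation of the LUN-level approach shows up in the `db` example: if only a small part of that LUN is busy, the entire LUN still has to be promoted to SSD, which is the inefficiency sub-LUN tiering is meant to address.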

Drew Robb
Drew Robb is a contributing writer for Datamation, Enterprise Storage Forum, eSecurity Planet, Channel Insider, and eWeek. He has been reporting on all areas of IT for more than 25 years. He has a degree from the University of Strathclyde UK (USUK), and lives in the Tampa Bay area of Florida.
