Ocarina Wants to De-Dupe Primary Data


Startup Ocarina Networks is pushing de-duplication for primary storage, saying the technology’s benefits shouldn’t be limited to back-end, offline storage.

Not everyone agrees that de-duplication — that is, the process of scouring data archives to remove redundant information and shrink files — is ready for a larger role in critical online storage. To date, most de-duplication solutions have focused on the realm of data backup.

But two-year-old Ocarina Networks contends that “de-dupe” can help cut costs and space requirements in online, primary data environments. It believes security and file retrieval speeds won’t be concerns when good policy management is put in place.

Ocarina’s rivals feature technology that typically sits between users and file servers, or on the server itself. Ocarina, however, takes a different route with its Optimizer appliance.

“We’re not in the write path at all — when customers write files, they go straight to the disk,” said Carter George, Ocarina’s vice president of products. “We come along later and optimize them. Because we post-process files after they’ve been written, we don’t affect write performance at all.”

That may prove a critical factor given that interest in de-dupe for primary storage is growing, albeit cautiously. Many industry observers still seem guarded about the value of the technology, given the delicacy with which primary data storage must be handled. Activities like de-dupe can work in the online storage environment only if they don’t cause data loss or corruption, hinder retrieval performance, or create security issues.

Yet several analysts acknowledge that de-duplication could prove useful in certain environments.

Dave Russell, a Gartner research vice president, said data reduction technologies, like those from Ocarina, Data Domain, Diligent (being acquired by IBM), EMC, FalconStor, NetApp, Sepaton, Storewize and Quantum, “are transformational, as they significantly reduce the capacity requirements for storage.”

“That, in turn, leads to potentially significant cost and floor space reduction, as well as an improved overall quality of service, as more data can be stored on disk for longer periods of time,” said Russell.

Others are even more sanguine about the technology’s promise.

Enterprise Strategy Group research analyst Heidi Biggar said such tools “aren’t just ‘nice-to-haves,’ they are becoming ‘must haves.’”

“The explosion of digital content is forcing organizations to find ways to optimize primary disk capacity,” Biggar said in a statement. “Technology [that] compresses and de-duplicates data at the information level, are taking center stage because of the immediate cost-savings they can enable.”

Ocarina’s approach to avoiding data-loss and retrieval performance problems centers on how its Ocarina Optimizer hardware appliance functions. The product reads, consolidates and writes files back to storage, cutting down the size of already-compressed data and optimizing even large, media-rich files. As a result, the company claims its system can help enterprises store 10 times more data on current storage systems.
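The read, consolidate and write-back cycle described above can be illustrated with a minimal sketch of chunk-level de-duplication. This is not Ocarina's algorithm, which the article does not detail. The fixed 4 KB chunk size, the SHA-256 fingerprint and the function names `dedupe`, `restore` and `stored_bytes` are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen for illustration


def dedupe(files, chunk_size=CHUNK_SIZE):
    """Post-process already-written files into unique chunks plus recipes.

    files: mapping of file name -> bytes.
    Returns (store, recipes): store maps a chunk fingerprint to the chunk,
    recipes maps each file name to its ordered list of fingerprints.
    """
    store = {}
    recipes = {}
    for name, data in files.items():
        recipe = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # keep one copy of each unique chunk
            recipe.append(digest)
        recipes[name] = recipe
    return store, recipes


def restore(name, store, recipes):
    """Rebuild a file's original bytes from its recipe."""
    return b"".join(store[digest] for digest in recipes[name])


def stored_bytes(store):
    """Total bytes kept on disk after de-duplication."""
    return sum(len(chunk) for chunk in store.values())
```

Because the sketch runs over files that have already been written, it stays out of the write path in the way George describes. A real product would also need to handle variable-size chunks, metadata and crash safety, none of which is shown here.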

“No one’s really done online, primary de-dupe because it’s hard to do,” George said. “Performance requirements are much more stringent [in online data] and almost every file is compressed by its application during the save process.”

For example, he noted that Microsoft Office 2007 documents are automatically compressed once a file is closed. That means the file can’t be compressed again by traditional means to save further space, which is where de-dupe technology comes into play.
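The point about already-compressed files is easy to check with a general-purpose compressor. The sketch below uses Python's standard `zlib` module as a stand-in for "traditional means"; the sample text, the word list and the helper name `compression_passes` are assumptions for illustration and have nothing to do with Ocarina's product.

```python
import random
import zlib


def compression_passes(data, passes=2):
    """Compress data repeatedly and return the size after each pass.

    The first element is the original size; each later element is the
    size after one more round of zlib compression.
    """
    sizes = [len(data)]
    for _ in range(passes):
        data = zlib.compress(data, 9)
        sizes.append(len(data))
    return sizes


# Build a sample document from a small, hypothetical vocabulary.
rng = random.Random(0)
words = [b"storage", b"backup", b"primary", b"archive",
         b"file", b"server", b"policy", b"data"]
sample = b" ".join(rng.choice(words) for _ in range(20000))

original, first, second = compression_passes(sample)
print(original, first, second)
```

The first pass shrinks the text substantially, while the second pass has almost nothing left to remove because the compressed output looks close to random. Any further savings have to come from a different approach, such as finding redundancy across files.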

Additionally, Ocarina takes steps to ensure that data that’s likely to be needed again soon remains readily available to users.

“We shrink [a] compressed file using policies, so that older files are ‘de-duped’ while data that needs to be accessible isn’t shrunken until it’s a certain age,” George said.
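An age-based policy of the kind George describes might look like the following sketch. The article does not say how Ocarina measures a file's age or what threshold it uses, so the last-modified timestamp, the 30-day default and the function name `select_for_optimization` are all assumptions.

```python
from datetime import datetime, timedelta


def select_for_optimization(files, now, min_age=timedelta(days=30)):
    """Return the names of files old enough to be optimized.

    files: mapping of file name -> last-modified datetime (an assumed
    measure of age). Files younger than min_age are left untouched so
    they stay quick to retrieve.
    """
    return sorted(
        name for name, modified in files.items()
        if now - modified >= min_age
    )


# Example run with hypothetical file names and dates.
now = datetime(2008, 6, 1)
files = {
    "q1-report.doc": datetime(2008, 1, 15),
    "draft.doc": datetime(2008, 5, 28),
}
print(select_for_optimization(files, now))
```

Only `q1-report.doc` is selected here, since `draft.doc` was modified four days before the reference date and falls under the 30-day threshold.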

George described primary-storage de-dupe as “complementary” to de-duplication’s traditional role in the back end, and said beta customers of Ocarina’s technology are seeing big efficiency gains, especially businesses in social networking and digital photo environments.

“These enterprises are dealing with many, many petabytes of data and the ability to compress those to save storage space is saving them money and storage space,” he said.

Not every type of primary storage environment is ripe for de-dupe, however. Rich media, e-mail and workflow files may be a good fit for the technology, while environments handling heavy database files aren’t, experts said.

“The sheer volume of media files, the amount of data they create, presents an interesting place to play with online storage and could prove attractive to customers,” said Charles King, principal analyst at Pund-IT.

King also views the consumer environment as a potential target for de-duplication, as it “doesn’t have the same kind of regulatory requirements” enterprises often face with data files.

Ocarina’s George said the technology may have particular appeal for the oil and gas industry, given the hefty seismic graphic files it has to store. He acknowledged that large financial institutions and transaction-based environments would find de-dupe less suitable.

“Databases are tricky,” he said. “We could shrink the files, but due to the constant churn in data changes, they shrink and expand, shrink and expand — it doesn’t work as well.”

Article courtesy of InternetNews.com

Judy Mottl
Judy Mottl is an experienced technology journalist who has served as a senior editor, reporter, writer, and blogger for InformationWeek, Investors Business Daily, CNET, and Information Security Magazine, as well as other media outlets.
