Beyond Block and File
Blocks are fundamental, blocks are simple, and blocks haven't changed much since the first disk drive brought storage technology into existence some fifty years ago. The storage industry moves at a whirlwind pace, but underneath all of the virtualization and abstraction, we still work in blocks, writing, reading and keeping track of blocks of data on storage devices across the network.
Block storage has limitations that have become more apparent as demand for scalability and security has grown. With low-level blocks as currency, storage designs require intelligence higher in the stack to manage metadata and to reassemble blocks into the file or record abstractions that applications understand. Depending on where that intelligence resides, typical architectures let you choose either scalability (SAN) or sharable data (NAS), but not both.
Object-based storage pushes metadata management to the very bottom of the stack, to the storage device itself. With object-based storage devices, files and records are no longer abstractions, but actual storage objects that are understood, managed and secured at the device level.
The technology has been around for several years now, but commercial deployments remain limited. It may take a long time for a change to take hold that is as fundamental as a move from blocks to objects.
Still, awareness of object-based storage, and of its promise, is growing as more and more organizations begin to bump up against the scalability limits of NAS. "There's increasing interest in the notion [of object-based storage] for a variety of reasons," says David Freund, an analyst at Illuminata. "Not the least of which is trying to get a scalable network file service that is actually sharable, and more easily managed."
The Case for Object-Based Storage
In 2004, ANSI ratified version 1.0 of the Object-based Storage Device (OSD) specification, defining a protocol for communication with object-based storage devices. The OSD specification describes a SCSI command set that provides a high-level interface to OSD devices, allowing clients, such as file systems and databases, to store and retrieve data objects. SNIA's technical working group is currently developing version 2 of the OSD specification, and it's expected later this year.
An OSD device deals in objects. It handles the mapping from object to physical media locations itself. The device also tracks metadata as attributes, such as creation timestamps, allowing for easier sharing of data among clients.
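To make the contrast with block devices concrete, here is a minimal Python sketch of a toy object device. The class and method names (ToyObjectDevice, create, read, get_attributes) are hypothetical and are not taken from the OSD command set. The sketch only illustrates the division of labor described above, in which the device, rather than the client, owns both the object-to-block mapping and the per-object attributes.

```python
import time


class ToyObjectDevice:
    """Toy model of an object-based device (hypothetical, not the OSD command set)."""

    BLOCK_SIZE = 4  # deliberately tiny so the mapping is easy to inspect

    def __init__(self):
        self._blocks = []      # simulated physical media: a list of fixed-size blocks
        self._block_map = {}   # object id -> block indexes, kept by the device
        self._attributes = {}  # object id -> metadata attributes, kept by the device

    def create(self, object_id, data):
        """Store an object; the device decides where its blocks go."""
        indexes = []
        for start in range(0, len(data), self.BLOCK_SIZE):
            self._blocks.append(data[start:start + self.BLOCK_SIZE])
            indexes.append(len(self._blocks) - 1)
        self._block_map[object_id] = indexes
        self._attributes[object_id] = {"length": len(data), "created": time.time()}

    def read(self, object_id):
        """Reassemble the object from its blocks; the client never sees block numbers."""
        return b"".join(self._blocks[i] for i in self._block_map[object_id])

    def get_attributes(self, object_id):
        """Return a copy of the attributes the device tracks for this object."""
        return dict(self._attributes[object_id])


device = ToyObjectDevice()
device.create("report", b"hello object storage")
payload = device.read("report")
attrs = device.get_attributes("report")
```

A client of this toy device asks for "report" by name and gets the bytes and attributes back. It never learns, or needs to track, which blocks hold the data.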
The biggest selling point for OSD is that it combines the scalability of SAN with the data sharing of NAS. First-generation NAS architectures don't scale well, because all metadata processing has to be handled on the NAS server. Scaling up with more storage behind the NAS head is restricted because metadata processing on the NAS device becomes a bottleneck. Scaling by adding additional NAS devices quickly becomes a management headache because data is isolated on individual "NAS islands."
"That's why NetApp abandoned the idea of a bigger, scale-up, Intel-processor based NAS head, and acquired Spinnaker," says Freund, discussing the 2003 acquisition that brought NetApp scalable NAS technology based on a distributed file system. That acquisition led to the development of NetApp's own Data ONTAP GX operating system, rolled out just last week.
Distributed NAS systems are an alternative to object storage, but these may have high-end scaling limitations of their own. Says Freund, "Even if you use a distributed file system, the lock traffic starts to build up, so that scaling to large numbers of machines becomes a problem."
What OSD adds to the mix is the ability to connect clients and OSD devices directly, without an intermediary to handle metadata. Panasas, the first to deliver a commercial OSD product, combines a parallel file system and object-based storage. In the company's DirectFLOW design, clients get object locations and security clearance from out-of-band director blades. All data traffic runs directly from OSD storage blades to clients. Although commercial OSD products remain sparse, the technology continues to make progress. Seagate and IBM have demonstrated OSD products. HP has incorporated the open-source Lustre file system, with OSD as a key component, as part of its StorageWorks Scalable File Share.
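The out-of-band arrangement can be sketched in a few lines of Python. This is an illustration of the general pattern only, not Panasas's implementation: the Director and StorageBlade classes, the read_file helper and the file and blade names are all hypothetical. The metadata service answers the question "where is it?" and the client then fetches the data from the storage blades itself.

```python
class Director:
    """Out-of-band metadata service: hands out locations but never carries data."""

    def __init__(self):
        self._locations = {}  # file name -> list of (blade name, object id)

    def register(self, name, placements):
        self._locations[name] = list(placements)

    def locate(self, name):
        return list(self._locations[name])


class StorageBlade:
    """Holds objects and serves them straight to clients."""

    def __init__(self):
        self._objects = {}
        self.reads_served = 0

    def put(self, object_id, data):
        self._objects[object_id] = data

    def get(self, object_id):
        self.reads_served += 1
        return self._objects[object_id]


def read_file(name, director, blades):
    """One metadata lookup at the director, then direct reads from each blade."""
    parts = []
    for blade_name, object_id in director.locate(name):
        parts.append(blades[blade_name].get(object_id))
    return b"".join(parts)


blades = {"blade-a": StorageBlade(), "blade-b": StorageBlade()}
blades["blade-a"].put(1, b"striped ")
blades["blade-b"].put(2, b"across blades")

director = Director()
director.register("results.dat", [("blade-a", 1), ("blade-b", 2)])

data = read_file("results.dat", director, blades)
```

The sketch reads the blades one after another for simplicity. A parallel file system would presumably issue those reads concurrently, which is where the scaling benefit comes from, since the director stays out of the data path either way.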
Also significant for the future of object storage, customers have become better acquainted with scalable alternatives, OSD among them. "When we came into the market a couple of years ago, typically what we were competing against was an NFS server or multiple NFS servers," says Larry Jones, VP of marketing for Panasas. "And the customers were thinking, well, do I buy a couple more NFS servers, or do I look at changing how I do this?"
Now, says Jones, customers are aware of other options. "That's really changed, probably over the last year or so, to 'oh, no, having five NFS servers, that's not going to fly,'" he says. Instead, customers are evaluating Panasas and scalable performance competitors like Isilon, or NetApp's distributed NAS.
Making devices responsible for the objects they store makes sense not only from a scalability perspective, but from a security perspective as well. OSD allows direct client-device connections for performance, while making it possible to authorize I/O between clients and storage devices on a request-by-request basis.
"The object storage device is not just going to hand out objects to anybody who asks," says Jones. Instead, clients negotiate capabilities with the offline managers and use these capabilities as secure tokens when communicating with storage devices. "It allows us to run over what is otherwise a pretty open network (in our case it's Ethernet), and still have a level of security," says Jones.
The HPC Frontier
Panasas, Isilon and Lustre all compete in the rarefied air of high-performance computing (HPC) clusters, where OSD has gained the most traction. To date, OSD has been deployed most successfully as a key technology in the scalable storage that supports those clusters.
The HPC trend away from vector supercomputers to clusters of x86 Linux machines has created a growing need for storage systems that can serve large numbers of clients. "That transition has really driven a requirement for storage systems that can keep up," says Jones.
What of the transition of more interest to storage pros: the shift of OSD from HPC to mainstream enterprise storage? That one has yet to happen, and when and if it will occur is anybody's guess. Jones sees precedent in the movement of earlier technologies from technical environments to the business mainstream, including Sun's NFS.
He also sees a scenario in which storage clusters can migrate from a company's limited HPC application (say, chip simulation) to a widening circle of supporting groups, such as design engineers. "That's really how it spreads," he says. "You start from that core HPC space, and then build up through the pipeline to other pieces of the process required to support the final HPC element."
Objects in the Mainstream
Whether OSD builds into the enterprise from HPC, or gets in through some other door, it faces a tough path to acceptance. Making the most of object storage ultimately means making changes at all levels of the storage stack, from storage devices and storage networks to file systems and databases.
"Do people want objects?" asks Freund. "No, not today. They're out to solve the base problems that object-based systems, and others, can address." For the moment, that mostly means handling the scalability vs. share-ability tradeoff.
There are greater challenges ahead. "Where object-based storage is going to get more interesting," says Freund, "is when you start to worry about things like enforcing security, consistently, across an entire infrastructure, where you don't even have control over everything."
According to Freund, such problems could be solved by associating security attributes directly with data objects, attributes that travel with the data regardless of where it is stored. That's something that object storage can address well, and something that block storage cannot.
Object storage can be used in this way with applications none the wiser. Applications can make file-oriented system calls as usual, unaware that most of the work is being done below the file system, on the device. But it's possible to imagine an even more radical future, where applications work with stored data objects directly, rather than writing data to files.
It's possible, and it could mean finer-grained data sharing between applications. It would require a change to the application API contract, a task beyond the capability of most companies or even consortiums. "There are very few entities on the planet that could pull such a feat off," says Freund. But, he adds, "Microsoft is one of them."
Object storage clearly has great potential, and it could someday provide secure, easily shared data, on highly scalable storage, to applications across the enterprise. But there are a lot of standards to hammer out, and a lot of pain thresholds that need to be crossed in the market, before objects replace blocks.
Says Freund: "We've known how to solve this problem for a couple of decades. But we can't get out of our own way. There's tremendous inertia."