There is keen interest in the object storage vs. block storage debate. Object storage is in the limelight, thanks to the spectacular growth of cloud computing along with the advent of object-based storage solutions from vendors. Block storage, meanwhile, remains an enterprise mainstay that continues to serve well.
Here's what IT pros should know about object and block storage, and how they fit into today's data storage environments.
What is object storage?
Object storage, or object-based storage, gets its name from the way it packages data and metadata into objects. Metadata is essentially data that describes other data; in object-based storage, it is information about the files that are typically stored within an object.
Objects are stored in a flat structure or address space. Each object is assigned an object ID or unique identifier, enabling it to be retrieved from a single repository or pool of storage. Enterprises value this approach because it offers greater flexibility over where they can place their data than block- and file-based storage solutions do.
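To make the flat address space concrete, here is a minimal in-memory sketch in Python. The `ObjectStore` and `StoredObject` names and the `put`/`get` methods are hypothetical, chosen for illustration, and do not represent any vendor's API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object bundles the payload with the metadata that describes it."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Hypothetical flat object store: one pool, no directory hierarchy."""

    def __init__(self):
        self._pool = {}  # object ID -> StoredObject

    def put(self, data, metadata=None):
        object_id = uuid.uuid4().hex  # unique identifier assigned on write
        self._pool[object_id] = StoredObject(data, dict(metadata or {}))
        return object_id

    def get(self, object_id):
        return self._pool[object_id]


store = ObjectStore()
oid = store.put(b"quarterly report", {"content-type": "text/plain"})
obj = store.get(oid)
```

The only thing a caller needs to retrieve the object is its ID; there is no path or folder to navigate.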
The popularity of cloud object storage products from Amazon Web Services (AWS) and other providers has helped heighten object storage's profile in recent years.
What is block storage?
Block storage is synonymous with storage area networks (SANs) and enables storage services that aren't possible with file storage technologies used in network-attached storage (NAS) systems. Block storage involves saving data in blocks, or raw storage volumes.
Each of these storage blocks can appear as an individual hard drive to an external server operating system. An operating system, in turn, uses the Fibre Channel (FC), Fibre Channel over Ethernet (FCoE) or iSCSI protocols to access these blocks.
Block storage, and therefore SANs, are popular in enterprise IT environments because of their flexibility and performance characteristics. Block storage supports a variety of workloads that require low-latency, network-based storage operations, including business-critical applications, virtual machines, RAID implementations and databases.
Block storage is not to be confused with file storage systems, the type that enables organizations to offer their employees shared file services over a network using a NAS. Still, a file system can be layered atop block storage, since block storage appears as raw storage to server operating systems.
In the cloud, block storage is available from services like AWS Elastic Block Store, or AWS EBS, which provides scalable block storage that can be used by Elastic Compute Cloud (EC2) instances.
Object and block storage use cases
Here are the ways object storage and block storage are used in the data center:
Object storage use cases
- Cloud storage
- Unstructured data storage (documents, images, video, etc.)
- Big data storage
- Backup and recovery
- Archival storage
- Big data analytics
Block storage use cases
- Business applications
- Virtual machines
- Boot from networked storage
How block storage and object storage differ
One of the biggest differences between object and block storage is how they handle metadata.
Object storage, as mentioned earlier, includes both data and metadata. This metadata can be customized to include additional attributes that support functionality like search or advanced storage management and analytics. Object storage can be very rich in metadata, in fact.
It's another reason that enterprises are increasingly turning to object-based storage solutions. Organizations can add their own custom information to object storage metadata, lending more business context and relevance to the underlying data.
This contrasts with file storage, for instance, which typically contains metadata concerning a file's basic attributes, like its name, file type and creation date. An object's metadata can describe the applications it is tied to, among many other characteristics.
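As a rough illustration of how custom metadata can support search, the Python sketch below filters a small catalog of objects by business attributes. The object IDs, metadata keys and the `find_objects` helper are all made up for this example.

```python
# Each entry pairs an object ID with its metadata; payloads are omitted.
# All IDs, keys and values here are invented for illustration.
catalog = {
    "a1f3": {"content-type": "image/png", "department": "marketing", "retention-years": 3},
    "b7c9": {"content-type": "video/mp4", "department": "training", "retention-years": 7},
    "c2d8": {"content-type": "image/png", "department": "training", "retention-years": 7},
}


def find_objects(catalog, criteria):
    """Return the sorted IDs of objects whose metadata matches every criterion."""
    return sorted(
        object_id
        for object_id, metadata in catalog.items()
        if all(metadata.get(key) == value for key, value in criteria.items())
    )


training_objects = find_objects(catalog, {"department": "training"})
training_images = find_objects(
    catalog, {"department": "training", "content-type": "image/png"}
)
```

A query like "all training material" is answered from the metadata alone, without opening a single payload.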
Compared to block storage, object-based storage is practically swimming in metadata.
In block storage, server operating systems directly access the data blocks needed to complete read and write actions using their unique addresses. Because the blocks carry no native metadata, the operating system or application in use is responsible for tracking and managing them.
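The division of labor described above can be sketched in a few lines of Python. The `BlockVolume` class, the 512-byte block size and the `file_table` dictionary are assumptions made for illustration; a real volume is reached over FC, FCoE or iSCSI rather than through a Python object.

```python
BLOCK_SIZE = 512  # bytes per block; an assumed, illustrative size


class BlockVolume:
    """Hypothetical raw volume: fixed-size blocks addressed by number."""

    def __init__(self, num_blocks):
        self._blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, address, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data exceeds block size")
        # Pad short writes so every block stays exactly BLOCK_SIZE bytes.
        self._blocks[address] = data.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, address):
        return self._blocks[address]


volume = BlockVolume(num_blocks=8)

# The volume stores no names or attributes. The caller, standing in for a
# file system, must remember which blocks belong to which file.
file_table = {"notes.txt": [2, 5]}
volume.write_block(2, b"first half, ")
volume.write_block(5, b"second half")

contents = b"".join(
    volume.read_block(address).rstrip(b"\x00")
    for address in file_table["notes.txt"]
)
```

Lose `file_table` and the volume is just a run of anonymous blocks, which is why a file system or database is needed on top.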
Block storage uses Fibre Channel, FCoE or iSCSI protocols to access individual data blocks. Object data is generally accessed using developer-friendly APIs made up of familiar Hypertext Transfer Protocol (HTTP) requests.
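For a sense of what HTTP-based access looks like, the Python sketch below builds (but does not send) a PUT request to store an object and a GET request to fetch it. The endpoint URL and the `X-Meta-Department` header are hypothetical; each real service defines its own URL scheme, authentication and metadata header names.

```python
import urllib.request

# Hypothetical endpoint and object key; nothing is sent over the network.
url = "https://objects.example.com/reports/q3-summary.txt"
payload = b"Q3 revenue summary"

# Storing an object: the body carries the data, the headers carry metadata.
put_request = urllib.request.Request(
    url,
    data=payload,
    method="PUT",
    headers={
        "Content-Type": "text/plain",
        # Hypothetical custom-metadata header; real services use their own prefix.
        "X-Meta-Department": "finance",
    },
)

# Retrieving the same object later is a plain GET against the same URL.
get_request = urllib.request.Request(url, method="GET")
```

Passing either request to `urllib.request.urlopen` would send it, assuming a service actually existed at that address.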
Problems solved by object storage
Need to store and manage mountains of data, particularly unstructured data, well into the petabyte range? Object storage systems are massively scalable and their flat address space along with adaptable metadata capabilities can help businesses cope with growing data volumes.
Object storage's inherent data protection capabilities are another draw. Typically, object copies and erasure coding are used to ensure that data remains accessible if a disk or node failure strikes, eliminating the need for RAID.
In simple terms, erasure coding involves splitting an object into pieces, encoding them with additional redundant data, and distributing those pieces across several disks or nodes. If disaster strikes the systems or disks containing some of those pieces, the remaining pieces contain enough information to reassemble the object, as long as the number of lost pieces stays within what the coding scheme tolerates.
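The idea can be sketched with the simplest possible erasure code: a single XOR parity fragment, which tolerates the loss of any one piece. This is an illustrative stand-in, not what any particular product does; production object stores typically use stronger schemes such as Reed-Solomon codes that survive multiple losses. The function names below are hypothetical.

```python
def split_into_fragments(data, k):
    """Split data into k equal-length fragments, zero-padding the last one."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\x00")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_fragments(fragments):
    """XOR equal-length byte strings together, byte by byte."""
    result = bytearray(len(fragments[0]))
    for fragment in fragments:
        for i, byte in enumerate(fragment):
            result[i] ^= byte
    return bytes(result)


original = b"object storage!!"          # 16 bytes, divides evenly by 4
data_fragments = split_into_fragments(original, k=4)
parity = xor_fragments(data_fragments)  # the redundant piece

# Simulate losing the node that held fragment 2.
survivors = [f for i, f in enumerate(data_fragments) if i != 2]
rebuilt = xor_fragments(survivors + [parity])

recovered = data_fragments[:2] + [rebuilt] + data_fragments[3:]
restored = b"".join(recovered)
```

Five pieces are stored for four pieces' worth of data, so the overhead is 25 percent, compared with 100 percent for keeping a full second copy.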
Problems solved by block storage
Need reliable, low-latency storage for your applications?
Block storage is the go-to solution for enterprises running critical business applications, databases and workloads that require predictable performance. Transactional systems, in particular, don't fare well if they are kept waiting for data or can't update data in a timely manner.
Block storage is also prized for its reliable and efficient data transport. Storage administrators generally value the ability to set up a block storage volume as an independent disk for an external server and the relative ease with which they can manage access and control privileges.
Object and block storage trade-offs
As with most technologies, both object and block storage have their benefits and drawbacks.
In comparing block and object storage, the latter is generally less expensive since it can run on commodity hardware. Block storage, in the form of specialized SAN storage arrays, is typically more expensive, although software-defined storage (SDS) solutions exist that enable SAN (and NAS) functionality on off-the-shelf hardware.
Performance-wise, SAN wins out.
Altering data in an object store requires transferring a new version of the entire object whereas data stored in a SAN can be changed at the data block level within a file. Applications that require random access to data that may be stored within objects, like databases and transaction systems, generally won't fare well with object storage.
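A back-of-the-envelope model shows why small in-place edits favor block storage. The Python sketch below counts only payload bytes under the behavior described above; the 4,096-byte block size and both function names are assumptions, and protocol overhead is ignored.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def bytes_sent_object_update(object_size, changed_bytes):
    """An object store replaces the object, so the whole object is sent."""
    return object_size


def bytes_sent_block_update(offset, changed_bytes):
    """A block volume rewrites only the blocks the change touches."""
    first_block = offset // BLOCK_SIZE
    last_block = (offset + changed_bytes - 1) // BLOCK_SIZE
    return (last_block - first_block + 1) * BLOCK_SIZE


object_size = 100 * 1024 * 1024  # a 100 MiB file
object_cost = bytes_sent_object_update(object_size, changed_bytes=10)
block_cost = bytes_sent_block_update(offset=5000, changed_bytes=10)
```

In this model, changing 10 bytes costs one 4,096-byte block write on a block volume but a full 100 MiB transfer to an object store, which is the gap a database issuing thousands of small updates would feel.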
Object storage must also contend with headers packed with metadata, which adds overhead and lowers performance relative to block storage. Read latency is a further concern.
There is also the matter of data consistency models. Given the architectural and performance characteristics of a SAN, copies of data stored in this environment can be considered strongly consistent, meaning that the latest version of the data is immediately available after being modified. Object storage is generally considered eventually consistent, meaning there is a risk that a read returns an older version of the data rather than the newest one. This is due to erasure coding and replication, along with the time it takes data to propagate across distributed object storage environments.
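The stale-read risk can be demonstrated with a toy model in Python. The `EventuallyConsistentStore` class is hypothetical, and its explicit `propagate` call stands in for the background replication a real distributed system would perform on its own schedule.

```python
class EventuallyConsistentStore:
    """Hypothetical store whose replicas are updated asynchronously."""

    def __init__(self, num_replicas):
        self._replicas = [{} for _ in range(num_replicas)]
        self._pending = []  # (replica_index, key, value) not yet applied

    def put(self, key, value):
        # The write lands on replica 0 at once; the rest are queued.
        self._replicas[0][key] = value
        for index in range(1, len(self._replicas)):
            self._pending.append((index, key, value))

    def get(self, key, replica):
        return self._replicas[replica].get(key)

    def propagate(self):
        """Apply all queued updates, as background replication would."""
        for index, key, value in self._pending:
            self._replicas[index][key] = value
        self._pending.clear()


store = EventuallyConsistentStore(num_replicas=3)
store.put("report", "v1")
store.propagate()
store.put("report", "v2")

stale_read = store.get("report", replica=2)  # replication has not caught up
store.propagate()
fresh_read = store.get("report", replica=2)
```

Between the second write and the second `propagate` call, a client that happens to hit replica 2 still sees the old version. Once propagation completes, every replica agrees.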
Object and block storage workloads
As mentioned earlier, object storage shines in environments that require working with large volumes of unstructured data that is infrequently updated. This can include documents, photos and videos.
Metadata-rich object stores make large-scale analytics an attractive proposition for enterprises. Object storage also makes it relatively economical to provide and manage storage that spans geographies.
Meanwhile, block storage's attributes make it ideal for high-performance business applications, transactional databases and virtual machines that require low-latency, granular access to data and consistent performance.
Although the two technologies are fundamentally different, enterprises nowadays are turning to both to fulfill their storage needs. It's increasingly common to see block storage systems used in SANs fulfill the immediate data requirements of critical business applications while unstructured data, media files, logs and other content are funneled into on-premises or cloud object storage solutions, offering enterprises the best of both worlds.