Many options exist for setting up highly available data storage through a clustered file system, but figuring out what each option does will take a bit of research. Your choice of storage architecture as well as file system is critical, as most have severe limitations that require careful design workarounds.
In this article we will cover a few common physical storage configurations, as well as clustered and distributed file system options. Hopefully, this will give you a solid starting point for investigating the technology that will work best for your high availability storage needs.
Some readers may wish to configure a cluster of servers that simply have concurrent access to the same file system, while others may want to replicate storage and provide both concurrent access and redundancy. There are two ways to give multiple servers access to the same disks: let them all see the same physical storage, or replicate the data between them.
Shared-disk configurations are most common in the Fibre Channel SAN and iSCSI worlds. It is quite simple to configure storage systems so that multiple servers can see the same logical block device, or LUN, but without a clustered file system, chaos will ensue if both try to use it at the same time. This problem is dealt with by using clustered file systems, which we will cover in a moment.
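To make that failure mode concrete, here is a toy Python sketch (purely illustrative, not any real file system) of two nodes that each cache their own view of a shared disk's allocation metadata. With no coordination layer, both pick the same "free" block and silently overwrite each other:

```python
# Toy illustration: two nodes sharing one block device, each trusting
# a private, stale copy of the allocation bitmap.

shared_disk = {"bitmap": [0] * 8, "blocks": [b""] * 8}

class NaiveNode:
    def __init__(self, name, disk):
        self.name = name
        self.disk = disk
        # Each node reads the bitmap once and caches it locally.
        self.cached_bitmap = list(disk["bitmap"])

    def write_file(self, data):
        # Pick the first block this node *believes* is free.
        block = self.cached_bitmap.index(0)
        self.cached_bitmap[block] = 1
        self.disk["bitmap"][block] = 1
        self.disk["blocks"][block] = data
        print(f"{self.name} wrote to block {block}")

a = NaiveNode("node-a", shared_disk)
b = NaiveNode("node-b", shared_disk)   # its cached bitmap is already stale
a.write_file(b"node-a's database")
b.write_file(b"node-b's log file")     # clobbers the same block
print(shared_disk["blocks"][0])        # only node-b's data survives
```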
Generally speaking, shared disk setups have a single point of failure: the storage system. This is not always true, however, as “shared disk” is a confusing term with today’s technology. SANs, NAS appliances and commodity hardware running Linux can all replicate the underlying disks in real time to another storage node, which provides a simulated shared disk environment. Since the underlying block devices are replicated, the nodes have access to the same data and both run a clustered file system, but this replication breaks the traditional shared disk definition.
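As a rough illustration of the idea, the sketch below (hypothetical Python, loosely in the spirit of block replicators such as DRBD rather than any actual implementation) acknowledges a write only after both the local copy and the remote mirror have it, which is what lets either node present the "same" block device to a clustered file system:

```python
class LoopbackPeer:
    """Stand-in for the network link to the partner storage node."""
    def __init__(self):
        self.store = {}

    def replicate(self, block_no, data):
        self.store[block_no] = data
        return True

class ReplicatedBlockDevice:
    def __init__(self, local_store, peer):
        self.local = local_store   # dict: block number -> bytes
        self.peer = peer           # transport to the partner node

    def write(self, block_no, data):
        self.local[block_no] = data
        # Synchronous replication: wait for the mirror before acking.
        if not self.peer.replicate(block_no, data):
            raise IOError("peer did not acknowledge; write is not durable on both nodes")
        return True                # ack to the file system above

dev = ReplicatedBlockDevice({}, LoopbackPeer())
dev.write(0, b"superblock")
```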
“Shared nothing,” in contrast, was the original answer to shared disk single points of failure. Nodes with distinct storage would notify a master server of changes as each block was written. Nowadays, shared nothing architectures still exist in file systems like Hadoop, which purposely creates multiple copies of data across many nodes for both performance and redundancy. Clusters that employ replication between storage devices, or between nodes with their own storage, are also said to be shared nothing.
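A back-of-the-envelope sketch of shared-nothing placement might look like the following; the node names and replica count are made up, and real systems such as HDFS add write pipelining, rack awareness and re-replication when a node dies:

```python
# Each block of a file lands on several nodes' *local* disks, so losing
# any single node still leaves other copies of every block.
import hashlib

NODES = ["node1", "node2", "node3", "node4", "node5"]   # hypothetical cluster
REPLICAS = 3

def place_block(block_id, nodes=NODES, replicas=REPLICAS):
    """Deterministically pick `replicas` distinct nodes for one block."""
    start = int(hashlib.md5(block_id.encode()).hexdigest(), 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]

print(place_block("movie.mkv#0"))   # three distinct nodes out of the five
```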
As we noted, multiple servers cannot safely use the same block device without help. You always hear about file system locking, so it may seem strange that a normal file system cannot handle this on its own.
At the file system level, the file system locks files to protect the data against application mistakes. But at the operating system level, the file system driver has full access to the underlying block device and is free to roam across it. Most file systems assume they have been handed a block device that is theirs and theirs alone.
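The locking most people have in mind is arbitrated by the local kernel, which is why it protects cooperating processes on one host but does nothing for a second server that has mounted the same LUN on its own. A small Linux/Python example:

```python
# flock locks are advisory and visible only to processes on this host;
# another server accessing the same shared LUN never sees them.
import fcntl

with open("/tmp/example.dat", "w") as f:
    fcntl.flock(f, fcntl.LOCK_EX)   # exclusive lock, local kernel only
    f.write("only processes on this machine will honour this lock\n")
    fcntl.flock(f, fcntl.LOCK_UN)
```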
To get around this, clustered file systems implement a mechanism for concurrency control. Some clustered file systems will store metadata within a partition of the shared device, and some choose to utilize a centralized metadata server. Both allow all nodes in the cluster to have a consistent view of the state of the file system, to allow safe concurrent access. The model with the central metadata server, however, is sub-optimal if your goal is high availability and eliminating single points of failure.
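To illustrate the central metadata server model, here is a deliberately naive lock manager in Python. Real distributed lock managers add networking, lease timeouts and recovery, but the core idea of a single authority granting access before any node touches an inode is the same, and it also shows why that server becomes the component you must make redundant:

```python
class MetadataServer:
    """Toy lock authority: one owner per inode at a time."""
    def __init__(self):
        self.owners = {}                  # inode -> node currently holding it

    def acquire(self, node, inode):
        if self.owners.get(inode, node) != node:
            return False                  # someone else holds it; caller must wait
        self.owners[inode] = node
        return True

    def release(self, node, inode):
        if self.owners.get(inode) == node:
            del self.owners[inode]

mds = MetadataServer()
assert mds.acquire("node-a", inode=42)      # node-a may now write inode 42
assert not mds.acquire("node-b", inode=42)  # node-b is refused until release
mds.release("node-a", inode=42)
assert mds.acquire("node-b", inode=42)
```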
One other note: The clustered file system model requires swift action when a node does something wrong. If a node writes bad data or stops communicating its metadata changes for some reason, other nodes need to be able to “fence” off the offender. Fencing is accomplished in many ways, most often using lights-out management interfaces. Healthy nodes will Shoot The Other Node In The Head (STONITH), or yank its power, at the first sign of inconsistency to preserve the data.
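A hand-rolled fence agent might look something like the sketch below; the hostname and credentials are placeholders, and production clusters should rely on tested fence agents (Pacemaker ships many) rather than scripts like this:

```python
# Hedged sketch of STONITH-style fencing: power off a misbehaving peer
# through its lights-out (IPMI/BMC) interface before it can write more data.
import subprocess

def fence_node(bmc_host, user, password):
    """Force the suspect node off via its BMC; returns True on success."""
    cmd = ["ipmitool", "-I", "lanplus",
           "-H", bmc_host, "-U", user, "-P", password,
           "chassis", "power", "off"]
    return subprocess.run(cmd).returncode == 0

# if not heartbeat_ok("node-b"):   # heartbeat check not shown
#     fence_node("node-b-bmc.example.com", "admin", "secret")
```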
Having too many choices is never a bad thing. Your implementation goals will dictate which clustered or distributed file system and storage architecture you choose. All of the mentioned file systems work very well, assuming they are used as intended.
Article courtesy of Enterprise Networking Planet