pNFS and the Future of File Systems
High-performance file systems such as Panasas PanFS, Sun QFS, Quantum StorNext, IBM GPFS and HP File Services can add plenty of value to storage implementations (see Choosing the Right High-Performance File System).
Take the case of DigitalFilm Tree, a company based in Hollywood that provides post-production and visual effects (VFX) services for the entertainment industry. It recently had to ramp up its operations to deal with VFX for Showtime's "Weeds," CW's "Everybody Hates Chris," NBC's "Scrubs," a new TV pilot episode, and work on the Jet Li movie "The Forbidden Kingdom."
The company harnesses a storage environment that includes Apple (NASDAQ: AAPL) Xsan, HP (NYSE: HPQ) StorageWorks arrays, QLogic (NASDAQ: QLGC) switches and gear from several other storage vendors. It is also a mixed OS environment, with the workflow having to deal with users on Macs and PCs.
"The velocity of our work on the TV shows demands a non-linear workflow and the management of well over 100 TB of data," said Ramy Katrib, founder and CEO of DigitalFilm Tree. "StorNext enabled us to greatly expand our delivery without having to double our staff."
But with the ongoing updates to file system protocols like NFS, including parallel NFS (pNFS), is there a possibility that NFS could eventually supplant the many proprietary file systems out there? Let's first take a look at another couple of high-performance offerings from Sun and NetApp (NASDAQ: NTAP) before taking out our crystal ball to see what the future holds.
Sun Microsystems (NASDAQ: JAVA) characterizes Lustre as "the most scalable parallel file system in the world." As evidence, Lustre serves six of the top 10 supercomputers and 40 percent of the top 100.
"We have Lustre file systems that scale to petabytes of data in one cohesive name space and deliver in excess of 100 GB/s aggregate performance to 25,000 clients or more," said Peter Bojanic, director of Sun's Lustre Group. "This includes HPC applications at Livermore, Oak Ridge and Sandia National Laboratories, where large-file I/O and sustained high bandwidth are essential."
Adoption is also growing in oil and gas, rich media and content distribution networks, which all require mixed workloads with large and small files. One of Lustre's differentiators is that it is available as open source software based on Linux. That's why you find it integrated with storage products from other HPC vendors, including SGI (NASDAQ: SGIC), Dell (NASDAQ: DELL), HP, Cray (NASDAQ: CRAY) and Terascala.
Lustre is an object-based cluster file system, but it is not T10 OSD-compliant, and the underlying storage allocation management is block-based. It requires the presence of a Lustre MetaData Server and Lustre Object Storage Servers. Once a file is opened, data reads and writes bypass the MetaData Server, using parallel data paths to the Object Storage Servers in the cluster. Servers are organized in failover pairs. Lustre runs on a variety of networks, including IP and InfiniBand.
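To illustrate how parallel data paths of this kind can work, here is a minimal Python sketch of round-robin striping, in which consecutive fixed-size stripes of a file rotate across several object servers. The function names, stripe size and server count are assumptions made for illustration only; this is not Lustre's actual interface or layout logic.

```python
# Illustrative sketch only: round-robin striping of one file across
# several object servers. All names and sizes here are hypothetical.

def locate(offset, stripe_size, server_count):
    """Map a byte offset in a file to (server index, offset inside that server's object)."""
    stripe_number = offset // stripe_size          # which stripe the byte falls in
    server_index = stripe_number % server_count    # stripes rotate across servers
    # Each server stores every server_count-th stripe, packed back to back.
    object_offset = (stripe_number // server_count) * stripe_size + offset % stripe_size
    return server_index, object_offset


def split_read(offset, length, stripe_size, server_count):
    """Break one read request into per-server pieces that could be issued in parallel."""
    pieces = []
    end = offset + length
    while offset < end:
        # Stop at the next stripe boundary or at the end of the request.
        chunk = min(stripe_size - offset % stripe_size, end - offset)
        server_index, object_offset = locate(offset, stripe_size, server_count)
        pieces.append((server_index, object_offset, chunk))
        offset += chunk
    return pieces
```

In this sketch, a client that already knows the layout can send each piece returned by split_read to a different server at the same time, which is why aggregate bandwidth can grow as servers are added.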
NetApp has a file system called WAFL (Write Anywhere File Layout), which consolidates CIFS, NFS, HTTP, FTP, Fibre Channel and iSCSI and works in conjunction with NetApp's Data ONTAP operating system. WAFL is integrated with RAID-DP, NetApp's high-performance version of RAID-6, so it can survive the loss of one or two disk drives.
Non-volatile memory (NVRAM) is added to improve speed by allowing a storage access protocol target to respond to modification requests before the changes are written to disk. Through WAFL, requests are logged to NVRAM and file system modifications are saved in volatile memory. After several modifications have accumulated in volatile memory, WAFL gathers the results into what NetApp terms a "consistency point" (basically a snapshot) and writes the consistency point to the RAID group assigned to the file system.
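The log-then-flush sequence can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Data ONTAP code: the class name LoggedStore, its methods and the batch_size threshold are all hypothetical, and a Python list merely stands in for NVRAM. The recover method shows one way a log of this kind could be replayed after a failure.

```python
# Illustrative sketch only: a request log plus periodic consistency points.
# Class and method names are hypothetical; this is not NetApp's implementation.

class LoggedStore:
    def __init__(self, batch_size=3):
        self.nvram_log = []      # stands in for NVRAM: survives a crash
        self.memory = {}         # volatile state: lost on a crash
        self.disk = {}           # durable state as of the last consistency point
        self.batch_size = batch_size

    def write(self, key, value):
        """Log the request, apply it in memory, and acknowledge before any disk write."""
        self.nvram_log.append((key, value))
        self.memory[key] = value
        if len(self.nvram_log) >= self.batch_size:
            self.consistency_point()
        return "ack"

    def consistency_point(self):
        """Write the accumulated changes to disk, then clear the log."""
        self.disk.update(self.memory)
        self.nvram_log.clear()

    def crash(self):
        """Simulate a failure: volatile memory is lost, the log and disk remain."""
        self.memory = {}

    def recover(self):
        """On reboot, rebuild memory from disk, replay the log, and flush to disk."""
        self.memory = dict(self.disk)
        for key, value in self.nvram_log:
            self.memory[key] = value
        self.consistency_point()
```

The point of the design is that the acknowledgment depends only on the fast log write, while the slower disk writes are batched.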
"If the consistency point is not written to disk before hardware or software failure, then once Data ONTAP reboots, the contents of the NVRAM log are replayed to the WAFL, and the consistency point is written to disk," said Michael Eisler, senior technical director of NFS at NetApp. "Most of NetApp's competitors have snapshots, but NetApp has used its underlying snapshot technology to build features like file system level mirroring, backup integration, cloning, de-duplication, data retention, striping across network storage devices, and flexible volumes."
Flexible volumes (also called FlexVols) are volumes that can share a single pool (or aggregate) of storage with other flexible volumes. These volumes can be grown or contracted as needed; freed-up space is returned to the storage pool to be used by other FlexVols.
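As a rough illustration of the shared-pool idea, the following Python sketch models several volumes drawing on one aggregate. The Aggregate class, its resize method and the gigabyte sizes are hypothetical and do not reflect NetApp's interfaces.

```python
# Illustrative sketch only: volumes that grow and shrink within one shared pool.
# Names are hypothetical and do not reflect any vendor's API.

class Aggregate:
    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.volumes = {}   # volume name -> current size in GB

    @property
    def free_gb(self):
        """Space in the pool not currently assigned to any volume."""
        return self.capacity_gb - sum(self.volumes.values())

    def resize(self, name, new_size_gb):
        """Create, grow, or shrink a volume; shrinking returns space to the pool."""
        if new_size_gb < 0:
            raise ValueError("size must be non-negative")
        current = self.volumes.get(name, 0)
        if new_size_gb - current > self.free_gb:
            raise ValueError("not enough free space in the aggregate")
        self.volumes[name] = new_size_gb
```

Shrinking one volume in this model immediately raises free_gb, so another volume can be grown into the reclaimed space.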
The Future of File Systems
Not everyone needs high performance, of course. There are the more common file system protocols such as NFS and CIFS, as well as Sun's open-source ZFS file system that runs on Solaris. There is even the Global File System (GFS) by Red Hat (NYSE: RHT), originally developed at the University of Minnesota. It offers high performance and data-sharing capabilities for the Linux platform.
"NFS-based solutions or SAN file systems are not optimized for environments where there are large numbers of clients or server nodes that need shared access to very large files," said Len Rosenthal, chief marketing officer at Panasas. "Common NAS systems are fine for everyday file storage for e-mail and other documents and for many internet-oriented applications, as the file sizes are generally small since they need to traverse low-bandwidth public networks."
ZFS, he said, is a good local file system, but it doesn't have any parallel data transfer capabilities and is not designed for scale-out applications. Similarly, he believes that the local file systems that ship with Linux and the Unix vendors' operating systems are mainly useful for direct-attached storage (DAS) within a single server. Meanwhile, NFS will continue to be the standard for networked file access, but will transition over the next couple of years to pNFS.
"With pNFS support coming in NFS version 4.1, the future is bright, as pNFS is the first significant performance enhancement to NFS in close to 20 years," said Rosenthal. "The future of file systems with proprietary client software such as GPFS, EMC's MPFSi and Lustre is unclear, as pNFS could ultimately eliminate the need for these file systems."
pNFS, then, could well be the future of high-performance, shared file storage. Companies such as Panasas, Sun, NetApp, IBM (NYSE: IBM) and EMC (NYSE: EMC) are all actively working on the pNFS standard (NFSv4.1). Panasas has even publicly stated that it will migrate its DirectFLOW protocol to pNFS via a software revision as the standard emerges.
pNFS will enable parallel data transfers and create a standard client access protocol for NFS that will be supported on all major Linux distributions and also the proprietary Unix versions such as Solaris and AIX.
Greg Schulz, founder and senior analyst at StorageIO Group, takes a similar view.
"For everyday, general-purpose common file and data sharing in commercial enterprise environments, the workhorse will remain near-term to be NFS V3 and higher, along with Windows file sharing, with more environments looking at ZFS as they migrate to Linux environments or if they are Sun-centric," said Schulz. "Many others, though, will investigate pNFS for its applicability to their environments."
Sun, though, doesn't see its Lustre file system going away anytime soon. It sees a combination of regular file systems and high-performance file systems being in play for some time to come.
"No one technology can solve all enterprise problems, but by leveraging them together we can solve a lot of them," said Bojanic. "Lustre and pNFS won't replace all these other file systems and protocols, but will enable horizontal scalability to deliver super-scale file systems."
That explains why Sun is working on improvements to Lustre.
"Work is underway to deliver Lustre running on top of ZFS on both Linux and Solaris. This will offer vertical scalability and data integrity protection at the individual server level," said Bojanic. "Lustre, in turn, will be leveraged by future Open Storage products from Sun, providing horizontal scalability to a variety of client protocols, including pNFS and Windows/CIFS."
An alpha release of Lustre-on-ZFS should be out early in 2009, with the production version due in 2010.