Since Oracle acquired Sun last year, there has been uncertainty surrounding the fate of the Lustre file system. In fact, that uncertainty stretched back well before the acquisition. To end the speculation, Oracle (NASDAQ: ORCL) chose last week’s Supercomputing 2010 conference as the forum to reaffirm its commitment to Lustre.
“Nothing in Lustre has been radically changed under Oracle,” said Jason Schaffer, senior director of product management for storage at Oracle. “We have an unwavering commitment to Lustre and its community as well as the high performance computing (HPC) marketplace.”
Lustre, of course, is a high-performance open-source file system that is often used in HPC environments. Based upon Linux, it is designed, developed and maintained by Oracle with input from many individuals and companies in the open-source community. It is designed to be massively parallel, delivering I/O performance and levels of scale well beyond the limits of traditional file systems. According to Schaffer, Lustre scales to tens of petabytes, hundreds of gigabits per second and thousands of clients, and has been reliably deployed in many clustered environments.
“Lustre is the world’s number one parallel file system,” said Schaffer. “We are going to continue to invest in its development.”
He insists that Oracle hasn’t changed Sun’s strategy, but rather, has bolstered it. What is that strategy? For Oracle to lead the Lustre open-source community and continue to invest in it to support its heavy usage in HPC.
He points out that 61 of the top 100 systems on the just-published Top500.org list of the world’s fastest supercomputers use Lustre.
“The number one system from China is a Lustre customer,” said Schaffer.
He’s talking about the Tianhe-1A system at the National Supercomputer Center in Tianjin, China. It claimed the top spot on the list with a performance level of 2.57 petaflops on the Linpack benchmark application used to determine the Top 500. It has 29376GB of memory and is based upon Nvidia graphics processing units (GPUs) as well as Intel Xeon 5600 series processors. It harnesses a custom-made interconnect that is said to be able to handle data at about twice the speed of InfiniBand. By cutting latency within the cluster, this interconnect was one of the key factors in the system’s attainment of the top spot.
“Most of the customers that gravitate to Lustre are interested in scale and/or performance,” said Schaffer. “It either out-scales or out-performs the competition.”
He said that while scientific and academic users are the most prominent, there are plenty of commercial customers too. This is particularly the case in oil and gas, media and entertainment.
OK. That’s the official view from Oracle, but what do the analysts think?
“Where Lustre has had success with the HPC community is for those environments or scenarios that need to read or write very large datasets, or files requiring parallel access to files compared to general purpose and most scale out NAS solutions that are targeted for many concurrent access of small files,” said Greg Schulz, an analyst at StorageIO. “Likewise most general purpose or scale out NAS including many clustered file system solutions have many features such as snapshots, replication for managing many small files where Lustre has a main tenant of being able to safely store and provide high throughput reads and writes in parallel from multiple object storage targets (OST) from object storage servers (OSS).”
Consequently, Schulz said that few users will try to get Lustre to do things that it is not designed to do (i.e., general-purpose applications deployed around Lustre). Where it is found, said Schulz, is in many commercial environments supporting R&D, exploration and simulation or other workloads that require processing of large datasets in parallel. On the other hand, where the general-purpose NAS solutions from Dell, EMC, IBM, BlueArc, NetApp and Oracle ZFS, among others, get deployed, Lustre is usually not the best fit. In other words, while all of these are file-serving systems, Lustre and similar approaches like Panasas PanFS are more likely to be found in their specific vertical application environments, while the more general-purpose systems are found elsewhere.
“It’s about using the right tool and technique for the task at hand,” said Schulz.
He thinks Lustre has a future, but that it will remain in its focus area of HPC unless someone can get really creative on how to position it for more general-purpose applications – or even the cloud where the need for such processing is still evolving.
When Oracle acquired Sun, the company also picked up a couple of other file systems such as QFS and ZFS.
“While there is or can be overlap, the products are positioned into different markets with ZFS being taken into more general purpose usages,” said Schulz.
ZFS is a storage-centric file system, while Lustre focuses on scale-out and parallelism.
“Lustre and ZFS are extremely complementary,” said Schaffer. “Oracle has increased its investment in integration with ZFS as well as ZFS optimizations for Lustre.”
Such claims, though, may fall on deaf ears. Sun has been talking about upgrades to Lustre for many years. Users, for example, are still waiting for such features as clustered metadata and a lock manager.
In an Enterprise Storage Forum article from 2008, Sun stated that an alpha release of Lustre-on-ZFS would be out early in 2009, with the production version due in 2010. Schaffer wasn’t sure when it would be released and said that the company was still working on perfecting integration between Lustre and ZFS.
The person who made that release statement back in 2008 was Peter Bojanic. He left his Lustre Engineering position at Oracle to join Xyratex, where he will continue to create storage tools based on Lustre. Another company that has been headhunting Lustre luminaries is Xyratex. Both seem destined to become rivals of Oracle in the Lustre-based storage and HPC space. Could this lead to a power struggle within the Lustre community? And with nimble competitors like Panasas tempting customers away from Lustre, it looks like Oracle has its work cut out to make a success of its efforts with the file system it inherited from Sun.
Drew Robb is a freelance writer specializing in technology and engineering. Currently living in California, he is originally from Scotland, where he received a degree in geology and geography from the University of Strathclyde. He is the author of Server Disk Management in a Windows Environment (CRC Press).