Linux File Systems: You Get What You Pay For
I am frequently asked by potential customers with high I/O requirements if they can use Linux instead of AIX or Solaris.
No one ever asks me about high-performance I/O high IOPS or high streaming I/O on Windows or NTFS because it isn't possible. Windows and the NTFS file system, which hasn't changed much since it was released almost 10 years ago, can't scale given its current structure. The NTFS file system layout, allocation methodology and structure do not allow it to efficiently support multi-terabyte file systems, much less file systems in the petabyte range, and that's no surprise since it's not Microsoft's target market.
And what was Linux's initial target market? For desktops, of course. Linux has since moved from the desktop to run on many large SMP servers from Sun, IBM and SGI. But can Linux as an operating system and Linux file systems meet the challenge of high-performance I/O?
You may think you don't need high-performance I/O, but every server needs this type of I/O performance for something as simple as backup and restoration. Current LTO-4 tape drives can operate at 120 MB/sec without compression and can support data rates up to 240 MB/sec with compression. If your file system cannot support I/O at these streaming data rates, then the time to backup and restore will take much longer than expected. For large environments with multiple tape drives, not being able to use the tape drives at their full data rate might require additional tape drives to meet the backup time window, which affects restoration too. Therefore, it seems to me that everyone should be interested in the performance of Linux file systems, if only for backup and restore.
Can Linux file systems, which I will define as ext-4, XFS and xxx, match the performance of file systems on other UNIX-based large SMP servers such as IBM and Sun? Some might also inquire about SGI, but SGI has something called ProPack, which has a number of optimizations to Linux for high-speed I/O, and SGI also has their open proprietary Linux file system called CxFS, which is not part of standard Linux distributions. Because SGI ProPack and CxFS are not part of standard Linux distributions, we won't consider them here. We'll stick to standard Linux because that is what most people use.
We'll focus on two areas:
- Linux as an operating system, and
- Linux file systems.
Linux Operating System Issues
We'll set aside what might happen with Linux in the future and instead focus on what is available today. Linux has a number of features that match the I/O performance of AIX and Solaris, such as direct I/O, but the bottom line is that Linux wasn't designed around high-performance multi-threaded I/O.
There are a number of areas that limit performance in Linux, such as page size compared with other operating systems, the restrictions Linux places on direct I/O and page alignment, and the fact that Linux does not allow direct I/O automatically by request size I have seen Linux kernels break large (greater than 512 KB) I/O requests into 128 KB requests. Since the Linux I/O performance and file system were designed for a desktop replacement for Windows, none of this comes as much of a surprise.
Linux has other issues, as I see it; for starters, the lack of someone to take charge or responsibility. With Linux, if you find a problem, groups of people are going to have to agree to fix it, and the people writing Linux might not necessarily be responsive to the problems you're facing. If a large vendor of Linux agrees with your problem and provides a fix, that doesn't mean it will be accepted or accepted anytime soon by the Linux community. And getting a patch for your problem could pose maintenance problems.
The goals for Linux file systems and the Linux kernel design seem to be trying to address a completely different set of problems than AIX or Solaris, and IBM and Sun are far more directly responsible than the Linux community if you have a problem. If you run AIX or Solaris and complain to IBM or Sun, they can't say we have no control.
Linux File Systems
Remember that most Linux file systems were designed with desktop use in mind, not some of the high-performance file systems such as GPFS (IBM), StorNext (Quantum) or QFS (Sun). These file systems were designed for streaming I/O, which we now know is important for everyone and for some high-speed IOPS, and in some cases for database access.
The Linux file systems that are commonly used today (ext-3 today and likely soon ext-4 and xfs) have not had huge structural changes in a long time. Ext-4 improves upon ext-3 and ext-2 for some improved allocation, but simple things like alignment of the superblock to the RAID stripe and the first metadata allocation are not considered.
Additionally, things like alignment of additional file system metadata regions to RAID stripe value are not considered, nor are simple things like indirect allocations (see File Systems and Volume Managers: History and Usage), which are fixed values so with the small allocations supported (4 KB maximum), large numbers of allocations are required. Take a 200 TB file system, which will require 53.7 billion allocations to represent the 200 TB using the largest allocation size of 4 KB supported by ext-3. Using 8 MB, which is feasible on enterprise file systems, it becomes a manageable 26.2 million allocations. The bitmap or allocation map could even fit in memory for this number of allocations! The xfs file system has very similar characteristics to ext-3. Yes, allocations can be larger, up to 64 KB if the Linux page size is 64 KB, but the alignment issues for the superblock, metadata regions and other issues still exist.
Linux Has Its Place
That's not to say I am anti-Linux, just as I am not pro-AIX or pro-Solaris. I am not even anti-Windows, since I use a Windows laptop as my main computer. But I do believe that the default Linux file systems are not yet up to the task of replacing the high-performance, highly scalable SMP file systems. Computers are tools, and operating systems and file systems are also tools in the toolbox. No one uses a chainsaw in place of a jigsaw, and the same analogy can be used for operating systems, file systems and the hardware they run on.
Many of the people I deal with daily use MS Word, MS Excel, MS PowerPoint and MS Visio. I could run some if not all of these applications on a Windows emulator from someone, but I routinely get incompatibilities with fonts, and I just decided long ago to live with Windows until someone can prove to me that it all works together with no problems. My point here is that every computer is a tool and has its use. Currently no single computer or file system can meet all application requirements. This should not come as a surprise. Linux has a place, but as far as I can tell, that place does not support single instances of large file systems and scaling well from large to small file systems with high-performance requirements. And I don't see this changing anytime soon.
Henry Newman, a regular Enterprise Storage Forum contributor, is an industry consultant with 27 years experience in high-performance computing and storage.
See more articles by Henry Newman.