The term "inode" is commonly used, but what are inodes really used for? Most inodes range in size from 128 bytes to 512 bytes. When a vendor runs out of space in the inode and needs room for things like access control lists (ACLs), tape positioning for HSM, security, and other non-basic functions, it often uses extended attributes, which are really just a place to hold more information than fits in the inode itself.
The basic function of an inode is to serve as an offset from the superblock that identifies where the data resides on the devices under the control of the file system. The inode also provides information about ownership, permissions (read-only, for example), and access and create times. Inodes have the same basic functionality on Linux, UNIX, and Windows systems.
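To see the ownership, permission, and time fields described above, here is a minimal Python sketch that prints what the operating system reports for a file. It uses only the standard `os.stat` call; the helper name `describe_inode` is my own for illustration, and the exact meaning of some fields (the inode number on Windows, for instance) varies by platform and file system.

```python
import os
import stat
import tempfile
import time


def describe_inode(path):
    """Return the inode-level metadata the OS reports for path."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                         # inode number
        "owner_uid": st.st_uid,                     # owning user ID
        "group_gid": st.st_gid,                     # owning group ID
        "permissions": stat.filemode(st.st_mode),   # e.g. "-rw-r--r--"
        "size_bytes": st.st_size,
        "accessed": time.ctime(st.st_atime),
        "modified": time.ctime(st.st_mtime),
    }


if __name__ == "__main__":
    # Demonstrate on a throwaway file so the example is self-contained.
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello")
        tmp.flush()
        for key, value in describe_inode(tmp.name).items():
            print(f"{key:12} {value}")
```

Note that the file name is not among the fields: the name lives in the directory entry, which points at the inode.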
Inodes and the way they are used have been around for over 35 years, and in that time little has changed but their size. Most inode implementations allow 15 allocations per inode; once those are used, another inode is used for the additional allocated space. The limit of 15 allocations holds even for file systems that support large inodes (those greater than 256 bytes).
File Size Distribution
How and where file systems place data on the devices they manage is an important and complex concept. (Notice I say devices rather than disks, because some file systems support hierarchical storage management (HSM), which basically means the data could reside on tape, on disk, or on both.) Data allocation algorithms and data placement (which device or devices a file will reside on) can be a big issue for some file systems. As mentioned in the last article, most file systems require volume managers when using multiple devices. Generally, there are two types of representation for free space:
- B-trees, as used by Veritas, StorNext, XFS, ReiserFS, and other file systems
- Bitmaps, as used by QFS, NTFS, and HFS (Mac/Linux)
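To make the bitmap representation concrete, here is a toy Python sketch of a free-space map with one bit per allocation unit. It is not how any of the file systems named above actually implement allocation; the class name `BitmapAllocator` and the first-fit search are assumptions chosen to keep the example short.

```python
class BitmapAllocator:
    """Toy free-space map: one bit per allocation unit, 1 = in use."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.bits = bytearray((total_blocks + 7) // 8)  # all blocks start free

    def _is_used(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def _set(self, block, used):
        mask = 1 << (block % 8)
        if used:
            self.bits[block // 8] |= mask
        else:
            self.bits[block // 8] &= ~mask & 0xFF

    def allocate(self, count):
        """First fit: mark and return the start of the first run of
        `count` contiguous free blocks, or None if no such run exists."""
        if count < 1:
            raise ValueError("count must be at least 1")
        run_start, run_len = 0, 0
        for block in range(self.total_blocks):
            if self._is_used(block):
                # A used block breaks the current run of free blocks.
                run_start, run_len = block + 1, 0
                continue
            run_len += 1
            if run_len == count:
                for b in range(run_start, run_start + count):
                    self._set(b, True)
                return run_start
        return None

    def free(self, start, count):
        """Mark `count` blocks beginning at `start` as free again."""
        for b in range(start, start + count):
            self._set(b, False)
```

The linear scan in `allocate` is the point of the sketch: with a plain bitmap, finding a large contiguous run can mean walking much of the map, whereas a tree of free extents can be searched by size or location. That is a simplification, and real implementations add summaries and caching on top of either structure.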
Given what I have seen for most "user and scratch" file systems, a 90/10 rule applies: approximately 90% of the files use 10% of the space, and 10% of the files use 90% of the space. Of course, sometimes the distribution is 95/5 or even trimodal, but the point is that you are likely to have an extremely skewed distribution of file sizes and counts rather than a statistically normal distribution (a bell curve).
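If you want to check how skewed your own file systems are, a short script is enough for a first look. The Python sketch below walks a directory tree and reports what share of the bytes is held by the largest 10% of files. The function names and the 10% default are my own choices, and it measures logical file size (`st_size`), not the space actually allocated on the device.

```python
import os


def file_sizes(root):
    """Collect sizes in bytes of regular files under root.

    Symbolic links and entries that cannot be read are skipped.
    """
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path):
                    continue
                sizes.append(os.path.getsize(path))
            except OSError:
                continue  # file vanished or permission denied
    return sizes


def space_share_of_largest(sizes, top_fraction=0.10):
    """Fraction of total bytes held by the largest `top_fraction` of files."""
    total = sum(sizes)
    if total == 0:
        return 0.0
    ordered = sorted(sizes, reverse=True)
    top_count = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:top_count]) / total


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "."
    sizes = file_sizes(root)
    share = space_share_of_largest(sizes)
    print(f"{len(sizes)} files; largest 10% hold {share:.1%} of the bytes")
```

A result near 90% matches the rule of thumb; a result near 10% would mean the files are all roughly the same size.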
Understanding how allocation is accomplished and the tunable parameters in a file system gives you a better understanding of how the file system will scale and perform in your environment given the file sizes, the file system allocation sizes, and how the data is accessed (random, sequential, large or small block).
Shared file systems, both heterogeneous and homogeneous, are becoming commonplace, and the complexity of the architecture and its management is growing exponentially. These last two columns have provided a basis for understanding how shared file systems and volume managers work internally. That understanding will allow you to better evaluate file system tuning parameters and discern what they really mean, as all too often the documentation leaves a great deal to be desired.