File Systems and Volume Managers: Understanding the Internals
I believe log-based file systems were first discussed in a paper by Rosenblum and Ousterhout entitled The Design and Implementation of a Log-Structured File System, presented at SOSP in 1991. The authors analyzed I/O usage and concluded that:
- Most I/Os are small
- Most I/Os are random
- Data becomes fragmented on the file system technology of the day (BSD and other file systems)
- Disk access speeds have not improved nearly as fast as CPU speeds
Since that time, file system vendors such as ADIC (StorNext), Compaq/HP (AdvFS), Veritas (VxFS), SGI (XFS), IBM (JFS), Sun (UFS logging), and a myriad of other vendors and Linux file systems (ext3, ReiserFS) have taken the original concept and modified it to log only metadata operations. The goal is to synchronize the file system metadata when the log area becomes full. If the system crashes, the expectation is that only the metadata log will have to be checked for consistency after boot, rather than all of the metadata.
This file system check is commonly called fsck(1M). This logging methodology was developed to meet the requirement of booting quickly after a crash. Almost all fsck(1M) versions that normally check just the log can also check all of the metadata. This is sometimes important if you have had a hardware problem that went unrecognized and fear that the metadata was corrupted.
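A toy sketch (not any vendor's implementation; all structures and names are invented for illustration) of why log-based recovery is fast: after a crash, only the handful of entries still in the log must be replayed, while a traditional full check must examine every inode in the file system.

```python
# The in-memory "metadata area": inode number -> committed record.
# A real file system holds these on disk; the size here is arbitrary.
metadata_area = {i: {"size": 0, "mtime": 0} for i in range(100_000)}

# The metadata log: the few updates not yet checkpointed at crash time.
log = [
    (42, {"size": 4096, "mtime": 100}),
    (7,  {"size": 512,  "mtime": 101}),
]

def replay_log(log, metadata_area):
    """Log-based recovery: apply each logged update, in order."""
    for inode, record in log:
        metadata_area[inode] = record
    return len(log)  # entries examined: just the log's contents

def full_fsck(metadata_area):
    """Traditional fsck: examine every inode in the file system."""
    return sum(1 for _ in metadata_area)

replayed = replay_log(log, metadata_area)  # 2 entries examined
checked = full_fsck(metadata_area)         # 100,000 entries examined
```

The ratio between those two numbers is the whole argument for logging: recovery time scales with the log size, not with the amount of metadata.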
Most file systems and volume managers allow the log to be placed on a device separate from the data. This is done to reduce contention between log and data I/O and to increase the performance of file system metadata operations. Each time a file is written or opened, its inode update is recorded in the log and periodically written to the metadata area(s) within the actual file system.
When performing a large number of metadata operations, logging and the performance of the log device can become an issue. With logging, the file system metadata is copied twice:
- The file system metadata is written to the log device
- The file system metadata is moved from the log device to the file system metadata area after the log becomes full
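The two copies above can be sketched in a few lines. This is a minimal model, not any real file system's code: the log capacity, names, and checkpoint trigger are all assumptions made for the illustration.

```python
LOG_CAPACITY = 4   # assume a tiny log so checkpoints trigger quickly

log_device = []         # copy 1: updates appended to the log device
metadata_area = {}      # copy 2: the file system metadata area
writes_to_log = 0
writes_to_metadata = 0

def checkpoint():
    """Move everything in the log to the metadata area (second copy)."""
    global writes_to_metadata
    while log_device:
        inode, record = log_device.pop(0)
        metadata_area[inode] = record
        writes_to_metadata += 1

def journal_update(inode, record):
    """Append the update to the log; checkpoint when the log fills."""
    global writes_to_log
    log_device.append((inode, record))
    writes_to_log += 1
    if len(log_device) >= LOG_CAPACITY:
        checkpoint()

for i in range(8):                      # eight metadata operations...
    journal_update(i, {"size": i * 512})
# ...produce sixteen device writes: each record is written twice.
```

Eight operations generate eight log writes plus eight metadata-area writes, which is exactly the double copy that becomes a bottleneck under heavy metadata load.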
This double copy can become a performance bottleneck if:
- A large number of metadata operations fill the log, the file system is busy with data operations, and the log data cannot be moved quickly to the file system
- The log device cannot keep up with the rate of log operations that are required
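A back-of-the-envelope sketch of the first bottleneck above, with entirely made-up rates: if metadata updates arrive in the log faster than the busy file system can drain them to the metadata area, the log fills and metadata operations stall.

```python
arrival_rate = 5_000   # assumed: metadata updates appended per second
drain_rate = 2_000     # assumed: updates moved to metadata area per second
log_capacity = 30_000  # assumed: entries the log device can hold

# Net growth of the log while the file system is busy with data I/O.
net_fill_per_sec = arrival_rate - drain_rate       # 3,000 entries/s
seconds_until_stall = log_capacity / net_fill_per_sec   # 10.0 seconds
```

With these numbers the log is full after ten seconds of sustained load, after which every metadata operation must wait on the checkpoint. A faster log device does not help here; only draining faster (or a bigger log) delays the stall.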
Most people (including me) have philosophical issues, both positive and negative, with logs and logging, but typically you fall into either the logging camp or the non-logging camp. There are far more "loggers" than "non-loggers." Logging is currently the only method used for fast file system recovery, although other methods are possible.
It is important to remember that fast file system recovery is the requirement, and that logging is a method of meeting that requirement, not the requirement itself. If someone comes up with a file system that achieves fast recovery without metadata logging, it will be a hard sell, as at this point everyone thinks logging itself is the requirement.