Step 2: Running More Fsck Tests - Page 3
Additional FSCK Tests
DDN was kind enough to offer additional testing time, so I decided to try some tests that stretched the boundaries a bit. The first test was to create an XFS file system with 415,000,000 files, filling about 40 percent of the 72TB file system. The second test was to try to increase the fragmentation of the file system by randomly adding and deleting directories using fs_mark for the 105,000,000-file case, also on the 72TB file system.
For the first test, where 415,000,000 files were created, the original goal was to test 520,000,000 files in five stages of 105,000,000 files (creating 520,000,000 all at once caused the server to swap badly). However, due to time constraints, only four of the five stages could be run (fs_mark ran increasingly slowly as the number of files on the system grew). The final number of files created was 420,035,002, which also includes the "." and ".." entries in every directory.
For the second test, approximately 105,000,000 files were created on an XFS file system in several steps. A total of five stages were used, with 21,000,000 files added at each stage using fs_mark (a total of 105,000,000 files). In between the stages, a number of directories were randomly removed, and the same number of directories and files were replaced using fs_mark on randomly selected directories. The basic process is listed below:
- Use fs_mark to create 21,000,000 files using:
  - 3 threads of 7,000,000 files each
  - 7,000 directories
  - 1,000 files per directory
- Randomly remove 700 directories and their files ("rm -rf")
- Use fs_mark to add 700 directories with 1,000 files each to 700 randomly chosen existing directories (one directory is added to one existing directory)
- Use fs_mark to create 21,000,000 more files (42,000,000 total at this point)
- Randomly remove 1,400 directories and their files
- Use fs_mark to add 1,400 directories with 1,000 files each to 1,400 randomly chosen existing directories
- Use fs_mark to create 21,000,000 more files (63,000,000 total at this point)
- Randomly remove 2,100 directories and their files
- Use fs_mark to add 2,100 directories with 1,000 files each to 2,100 randomly chosen existing directories
- Use fs_mark to add 21,000,000 more files (84,000,000 total at this point)
- Randomly remove 2,800 directories and their files
- Use fs_mark to add 2,800 directories with 1,000 files each to 2,800 randomly chosen existing directories
- Use fs_mark to add the final 21,000,000 files (105,000,000 total at this point)
- Randomly remove 3,500 directories and their files
- Use fs_mark to add 3,500 directories with 1,000 files each to 3,500 randomly chosen existing directories
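The staged process above follows a simple pattern: each stage adds 21,000,000 files and then churns 700 more directories than the previous stage. A minimal Python sketch of that schedule is shown below. It is not the script used in the testing; the names `Stage` and `build_schedule` are hypothetical and exist only for illustration, and the sketch only computes the plan without invoking fs_mark or touching a file system.

```python
from dataclasses import dataclass
from typing import List

FILES_PER_STAGE = 21_000_000  # files fs_mark creates at the start of each stage
FILES_PER_DIR = 1_000         # files in every directory that is added
CHURN_STEP = 700              # churned directories grow by this amount per stage
STAGES = 5


@dataclass(frozen=True)
class Stage:
    """One stage of the plan (hypothetical helper type, not from the article)."""
    number: int
    nominal_total_files: int  # the "total at this point" figure from the list
    dirs_churned: int         # directories removed, then the same number re-added
    files_churned: int        # files written by the re-add step


def build_schedule(stages: int = STAGES) -> List[Stage]:
    """Return the create/remove/re-add plan for each stage."""
    schedule = []
    for k in range(1, stages + 1):
        dirs = CHURN_STEP * k
        schedule.append(
            Stage(
                number=k,
                nominal_total_files=FILES_PER_STAGE * k,
                dirs_churned=dirs,
                files_churned=dirs * FILES_PER_DIR,
            )
        )
    return schedule


if __name__ == "__main__":
    for s in build_schedule():
        print(
            f"stage {s.number}: {s.nominal_total_files:,} files nominally created, "
            f"remove and re-add {s.dirs_churned:,} directories "
            f"({s.files_churned:,} files re-added)"
        )
```

Over the five stages the plan removes and re-adds 10,500 directories in total. The number of files actually removed depends on which directories the random selection hits, which is why the nominal totals are only approximate.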
Because the directories are selected at random, some directories can end up with many more files than others. For the same reason, the total number of files won't be exactly 105,000,000, since random deletion and insertion do not cancel out precisely. If we count all of the files, including the "." and ".." entries, we find that the process created 115,516,127 files.
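For readers who want to reproduce this kind of count on their own tree, here is a rough Python sketch. The article does not say which tool produced the figure above, so the counting rule here is an assumption: every file is counted once, and each directory contributes two extra entries for "." and "..". The function name `count_entries` is hypothetical.

```python
import os


def count_entries(root: str) -> int:
    """Count files under root, plus "." and ".." for every directory.

    Assumed counting rule (not confirmed by the article): subdirectory names
    are not counted in their parent; each directory adds exactly two entries.
    """
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += 2               # the "." and ".." entries of this directory
        total += len(filenames)  # non-directory entries in this directory
    return total


if __name__ == "__main__":
    import sys

    # Count the tree given on the command line, or the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"{count_entries(target):,} entries under {target}")
```

On a tree with hundreds of millions of files a walk like this will itself take a long time, so it is best run once after the final stage.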
The table below lists the file system repair times in seconds for the standard matrix cases as specified in the previous article but with the new number of files. These times include all steps in the file system checking process.
|File System Size (TB)|Number of Files (Millions)|XFS xfs_repair Time (Seconds)|ext4 fsck Time (Seconds)|
The FSCK times for the additional tests are listed below:
- 415,000,000 file case: 11,324 seconds
- Fragmented case: 676 seconds
Notice that the 415,000,000 file case took 6.95 times longer than the 105,000,000 file case even though it had only about four times as many files. During the file system check the server did not swap, and no additional use of virtual memory was observed.
The "fragmented" case is interesting because it took less time to perform the file system check than the one-level directory case. The original case took 1,629 seconds and the fragmented case took only 676 seconds, about 2.4 times faster. Time did not allow investigating why this happened.
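The two comparisons above are simple ratios of the reported times. The short Python sketch below recomputes them from the numbers in the text; the variable names are my own, and the file counts are the nominal figures (415,000,000 and 105,000,000) rather than exact counts.

```python
# Repair times reported in the article, in seconds.
BASELINE_SECONDS = 1_629     # 105,000,000 file, one-level directory case
LARGE_SECONDS = 11_324       # 415,000,000 file case
FRAGMENTED_SECONDS = 676     # fragmented 105,000,000 file case

# Nominal file counts, in millions.
BASELINE_FILES_M = 105
LARGE_FILES_M = 415

# How much longer the large case took, and how many more files it had.
slowdown = LARGE_SECONDS / BASELINE_SECONDS
file_ratio = LARGE_FILES_M / BASELINE_FILES_M

# How much faster the fragmented case was than the original case.
speedup = BASELINE_SECONDS / FRAGMENTED_SECONDS

if __name__ == "__main__":
    print(f"time ratio, large vs. baseline:  {slowdown:.2f}")    # 6.95
    print(f"file ratio, large vs. baseline:  {file_ratio:.2f}")  # 3.95
    print(f"speedup, fragmented vs. baseline: {speedup:.2f}")    # 2.41
```

The time ratio is clearly larger than the file ratio, so repair time grew faster than linearly between these two data points. Two points are not enough to say what the scaling law is.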
In the next article in this series, Henry will write about his observations of the results. Please be sure to post your comments about these testing results.
A Big Thank You
At first glance it seemed simple: a vendor could provide about 80TB to 100TB of raw storage connected to a server for testing. This turned out not to be the case, and it was far more difficult than anticipated. I would be remiss if I didn't thank the people who made this possible. First, of course, Henry Newman, for pushing various vendors to help if they could. Thanks go to Paul Carl and Randy Kreiser from DDN, who greatly helped in giving me access to the hardware and helped with the initial hurdles that cropped up. Thanks also to Ric Wheeler, who answered several emails about using fs_mark and about Linux file systems in general. He has been a big supporter of this testing from the beginning. Thanks also to Andreas Dilger from Whamcloud, who provided great feedback and offered help all of the time.
Jeff Layton is the Enterprise Technologist for HPC at Dell, Inc., and a regular writer of all things HPC and storage.
Henry Newman is CEO and CTO of Instrumental Inc. and has worked in HPC and large storage environments for 29 years. The outspoken Mr. Newman initially went to school to become a diplomat, but was firmly told during his first year that he might be better suited for a career that didn't require diplomatic skills. Diplomacy's loss was HPC's gain.