
Our FSCK Test vs. Others - Page 2

Comparison to Other Tests

To understand whether the file-processing rates we measured during the fsck are comparable to other tests, Table 4 pulls data from several presentations by Ric Wheeler of Red Hat. Ric has given a number of presentations about putting 1 billion files into a file system and running various performance tests against it. Table 4 contains a quick description of each test, including the hardware and distribution when available, the number of files in the fsck test, and the fsck time for the various file systems.

Table 4: fsck times for the various file system sizes, numbers of files, and file systems from Ric Wheeler's presentations. The rates in parentheses are files processed per second.

Configuration | Number of Files (millions) | ext3 - fsck time (secs) | ext4 - fsck time (secs) | XFS - xfs_repair time (secs) | btrfs - fsck time (secs)
1 SATA drive (2010), unknown OS and kernel | 1 | 1,070 (934.6 files/s) | 40 (25,000 files/s) | 40 (25,000 files/s) | 90 (11,111.1 files/s)
1 PCIe drive (2010), unknown OS and kernel | 1 | 70 (14,285.7 files/s) | 3 (333,333.3 files/s) | 4 (250,000 files/s) | 11 (90,909.1 files/s)
1 SATA drive (2011), RHEL 6.1 alpha, 2.6.38.3-18 kernel | 1,000 (zero length) | NA | 3,600 (277,777.8 files/s) | 54,000 (18,518.5 files/s) | NA
16 TB, 12 SAS drives, hardware RAID (2011), RHEL 6.1 alpha, 2.6.38.3-18 kernel | 1,000 (zero length) | NA | 5,400 (185,185.18 files/s) | 33,120 (30,193.2 files/s) | NA
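
The rates in parentheses in Table 4 are not separate measurements; each one is simply the number of files divided by the reported run time. A minimal Python sketch, with the values hardcoded from the table, reproduces them:

```python
# Reproduce the files/s rates in Table 4: each rate is just the number of
# files divided by the reported fsck/repair time. File counts are in millions,
# exactly as listed in the table.
entries = [
    # (configuration, file system, files in millions, time in seconds)
    ("1 SATA drive (2010)",           "ext3",  1,     1_070),
    ("1 SATA drive (2010)",           "ext4",  1,     40),
    ("1 SATA drive (2010)",           "XFS",   1,     40),
    ("1 SATA drive (2010)",           "btrfs", 1,     90),
    ("1 PCIe drive (2010)",           "ext3",  1,     70),
    ("1 PCIe drive (2010)",           "ext4",  1,     3),
    ("1 PCIe drive (2010)",           "XFS",   1,     4),
    ("1 PCIe drive (2010)",           "btrfs", 1,     11),
    ("1 SATA drive (2011)",           "ext4",  1_000, 3_600),
    ("1 SATA drive (2011)",           "XFS",   1_000, 54_000),
    ("12 SAS drives, HW RAID (2011)", "ext4",  1_000, 5_400),
    ("12 SAS drives, HW RAID (2011)", "XFS",   1_000, 33_120),
]

for config, fs, millions, secs in entries:
    rate = (millions * 1_000_000) / secs
    print(f"{config:32s} {fs:5s} {rate:12,.1f} files/s")
```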


The rate at which XFS processed files during the file system check ranged from 18,518.5 files/s for a single drive running an alpha version of RHEL 6.1 (2.6.38.3-18 kernel) with 1 billion zero-length files, up to 250,000 files/s for a single PCIe drive with an unknown distribution (presumably Fedora or Red Hat) and 1 million files. The result most comparable to our test, however, is the 30,193.2 files/s achieved with 12 SAS drives behind a hardware RAID controller and 1 billion zero-length files, again on an alpha version of RHEL 6.1 with the 2.6.38.3-18 kernel.

According to David Chinner, who did much of the testing above, the low file-processing rates are the result of limited memory in the host systems. The next-to-last case had only 2GB of memory, and the last case only 8GB. In reviewing this article, David went on to say:

What you see here in these last two entries is the effect of having limited RAM to run xfs_repair. They are 2GB for the single SATA case, and 8GB for the SAS RAID case. In each case, there isn't enough RAM for xfs_repair to run in multi-threaded mode, so it is running in its old, slow, single-threaded mode that doesn't do any prefetching at all. If it had 24GB RAM like the tests you've run, the performance would have been similar to what you have achieved.

Since neither case has enough memory for multi-threaded operation, xfs_repair fell back to single-threaded behavior, which limited performance. According to David, more memory allows xfs_repair to run multi-threaded and to prefetch, which greatly improves performance. If you want a fast repair, at least in the case of XFS, add more memory to the host node.
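
As a rough illustration of that advice, the sketch below simply reads the host's total memory from /proc/meminfo and compares it against the memory sizes discussed above. The 2GB/8GB/24GB figures are only the data points from this article, not documented xfs_repair thresholds.

```python
# Hedged sketch: compare host RAM against the memory sizes discussed above.
# The 2GB / 8GB / 24GB figures are just the data points from this article
# (xfs_repair ran single-threaded at 2GB and 8GB for roughly a billion inodes,
# but multi-threaded at 24GB); they are not official thresholds.

def total_ram_gib(meminfo_path="/proc/meminfo"):
    """Return total system memory in GiB (Linux only)."""
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemTotal:"):
                kib = int(line.split()[1])  # /proc/meminfo reports the value in kB
                return kib / (1024 * 1024)
    raise RuntimeError("MemTotal not found in /proc/meminfo")

if __name__ == "__main__":
    ram = total_ram_gib()
    print(f"Total RAM: {ram:.1f} GiB")
    if ram <= 8:
        print("In the 2-8 GB range where the tests above saw single-threaded xfs_repair.")
    else:
        print("Closer to the 24 GB host where xfs_repair ran multi-threaded with prefetching.")
```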

On the other hand, ext4 posted a much higher rate of files processed during the fsck. Performance ranged from 25,000 files/s for a single drive with an unknown distribution (presumably Fedora or Red Hat) and 1 million files, up to 333,333.3 files/s for a single PCIe drive with an unknown distribution (presumably Fedora or Red Hat) and 1 million files. The result most comparable to our test, however, is the 185,185.18 files/s achieved with 12 SAS drives behind a hardware RAID controller and 1 billion zero-length files, on an alpha version of RHEL 6.1 with the 2.6.38.3-18 kernel. Notice, though, that the rate of files processed during the file system check decreases as the number of disks increases (compare the third-row and fourth-row results for ext4).

In addition, David commented on the ext4 results by stating the following:

300,000 files/s is about 75MB/s in sequential IO, easily within the reach of a single drive. But seeing as it didn't go any faster on a large, faster RAID storage capable of 750MB/s for sequential read IO, indicates that e2fsck is either CPU bound or sequential IO request bound. i.e. it's been optimized for the single disk case and can't really make use of the capabilities of RAID storage. Indeed, it went slower most likely because there is a higher per-IO cpu cost for the RAID storage (more layers, higher IO latency).
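
A quick back-of-envelope check of that 75MB/s figure: if each zero-length file costs roughly one inode read, and we assume 256-byte inodes (the ext4 default inode size, which is our assumption and not something stated in the quote), then 300,000 files/s works out to about 75MB/s, and a 750MB/s array would in principle support roughly ten times that file rate:

```python
# Back-of-envelope check of the "300,000 files/s is about 75MB/s" figure.
# Assumes each zero-length file costs roughly one inode read and that inodes
# are 256 bytes (the ext4 default inode size) -- an assumption on our part,
# not a number given in the article.
files_per_sec = 300_000
bytes_per_inode = 256            # assumed ext4 inode size

implied_mb_s = files_per_sec * bytes_per_inode / 1_000_000
print(f"Implied sequential read rate: ~{implied_mb_s:.0f} MB/s")  # ~77 MB/s

# By the same arithmetic, a RAID array capable of ~750MB/s of sequential
# reads could in principle sustain a much higher file rate:
raid_mb_s = 750
print(f"Rough ceiling for the RAID case: ~{raid_mb_s * 1_000_000 / bytes_per_inode:,.0f} files/s")
```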

