Fixing SSD Performance Degradation, Part 2
It's a well-known fact that solid state disk (SSD) performance can suffer over time. This was quite common in early SSDs, but newer controllers have reduced the problem through a variety of techniques. In this second part of a two-part article examining performance degradation in SSDs, the rubber meets the road: we benchmark an enterprise-class SSD to measure its performance before the drive is heavily used, and after.
Review of SSD Performance Issues
In the first part of this article, we examined some of the origins of SSD performance problems. The nature of the design of floating gate transistors, the design of the underlying NAND chips, and the design of the SSDs are the source of the performance degradation problems as well as the performance benefits of SSDs.
In reviewing how SSDs are constructed, remember that SSDs are erased in units of blocks that are typically about 512KB in size, meaning that if a single page (4KB) within a particular block is changed, then the entire block has to be reprogrammed (rewritten). The reprogramming process uses a slow read-modify-erase-write cycle. That is, all of the data on the particular block first has to be read, typically into cache; then the modified data is merged with the cached data; then the block has to be erased; and finally, the updated data is written to the freshly erased block.
One would think that this cycle wouldn't take too long, since SSDs are very fast storage devices with no moving parts. However, SSD performance is asymmetric: erasing is several orders of magnitude slower than reading, and writing is almost an order of magnitude slower than reading. This makes the read-modify-erase-write cycle very slow, severely penalizing an otherwise extremely fast storage medium.
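The cost of this cycle can be sketched with some quick arithmetic. The latencies below are illustrative assumptions, not measurements of any particular drive; they are chosen only to reflect the asymmetry described above (erase much slower than write, write much slower than read):

```python
# Illustrative (assumed, not measured) per-operation NAND latencies,
# chosen only to show the asymmetry: erase >> write >> read.
READ_US = 25      # read one 4KB page (microseconds)
WRITE_US = 250    # program one 4KB page
ERASE_US = 2000   # erase one 512KB block

PAGES_PER_BLOCK = 512 * 1024 // (4 * 1024)  # 128 pages per 512KB block

def in_place_update_cost():
    """Cost of updating one page when its block must be rewritten:
    read every page, erase the block, then rewrite every page."""
    read = PAGES_PER_BLOCK * READ_US
    erase = ERASE_US
    write = PAGES_PER_BLOCK * WRITE_US
    return read + erase + write

def clean_write_cost():
    """Cost of writing one page to an already-erased location."""
    return WRITE_US

print(in_place_update_cost())  # 37200 us for a single 4KB update
print(clean_write_cost())      # 250 us
```

Even with these rough numbers, a 4KB update that triggers the full cycle costs roughly 150 times more than a write to a clean page.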
Even worse, the read-modify-erase-write cycle leads to something called write amplification. This refers to the fact that a simple change or update to a page inside a block leads to additional write cycles being used. The number of writes that have to occur to write a particular chunk of data to the storage medium has a value of 1 when no additional writes are needed (this is typical for hard drives). A larger value means that more than one write has to happen to commit the data. This measure is commonly called the "write amplification factor," and early SSDs had values well above 1. However, don't forget that SSDs have limited write cycles (Single Level Cell (SLC) has approximately 100,000 cycles and Multi Level Cell (MLC) approximately 10,000), so write amplification factors greater than 1 prematurely age an SSD (i.e., they use up the write cycles faster than they should).
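As a quick sketch of the definition, the write amplification factor is simply the bytes actually programmed to flash divided by the bytes the host asked to write. The worst-case numbers below follow the page and block sizes used in this article:

```python
def write_amplification(flash_bytes_written, host_bytes_written):
    """Write amplification factor: bytes actually programmed to NAND
    divided by bytes the host requested. 1.0 is the ideal value."""
    return flash_bytes_written / host_bytes_written

# Worst case from the text: the host updates one 4KB page, but the
# controller must rewrite the entire 512KB block containing it.
waf = write_amplification(512 * 1024, 4 * 1024)
print(waf)  # 128.0
```

A factor of 128 means each small update burns 128 pages' worth of the drive's limited write cycles, which is why controllers work so hard to keep this number close to 1.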
Solid state disk manufacturers and vendors have known about these problems for some time and have developed several techniques to improve their performance in light of their limitations as well as reduce write amplification which can reduce the life of an SSD.
One technique is called Write Combining, where the controller collects multiple writes before committing them to the block(s) within the SSD. The goal is to combine several small writes into a single larger write, with the hope that neighboring pages of data are likely to be changed at the same time and that these pages really belong to the same file. It can greatly reduce the write amplification factor, improving write performance, but it depends upon how the data is sent to the drive and whether the data chunks are part of the same file, or are likely to be changed or erased at the same time.
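A toy model shows the idea. This is a deliberate simplification (it ignores logical addressing and flush-on-timeout entirely; the class and its names are hypothetical, not any vendor's design), but it captures how buffering small writes until a full page accumulates cuts the number of page programs:

```python
class WriteCombiner:
    """Toy sketch of write combining: buffer small host writes and
    flush them to flash as one page-sized program operation. Real
    controllers are far more sophisticated than this."""

    PAGE_SIZE = 4096  # 4KB flash page

    def __init__(self):
        self.buffer = bytearray()
        self.page_writes = 0  # pages actually programmed to flash

    def write(self, data: bytes):
        self.buffer.extend(data)
        while len(self.buffer) >= self.PAGE_SIZE:
            self._flush_page()

    def _flush_page(self):
        del self.buffer[:self.PAGE_SIZE]  # pretend this hit the flash
        self.page_writes += 1

combiner = WriteCombiner()
for _ in range(8):
    combiner.write(b"x" * 512)  # eight small 512-byte host writes

print(combiner.page_writes)  # 1 page program instead of 8
```

Eight 512-byte writes land in one page program; without combining, each small write could trigger its own page (or worse, block) rewrite.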
Another technique, called Over-Provisioning, keeps a certain number of blocks in reserve and doesn't expose them to the OS. For example, if the SSD has 75GB of raw capacity, perhaps only 60GB of it will be exposed to the OS. These reserved blocks can be used for the general pool of available blocks, without the knowledge of the OS, to help performance. The "extra" blocks increase the size of the block pool, guaranteeing that the pool never runs out of blank blocks; otherwise, an application would stall waiting for a read-modify-erase-write cycle to complete before its data could actually be written to the SSD. This concept also has benefits for longevity: if a particular block has fewer write cycles remaining than the others, it can be swapped with a block in the reserved pool that has seen much less use. This helps the overall wear-leveling of the SSD.
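Over-provisioning is usually quoted as a percentage of the user-visible capacity. A minimal sketch, using the 75GB/60GB split from the example above (note that some vendors instead quote the reserve as a fraction of raw capacity, which would give a smaller number):

```python
def over_provisioning_pct(raw_gb, exposed_gb):
    """Over-provisioning expressed as a percentage of the capacity
    visible to the OS: (raw - exposed) / exposed * 100."""
    return 100.0 * (raw_gb - exposed_gb) / exposed_gb

# The hypothetical 75GB drive exposing only 60GB:
print(over_provisioning_pct(75, 60))  # 25.0
```

So the example drive carries a 25% reserve relative to what the user sees, which is the pool the controller can draw on for blank blocks and wear-leveling swaps.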
A third technique, eagerly awaited for some time, is the TRIM command. Recall the big performance problem: when a write targets a page that has not yet been erased, the entire block containing that page has to be read into cache, the new data merged with the existing data, the block erased (which takes quite a bit of time), and the updated block written back from cache. This read-modify-erase-write process takes much more time than a plain write would. The TRIM command tells the SSD controller when a page is no longer needed so that it can be flagged for erasing. The controller can then write new data to a "clean" page, avoiding the entire read-modify-erase-write cycle (the cycle just becomes "write"). Thus, overall write performance is much better.
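The effect of TRIM on the controller's bookkeeping can be sketched with a toy flash translation layer. This is a hypothetical simplification (real FTLs track physical pages, garbage collection, and much more), but it shows why knowing a page is dead lets the controller skip the slow path:

```python
class ToyFTL:
    """Toy flash translation layer illustrating why TRIM helps.
    Without TRIM, the controller must treat every previously written
    page as live and pay the read-modify-erase-write cost on an
    overwrite; with TRIM, it knows the old data is garbage and can
    simply write to a clean page."""

    def __init__(self):
        self.live = set()     # logical pages the controller believes hold data
        self.slow_cycles = 0  # read-modify-erase-write cycles performed
        self.fast_writes = 0  # plain writes to clean pages

    def trim(self, page):
        # OS tells the drive: this page's contents are deleted.
        self.live.discard(page)

    def write(self, page):
        if page in self.live:
            self.slow_cycles += 1  # overwrite of a "live" page: slow path
        else:
            self.fast_writes += 1  # clean write: fast path
            self.live.add(page)

ftl = ToyFTL()
ftl.write(7)   # first write: clean, fast
ftl.write(7)   # overwrite with no TRIM: full slow cycle
ftl.trim(7)    # the file was deleted; the OS issues TRIM
ftl.write(7)   # clean, fast write again

print(ftl.fast_writes, ftl.slow_cycles)  # 2 1
```

Without the `trim()` call, the third write would also have taken the slow path; the TRIM command is precisely what lets the controller convert that overwrite back into a plain write.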
All of these techniques, plus others, have been incorporated into SSDs with varying degrees of success. As with everything, there are trade-offs to be made in the design of an SSD. The techniques described above require a more sophisticated controller, which costs more to design and manufacture. Over-provisioning means that you don't get to use the entire capacity of the drive, increasing its apparent cost. So SSD manufacturers combine these various techniques while watching overall performance and price when they design a new SSD.
In this second part of the article series, I want to test an enterprise-class drive, an Intel X25-E, to see how well it performs over time, as a way to understand the impact of these techniques on SSD performance. In particular, I will run some benchmarks on a brand new, clean drive and then run some I/O-intensive tests against it. Immediately following the I/O tests, I will re-run the benchmarks to look for signs of degraded performance.
Benchmarking Approach and Setup
The old phrase "if you're going to do it, do it right" definitely rings true for benchmarking. All too often, storage benchmarks are nothing more than marketing materials providing very little useful information. So in this article I will follow a set of practices that should improve the quality of the benchmarks. In particular, I will follow this advice:
- The motivation behind the benchmarks will be explained (if it hasn't already)
- Relevant and useful storage benchmarks will be used
- The benchmarks will be detailed as much as possible
- The tests will run for more than 60 seconds
- Each test is run 10 times and the average and standard deviation of the results are reported
These basic steps and techniques can make benchmarking much more useful.