Planning for solid state drive (SSD) hardware is a complex task that requires a good understanding of how SSDs are going to be used within your storage networking environment, so it is best to start with the first part of this series to understand application and software issues (see Solid State Drives in Enterprise Applications).
There are two basic types of interfaces for SSD hardware: PCIe and Fibre Channel/SAS/SATA. Each of these types of interfaces has strengths and weaknesses, and these tradeoffs are important to understand.
Another issue is solid state drive management. The question is whether SSDs should be placed in RAID controllers, or whether they should be connected via another interface. Of course, SSD performance is a big issue, or you wouldn't even be considering purchasing them. We have all heard about wear leveling, write performance degradation and other performance issues found in some flash-based SSDs. There has been a great deal written on these topics, so I am just going to cover some important details to understand. And last but not least is the question of SSD reliability. As always, let's start with reliability, as few things matter more to enterprise data storage users.
We all have heard that there is a limit to the number of times you can write to a flash cell. This number varies dramatically depending on the flash vendor, the generation of the flash technology, and other factors. The claimed range is from more than 100,000 writes per cell to more than a million, but flash cell reliability is not the whole story for SSDs, and we'll touch on that more in the next article. Another critical area for reliability is the amount of Error Correction Code (ECC) used within the SSD, which in turn influences the hard error rate of the device. The hard error rate is usually specified for disk drives as the number of sectors in error per number of bits read. For enterprise-class SAS/FC drives, this number is currently one sector in error per 10E16 bits. This translates to 512 bytes in error for every 11 PB moved. Obviously, this is a big number, but if you have 1,000 disk drives each moving data at 100 MB/sec, you can expect a hard error about every 33 hours.
That's not a very long time, and SSDs operate at much higher rates than disk drives, so this might be a concern, but the likelihood of having 1,000 SSDs is low, at least for now. But the real question to ask is what the hard error rate is and how the vendor arrived at that value. The major disk manufacturers understand the underlying ECC issues and have the reliability engineers needed to calculate these values based on their media design and low-level error issues. With all the new SSD vendors, the question I always ask in addition to the hard error rate is how the number was calculated. If they do not have an answer, and it is not at least equal to enterprise disk drives, then I move on.
Also, keep in mind that just as with disk drives, SSDs have different hard error rates for SAS/FC and for SATA. (There are not very many FC SSDs, as SSDs are new technology and the industry trend for drive interfaces is SAS.) Enterprise SATA disk drives have a hard error rate an order of magnitude worse than SAS/FC (one sector in error per 10E15 bits), and non-enterprise SATA is an order of magnitude worse than that, at one sector in error per 10E14 bits. One of the reasons that SAS/FC drives have smaller capacities than SATA drives is that they carry more ECC per drive, which reduces density because storage space is traded for ECC space.
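The arithmetic behind the 11 PB and 33-hour figures can be sketched as follows. This is only an illustration: it assumes that "10E16" is read literally as 10 × 10^16 bits and that PB and MB are binary units (2^50 and 2^20 bytes), readings chosen here because they reproduce the figures quoted above. The function names are made up for the example.

```python
PIB = 2**50  # assumption: "PB" treated as a binary petabyte
MIB = 2**20  # assumption: "MB" treated as a binary megabyte


def bytes_per_hard_error(bits_per_sector_error: float) -> float:
    """Bytes read, on average, per sector in error."""
    return bits_per_sector_error / 8


def hours_between_errors(bits_per_sector_error: float,
                         drives: int,
                         mib_per_sec_per_drive: float) -> float:
    """Expected hours between hard errors across a whole population of drives."""
    aggregate_bytes_per_sec = drives * mib_per_sec_per_drive * MIB
    seconds = bytes_per_hard_error(bits_per_sector_error) / aggregate_bytes_per_sec
    return seconds / 3600


# Hard error rates as written in the text, read literally (10E16 == 10 * 10**16).
rates = {
    "SAS/FC": 10e16,
    "enterprise SATA": 10e15,
    "non-enterprise SATA": 10e14,
}

for name, bits in rates.items():
    pib = bytes_per_hard_error(bits) / PIB
    hours = hours_between_errors(bits, drives=1000, mib_per_sec_per_drive=100)
    print(f"{name}: {pib:.2f} PiB per hard error, "
          f"{hours:.2f} hours between errors for 1,000 drives")
```

Each order of magnitude in the hard error rate shortens the interval between errors by the same factor, which is why the SATA classes look so much worse at this scale.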
In addition to the hard error rate, there are two other factors to consider: the annualized failure rate and the write budget per day.
The hard error rate measures the rate of failure of the media, while the annualized failure rate is usually based on the failure of other components. It is usually specified as a percentage per year. In enterprise disk drives, for example, the value is less than 1 percent.
Most SSDs have a limit on the number of writes that can be done, based on the number of times a cell can be written to before it fails. The write budget is generally based on the performance of the device, the number of times a cell can be written to before it is expected to fail, and the number of spare cells that can be remapped to replace cells that have exceeded their write limit. Write budgets vary widely between enterprise and non-enterprise devices, so it is important to understand the write budget in terms of the type of SSD and the performance of the interface. Your write budget needs will differ greatly depending on whether you are using 3 Gb/sec SAS (375 MB/sec raw line rate) or a 16-lane PCIe 2.0 slot (8 GB/sec). We'll return to this issue again in the next article.
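To get a feel for why interface speed matters to the write budget, the sketch below estimates how long a device would last if it were written continuously at full interface speed. The 100 GB capacity and 10 percent spare area are hypothetical, the 100,000 writes per cell is the low end of the claimed range mentioned earlier, and the model assumes ideal wear leveling with no write amplification, which real devices will not achieve.

```python
def days_until_worn_out(capacity_gb: float,
                        writes_per_cell: int,
                        write_mb_per_sec: float,
                        spare_fraction: float = 0.0) -> float:
    """Days of continuous writing before every cell reaches its write limit.

    Assumes ideal wear leveling and no write amplification (a simplification).
    """
    total_writable_mb = capacity_gb * 1000 * (1 + spare_fraction) * writes_per_cell
    seconds = total_writable_mb / write_mb_per_sec
    return seconds / 86400


# Hypothetical 100 GB device rated at 100,000 writes per cell, 10% spare cells.
sas_days = days_until_worn_out(100, 100_000, 375, spare_fraction=0.10)
pcie_days = days_until_worn_out(100, 100_000, 8000, spare_fraction=0.10)

print(f"3 Gb/sec SAS (375 MB/sec): {sas_days:.0f} days")
print(f"PCIe 2.0 x16 (8 GB/sec):   {pcie_days:.0f} days")
```

The same device wears out more than 20 times faster behind the faster interface if the application can actually drive it that hard, so the write budget has to be judged against the sustained write rate you expect.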
The performance of SSD devices varies widely and is usually measured in MB/sec, IOPS, or both, for reads and for writes. One important question to ask about IOPS is what request size the claimed IOPS figure was measured with. Read and write performance can differ significantly, although some vendors' write performance is much closer to their read performance than others'. This is why it is important to understand your application requirements (see part 1 for that).
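The reason the request size matters is that an IOPS figure only turns into throughput once it is multiplied by the size of each request. A minimal sketch, using a hypothetical claim of 50,000 IOPS and decimal megabytes:

```python
def throughput_mb_per_sec(iops: float, request_bytes: int) -> float:
    """Throughput implied by an IOPS figure at a given request size (decimal MB)."""
    return iops * request_bytes / 1_000_000


claimed_iops = 50_000  # hypothetical vendor claim
for request_bytes in (512, 4096, 65536):
    mb = throughput_mb_per_sec(claimed_iops, request_bytes)
    print(f"{claimed_iops} IOPS at {request_bytes} bytes = {mb:.1f} MB/sec")
```

The same headline number describes very different devices depending on the request size, and a device is unlikely to sustain the same IOPS as requests get larger, so ask for figures at the request sizes your application actually issues.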
The performance of both SAS and SATA SSDs often depends on whether they are using 3 Gb/sec or 6 Gb/sec technology. This is an important performance limitation that does not affect disk drives, as individual drives are not fast enough to require 6 Gb/sec, although most of the drive vendors are moving to 6 Gb/sec so that the channel can run at that speed and more drives can be put on the same channel. Using 6 Gb/sec technology is critical for enterprise SSDs given the speed of the devices.
For those SSDs using PCIe interfaces, you need to ensure that the SSD performance cannot outrun the PCIe slot. With PCIe 2.0 available in most small servers (Intel and AMD), 8-lane and even 16-lane PCIe 2.0 slots are common. High-end non-Intel/AMD (IBM and Sun) servers generally lag in PCIe implementation, given the time it takes to design the complex memory interconnect that is usually associated with these enterprise servers. If you are thinking of PCIe-based SSDs, make sure that the server you are planning on using has PCIe slots equal to or greater than what is required by the vendor.
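A quick way to sanity-check a PCIe slot against an SSD is to compare lane counts and bandwidth. The sketch below assumes PCIe 2.0 at 500 MB/sec per lane in each direction, which is consistent with the 8 GB/sec quoted for a 16-lane slot; the function names and the example SSD figures are hypothetical.

```python
PCIE2_MB_PER_LANE = 500  # PCIe 2.0, per direction, after 8b/10b encoding


def slot_bandwidth_mb(lanes: int) -> int:
    """Peak one-direction bandwidth of a PCIe 2.0 slot in MB/sec."""
    return lanes * PCIE2_MB_PER_LANE


def slot_is_sufficient(slot_lanes: int,
                       required_lanes: int,
                       ssd_peak_mb_per_sec: float) -> bool:
    """True if the slot meets the vendor's lane requirement and the SSD cannot outrun it."""
    return (slot_lanes >= required_lanes
            and slot_bandwidth_mb(slot_lanes) >= ssd_peak_mb_per_sec)


# Hypothetical SSD: vendor requires an 8-lane slot, peak transfer 1,500 MB/sec.
for lanes in (4, 8, 16):
    ok = slot_is_sufficient(lanes, required_lanes=8, ssd_peak_mb_per_sec=1500)
    print(f"x{lanes} slot ({slot_bandwidth_mb(lanes)} MB/sec): "
          f"{'OK' if ok else 'insufficient'}")
```

Note that the 4-lane slot fails here even though its bandwidth exceeds the SSD's peak rate, because the vendor's stated lane requirement is checked separately.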
There is no official standard for SMART for SSDs, either SAS or SATA. This means that you must understand the implication and meaning of the proprietary SMART data for each SSD. Of course, if the SSD is in a RAID array, the RAID vendor has done all of this for you during qualification of the SSD. The key problem I see for people using SSDs attached to standard SAS cards is that some vendors do not provide tools to look at the SMART data, and even if you get the SMART data through some freeware tool, you do not know what the values really mean. For the vendors that I have reviewed, if they have a tool, then the tool has the definitions of the SMART values and often has a set of thresholds for alerts and alarms. This is a very good thing in enterprise data storage environments, of course, as you really want to know before a device fails so you can take action.
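The kind of threshold table a vendor tool carries can be pictured as follows. Everything in this sketch is hypothetical: the attribute names, the threshold values, and the function are invented for illustration, since the real definitions are proprietary to each SSD vendor.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    description: str
    warn_below: int   # raise an alert when the value drops below this
    alarm_below: int  # raise an alarm when the value drops below this


# Hypothetical vendor-supplied definitions; real attributes and limits differ per vendor.
VENDOR_THRESHOLDS = {
    "spare_blocks_remaining_pct": Threshold("Spare blocks left for remapping", 20, 5),
    "write_budget_remaining_pct": Threshold("Rated write endurance remaining", 15, 5),
}


def classify(attribute: str, value: int) -> str:
    """Return 'ok', 'alert', 'alarm', or 'unknown' for one SMART reading."""
    threshold = VENDOR_THRESHOLDS.get(attribute)
    if threshold is None:
        # Without the vendor's definition, the raw value cannot be interpreted.
        return "unknown"
    if value < threshold.alarm_below:
        return "alarm"
    if value < threshold.warn_below:
        return "alert"
    return "ok"


readings = {"spare_blocks_remaining_pct": 12, "vendor_attr_231": 97}
for attribute, value in readings.items():
    print(f"{attribute} = {value}: {classify(attribute, value)}")
```

The "unknown" case is the situation described above: a freeware tool can hand you the number, but without the vendor's definitions you cannot tell whether it signals a problem.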
SSD Hardware Interfaces
The PCIe hardware interface is easy to understand compared to SAS/SATA, as PCIe will be much faster, but neither PCIe nor a direct SATA interface will allow failover from one machine to another. There are some vendors that provide external PCIe channel extenders and allow you to mirror the device through a volume manager, but I am not a fan of this approach, given that it is not a tried and true method that has worked at PCIe 2.0 speeds with all of the potential N-case issues. Maybe I'm overly paranoid, but I do not want to be the first one to try this in a production enterprise environment for a use such as file system metadata.
The tradeoffs for SAS compared to SATA are pretty simple and are no different than disk drives, but as SSDs are faster than regular disk drives, the differences are more pronounced. Here are some of the major differences:
- SATA drives are not dual-ported, so failover is an issue
- SATA does more of its error processing in the driver, so retries and other command-processing problems make the drive slower
- The SATA channel has a higher undetected error rate than the SAS channel because of the difference in the amount of ECC in the command packet. This is a significant issue, as the difference is likely four orders of magnitude.
If you have an enterprise application, SAS is the way to go.
Choosing the right SSD hardware type for your application is not that difficult, but it's a critical part of architectural design. The final article in this series will address the internal design of SSDs, along with some of the issues surrounding their use in RAID devices or connected to SAS controller cards.
Henry Newman, CTO of Instrumental Inc. and a regular Enterprise Storage Forum contributor, is an industry consultant with 28 years experience in high-performance computing and storage.