You have probably watched Happy Days at some point in your life (even Mark Watney watched it while stranded on Mars). In its early years it was generally considered a good show and had good ratings. But then came the episode in which the Fonz jumped over a shark on water skis. After that episode, Happy Days never recovered. The show stayed on the air, but the ratings continued to drop. Thus the phrase "jumping the shark" was coined. While it originally applied to television programs, it has been extended to "... indicating the moment when a brand, design, franchise or creative effort's evolution declines."
Storage technology is evolving rapidly. New storage media, solutions, and tools that hold a great deal of promise for pushing the technology forward are being developed. It is likely that some current technologies will not survive in their current form. Which technologies may jump the shark is the subject of this article.
Taking current trends and making predictions from them is always a dicey proposition. However, there are some trends that will have a clear impact on hard drives.
NVRAM and Burst Buffers
While hard drives are still being produced at a very high rate, other storage technologies are catching up in different categories. Solid State Drives (SSDs) are becoming increasingly popular, and now Intel/Micron's 3D XPoint NVRAM (Non-Volatile RAM) will provide a completely new way to store data.
3D XPoint will look like DIMMs and sit in the system DIMM slots. With regular DRAM memory, the data in memory is lost if the system is turned off. However, for NVRAM, the system can be turned off and the memory state remains in the memory (hence the label "non-volatile"). Rather than write data to a conventional storage device, the data can be left in memory and shared with other applications. This can improve application performance because an application doesn't have to read data from conventional storage — the data is already in memory, and the application just needs a pointer to its location. Moreover, NVRAM will usually come in terabyte (TB) quantities for systems, instead of gigabytes (GB), as is the case for DRAM.
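The byte-addressable persistence described above can be sketched with a memory-mapped file standing in for an NVRAM region. This is purely an illustration: the file path and sizes are hypothetical, and real NVRAM would be DIMM-resident memory rather than a disk-backed file, but the programming model — write through a pointer, and the data survives without an explicit write to a storage device — is similar:

```python
import mmap
import os
import tempfile

# Hypothetical stand-in for an NVRAM region: a file mapped into the
# process address space. With real NVRAM, the mapping would target
# persistent memory in the DIMM slots, not disk-backed storage.
path = os.path.join(tempfile.mkdtemp(), "nvram_region")

# Create and size the backing region (4 KiB here; real NVRAM is TBs).
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

# One application writes data directly into the mapped region.
with open(path, "r+b") as f:
    region = mmap.mmap(f.fileno(), 4096)
    region[0:14] = b"simulated data"
    region.flush()  # analogous to flushing CPU caches to NVRAM
    region.close()

# Another application (or the same system after a power cycle) sees
# the data without reading it from conventional storage -- it simply
# maps the same region and dereferences it.
with open(path, "r+b") as f:
    region = mmap.mmap(f.fileno(), 4096)
    print(region[0:14].decode())
    region.close()
```

The second mapping gets the data "for free" — no read from a storage device, just a pointer into the persistent region.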
Alternatively, some or all of the NVRAM can be used as storage (a block device) allowing the creation of a "burst buffer." Data can be quickly copied from DRAM to the NVRAM burst buffer because it's inside the system. Theoretically, the state of the system can be stored in the burst buffer while the memory contents are still stored in NVRAM. Then the system can be power cycled, the state can be read from the burst buffer, and the system will resume its previous state. While this can be done today, the fact that the memory contents stay in memory means that the restart to the last state is very, very fast.
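The checkpoint/restart cycle described above might be sketched as follows. All names here are hypothetical, and the burst buffer is modeled as an ordinary directory; on a real system it would be the NVRAM block device:

```python
import json
import os
import tempfile

# Hypothetical burst-buffer mount point; on a real system this would
# be the NVRAM exposed as a block device, not a temporary directory.
BURST_BUFFER = tempfile.mkdtemp()

def checkpoint(state, name="app.ckpt"):
    """Copy the application's in-memory state to the burst buffer.
    Because the buffer sits on the memory bus, this copy is much
    faster than writing to an external parallel file system."""
    path = os.path.join(BURST_BUFFER, name)
    with open(path, "w") as f:
        json.dump(state, f)
    return path

def restart(name="app.ckpt"):
    """Read the saved state back, e.g., after a power cycle."""
    with open(os.path.join(BURST_BUFFER, name)) as f:
        return json.load(f)

# Save the state, then restore it as a restart would.
state = {"step": 1200, "values": [0.1, 0.2, 0.3]}
checkpoint(state)
restored = restart()
```

The key point is not the file I/O itself but where the buffer lives: on the memory bus, the copy approaches memory speeds rather than storage-network speeds.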
The key to burst buffers is the extremely high bandwidth because the storage is on the memory bus. The current projections are that NVRAM won't be as fast as regular memory but it will be faster than SSDs. It will also cost less than DRAM but be more expensive than SSDs. NVRAM will first go on the market in 2016 or 2017 in HPC systems.
The introduction of NVRAM reduces the performance and capacity gap between main memory and an external file system. The figure below, courtesy of The Next Platform, outlines the storage hierarchy before and after the advent of NVRAM for a new HPC system at Los Alamos named Trinity. The image is from a talk given by Gary Grider of Los Alamos.
Trinity is expected to have a peak performance of more than 40 PetaFLOPS. It is also expected to have an 80 Petabyte (PB) parallel file system with a sustained bandwidth of 1.45 Terabytes/s, and a burst buffer file system that is 3.7 PB in capacity with a sustained bandwidth of 3.3 TB/s [Note: It will have a memory capacity of about 2PB, so the burst buffer can easily hold the entire contents of memory].
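These figures imply a large difference in the time needed to drain memory to each tier. A back-of-the-envelope calculation using the capacities and bandwidths quoted above:

```python
# Trinity figures quoted above (capacities in TB, bandwidths in TB/s).
memory_tb = 2000.0    # ~2 PB of main memory
bb_bw = 3.3           # burst buffer sustained bandwidth (TB/s)
pfs_bw = 1.45         # parallel file system sustained bandwidth (TB/s)

# Time to copy the entire memory contents to each tier.
t_bb = memory_tb / bb_bw    # ~606 s, roughly 10 minutes
t_pfs = memory_tb / pfs_bw  # ~1,379 s, roughly 23 minutes

print(f"burst buffer: {t_bb:.0f} s, parallel file system: {t_pfs:.0f} s")
```

So a full-memory checkpoint to the burst buffer completes in roughly ten minutes, versus more than twenty to the parallel file system — and the 3.7 PB buffer comfortably holds the ~2 PB memory image.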
Notice how the burst buffer storage (NVRAM) in Trinity (on the right in the diagram) has a bandwidth that is 2-6 times that of the parallel file system but still lower than main memory. However, when the power is turned off, the data is not lost as it is with DRAM.
In Trinity, Los Alamos has introduced burst buffers into the storage hierarchy, as well as something new they refer to as "campaign storage." The campaign storage layer sits below the parallel file system and above the archive layer. It has about 1/10th the performance of the parallel file system but presumably a greater capacity than the layers above it. It is intended to hold data longer (months to years) and is flushed less frequently.
The right-hand diagram, which is the storage stack for Trinity, is projected to become a very common hierarchy for HPC systems in the next few years. The prior hierarchy, on the left-hand side of the diagram, had only two layers of storage; now users have to contend with four.