Over the last few months, the issue of long-term tape archiving has come up in my work several times.
Each time the issue comes up, I hear from supposedly smart people, "We will put it on tape and migrate it in 15 years, one-half of the shelf life of the tape." The people saying this hold senior technology positions (VP of Engineering and Technology for Blah Blah, etc.). Often I hear this from organizations that have large digital archives or large analog archives that they plan to digitize. The problem is that you can't always take a tape that is 20 years old, with 20-year-old data on it, read it and migrate the data to a new type of media.
Twenty years ago, the IBM 3480 cartridge was introduced, and if you had written the data from an IBM mainframe running MVS, you would likely still be able to read it today. But what if you had waited two to five years and then written it on a Sun workstation running SunOS 4.0, or a Cray X-MP, or something else? Do you think you'd be able to read it today? Perhaps, but it is not very likely, and even if you could read it, how much would it cost to read the Cray X-MP data?
In one sense, I am making a case for mainframes. It might be because mainframe technology, rightly or wrongly, does not change quickly. Change in the mainframe market is evolutionary, not revolutionary. Remember back 20 years ago: you had multiple mainframe vendors besides IBM, including Univac, CDC and others, and the Japanese and Amdahl clones. These vendors have for the most part disappeared, but mainframe technology is backwards-compatible for much, much longer than open systems, be they Unix, Windows or whatever.
There are multiple issues surrounding long-term archival storage that must be considered if you hope to navigate these treacherous waters successfully. Let's take a closer look at them and outline the issues you need to think about carefully. With more and more regulations requiring that data be retained long-term, the topic is an important one.
Issue 1: Getting the data to the media
I have repeatedly discussed tape bit error rates, recovered and unrecovered, and various disk bit error rates for SCSI and SATA drives, but what about the network to the disk and/or tape device? What are the bit error rates for Fibre Channel? The recovered bit error rate at the hardware level is about 10^-12, or one error in every 10^12 bits. This is 100,000 times greater than the bit error rate on enterprise tape media. This might be why new standards for end-to-end data reliability are emerging that use an 8-byte checksum from the HBA to the storage device. Look at www.t10.org and end-to-end data protection. We are still a few years away, but this will encompass data checking from the host to the device.
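To put those rates in perspective, here is a rough back-of-the-envelope sketch in Python. It reads the quoted Fibre Channel figure as 1e-12, derives the tape rate from the "100,000 times" comparison above, and treats each rate as an independent per-bit probability. The 1 PB archive size and the function name are assumptions made up for illustration, not figures from any vendor.

```python
# Illustrative arithmetic only. The two rates come from the figures quoted
# in the text; the archive size is a hypothetical value for the example.
LINK_BER = 1e-12                # Fibre Channel bit error rate, read as 1e-12
TAPE_BER = LINK_BER / 100_000   # text: link rate is 100,000x the tape rate

ARCHIVE_BYTES = 10**15          # hypothetical 1 PB archive (decimal)


def expected_bit_errors(num_bytes: int, ber: float) -> float:
    """Expected number of bit errors when num_bytes pass through a
    channel with the given per-bit error rate."""
    return num_bytes * 8 * ber


link_errors = expected_bit_errors(ARCHIVE_BYTES, LINK_BER)
tape_errors = expected_bit_errors(ARCHIVE_BYTES, TAPE_BER)

print(f"Expected bit errors on the link: {link_errors:.0f}")
print(f"Expected bit errors on tape:     {tape_errors:.2f}")
```

Under these assumptions, moving a petabyte across the link would involve on the order of thousands of bit errors for the hardware to recover, while the tape media itself would be expected to contribute well under one. That gap is the argument for checking data all the way from the host to the device.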
Issue 2: The media
As has been discussed ad infinitum, media has a 30-year shelf life. What does that mean? It means that if you keep it under the manufacturer's perfect conditions for temperature, humidity and cleanliness, you should be able to read most of your data, if not all.
What if there are a few days when you do not meet those requirements? Over 30 years, a lot can happen: hurricanes, ice storms, tornadoes, earthquakes, floods and power outages are just some of the possibilities. You name the disaster and someone has faced it, and even if you planned carefully, you still might not meet all the conditions.
That said, tape is still far cheaper than disk storage on a per-byte basis, once compression and power are taken into account, and enterprise tape media is more reliable than the equivalent byte of SATA (3x) or SCSI disk (2x). Calculation of reliability using RAID is difficult at best and is usually proprietary. However, tape density is not increasing as fast as disk density.
In a perfect world, tape would be mounted and re-tensioned (but not too often), it would never be handled by humans who could drop it, and the environment would always be perfect. If all of this were true, my tape media sources tell me there would be no degradation in bit error rate. Since there is little likelihood of all these conditions being met, we will never know for sure.