A storage newsletter that shall go unnamed (we all make mistakes) discussed the dwindling of tape in 2012. Their opinion was that tape was on its way out. They pointed out that backup was moving to VTLs and that this change would be the death knell of tape.
The article granted archiving as a use case but questioned tape’s reliability and its slower access times compared with disk. Finally, the article reported that the only tape maker sharing numbers was Spectra Logic, because news at other tape vendors was not good.
Everyone can have a bad day. From the perspective of 2014, here is why the article was largely wrong.
· Backup. They got this partially right. For decades tape was the backup medium of choice, and it is true that this is changing. Primary backup can be faster to disk or cheaper in the short term to the cloud. The market most affected is SMB, whose small amount of data can usually go straight to the cloud without impacting performance. However, mid-sized and enterprise organizations produce reams of digital data that they must retain for long periods of time. For them, tape’s density and economies of scale make it an excellent backup and/or archival choice.
· Dependability. Tape dependability did have a bad rep for a while, largely thanks to poor DLT generations. Yet with the growth and stabilization of the LTO standard, this is no longer the case. Tape is proving more reliable than disk, especially lower-cost disk. The National Energy Research Scientific Computing Center (NERSC) reported that tape cartridges are up to four orders of magnitude more reliable than SATA disk.
There are several reasons for this, including the Bit Error Rate (BER) and the bit rot phenomenon. BER expresses the expected number of faulty bits per total number of bits written. Tape shows a 10x BER improvement over premium disk. Bit rot, the gradual decay of data stored on magnetic media, is also a real concern in long-term data storage. Both tape and disk are magnetic, but spinning disk is at greater risk. (Thus the 15-30 year rated lifespan of LTO.)
· Sales. After showing signs of bottoming out a few years ago, sales stopped declining in 2013 and are rising in 2014. LTO most clearly describes the state of the tape market. LTO vendors have sold drives and media representing 100,000 PB worth of data capacity on tape. LTO-4 is losing some ground as an older release, while LTO-5 sales are increasing. LTO-6 is selling even faster, aided by LTO’s compatibility framework. Part of this popularity is cost-per-GB. Disk manufacturers are quick to cite falling disk prices, but tape is still cheaper: a 1.5 TB tape cartridge costs about $40, while a hard drive of similar capacity costs more than twice that.
· Performance. Disk-only vendors make statements like “tape is slower than disk.” That blanket statement is not true: performance differs depending on the speed of the disk system or autoloader/library and the type of data being transferred. Disk is generally faster with random access, where the disk heads can reach multiple specified locations faster than a tape drive can wind to them.
However, tape performance is generally superior with sequential access, which is why tape is particularly useful with backup, archive and big data sets.
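The cost-per-GB comparison in the sales point above can be checked with a few lines of arithmetic. The sketch below uses only the article's figures (a 1.5 TB cartridge at about $40, and a disk of similar capacity at more than twice that). The function name, the decimal TB-to-GB conversion, and the $80 disk price (the floor implied by "more than twice") are assumptions made for illustration.

```python
# Media-only cost per GB, using the figures quoted in the sales point.
GB_PER_TB = 1000  # assumption: decimal units, as printed on media labels


def cost_per_gb(price_usd: float, capacity_tb: float) -> float:
    """Return the media price in dollars per gigabyte."""
    if capacity_tb <= 0:
        raise ValueError("capacity must be positive")
    return price_usd / (capacity_tb * GB_PER_TB)


tape = cost_per_gb(40.0, 1.5)  # 1.5 TB cartridge at about $40
disk = cost_per_gb(80.0, 1.5)  # assumption: $80, the floor implied by the text

print(f"tape: ${tape:.3f}/GB, disk: at least ${disk:.3f}/GB")
```

These are media prices only; drives, libraries, and power sit outside the comparison.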
Top Usage Cases for Tape Today
Today tape is thriving in important usage cases: archiving, the cloud (yes, the cloud), and big data. Disk manufacturers deny it, but consider the source: vendors who do not offer tape products and thus have a vested interest in seeing tape die. These vendors will argue that they don’t sell tape because they don’t believe in tape. Yet in the face of tape’s capacity, economy and dependability, that is frankly disingenuous.
Archiving
The largest usage case for tape is long-term archival retention, classically off-site, where it is an exceptionally cost-effective storage solution. Active archiving is also a valuable use case: tape offloads data sets from primary disk and keeps them immediately available for analytics and for reloading onto the production system. Some software enables tape systems to natively host analytics.
For example, National Geographic’s NG Global Media manages massive media archives. Its Television MediaCore is a broadcast and post-production facility that provides media services to clients. It typically generates 5-10TB of content a day and archives a good 90% of it to Spectra Logic tape libraries. The archive stays immediately accessible, since a significant percentage of the media is accessed and re-used within very short time periods.
The National Center for Supercomputing Applications (NCSA) Blue Waters supercomputer uses Spectra 380B tape libraries as an active repository. The library supports read/write rates of up to 2.2 PB per hour and can store 380 PB.
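Putting those two figures together: at the quoted peak rate, filling the quoted capacity would take about a week. The short sketch below does the division. The function name is hypothetical, and it assumes the peak rate is sustained for the whole run, which real workloads rarely manage.

```python
def hours_to_fill(capacity_pb: float, rate_pb_per_hour: float) -> float:
    """Hours needed to write capacity_pb at a sustained rate_pb_per_hour."""
    if rate_pb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return capacity_pb / rate_pb_per_hour


# Figures quoted for the Blue Waters library: 380 PB capacity, 2.2 PB per hour.
hours = hours_to_fill(380.0, 2.2)
print(f"{hours:.0f} hours, or about {hours / 24:.1f} days")
```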
The National Institutes of Health (NIH) uses Oracle tape libraries for active archiving in its data center and for long-term retention off-site. The huge datasets are available for ongoing access and analysis by health researchers all over the world. Major League Baseball’s production and broadcast arm also uses Oracle StorageTek libraries, adding 25 to 30TB of video data per day onto tape.
Cloud
Some disk and appliance-based cloud services providers try to position tape and cloud against each other, as in “storing data in the cloud is faster and cheaper than tape.” This is a false argument, since tape is in the cloud. Cloud data centers often own massive tape libraries for long-term, economical storage. Glacier may be the exception: Amazon swears that it does not use tape, although it coyly refuses to say what it does use. However, many major cloud providers, including Google, take advantage of tape’s capacity.
Examples of scientific users who combine cloud and tape include CERN, the Argonne National Laboratory, and NASA. The Discovery Channel is a top example in the broadcasting field. In education, the University of Southern California uses its tape-based USC Digital Repository to store digital holdings in the cloud. USC partners with Levels Beyond (www.levelsbeyond.com) to index, distribute and monetize those holdings.
Big Data
Tape serves as nearly infinite and highly economical storage for large unstructured data. Even supercomputer vendor Cray uses tape for cold storage in its four-tier big data storage archive. For big data analysis, active archiving and large volume cartridges for huge data sets are key.
HP’s StoreEver ESL G3 enterprise line stores up to 75PB in a single system. Quantum’s largest enterprise Scalar model, the i6000, also expands to 75PB. Last year Oracle introduced a StorageTek tape drive capable of storing 8.5TB of raw data with speeds of 252MB/sec. This year IBM and Fujifilm announced a prototype cartridge capable of storing 85.9 billion bits of data per square inch, or up to 154TB of uncompressed data. IBM is also partnering with Sony, which has announced a prototype tape media with a maximum data density of 148 Gb (gigabits) per square inch, or 185TB on one cartridge.
Looking Forward
· “Flape.” This unfortunate term combines “flash” and “tape.” Although IBM doesn’t use the term (small blame to them), it has a use case combining the FlashSystem V840 with tape. The all-flash system has enough capacity and performance for a Tier 0 and Tier 1 production system. It intelligently migrates inactive data directly to a secondary tier, which could be disk or tape. IBM suggests that the tier be tape for economical long-term data retention.
· The rise of LTFS. LTFS is an immense subject on its own. Briefly, the LTFS tape file system stores data on tape media with its metadata, which enables users to access files on tape without needing a proprietary backup application or specific version. This solves the ongoing IT problem of having to search through backup catalogs to locate and restore files from backup tape. IBM is developing a tape system that integrates LTFS and GPFS, IBM’s disk cluster file system. The new system will make tape look like disk to servers and will create a common namespace across disk and tape for globally managed storage.
· Technical advances. LTO-6 is going strong, and each new LTO (Linear Tape-Open) generation makes big advances in density. LTO-7 is on the near horizon, and LTO-9 and 10 are on the roadmap. Vendor-proprietary tape cartridges are also steadily advancing. IBM, Oracle, Quantum, Spectra Logic and others are innovating on a number of fronts, such as increasing capacity, reliability and durability, improving media lifecycle management, and enabling more rapid data access. Suppliers are also reducing power consumption and improving cooling technologies, making tape libraries more economical to operate.
Economies of Scale
Economies of tape are economies of scale. The larger the scale, the greater tape’s financial advantage over disk. Disk gets more expensive quickly, and since 80% or more of typical data storage is made up of inactive data, the company is paying to keep disks spinning for files that are mostly inactive. And with the growing size of data and datasets, tape is tailor-made to cost-effectively retain data over long periods of time.
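As a rough illustration of that argument, the sketch below compares keeping everything on disk with keeping only the active share on disk and moving the rest to tape. It reuses the media prices quoted earlier in the article ($40 for 1.5 TB of tape, and at least $80 for similar disk). The 1,000 TB total, the function names, and the media-only scope (no drives, libraries, power or staff) are assumptions made for the example, so treat the result as a shape rather than a quote.

```python
TAPE_USD_PER_TB = 40.0 / 1.5  # from the article's $40 for a 1.5 TB cartridge
DISK_USD_PER_TB = 80.0 / 1.5  # assumption: floor implied by "more than twice"


def all_disk_cost(total_tb: float) -> float:
    """Media cost of holding every terabyte on disk."""
    return total_tb * DISK_USD_PER_TB


def tiered_cost(total_tb: float, inactive_fraction: float) -> float:
    """Media cost with active data on disk and inactive data on tape."""
    if not 0.0 <= inactive_fraction <= 1.0:
        raise ValueError("inactive_fraction must be between 0 and 1")
    inactive_tb = total_tb * inactive_fraction
    active_tb = total_tb - inactive_tb
    return active_tb * DISK_USD_PER_TB + inactive_tb * TAPE_USD_PER_TB


total_tb = 1000.0  # hypothetical data set size
disk_only = all_disk_cost(total_tb)
tiered = tiered_cost(total_tb, 0.80)  # the article's 80% inactive share
savings = 1 - tiered / disk_only

print(f"all disk: ${disk_only:,.0f}  tiered: ${tiered:,.0f}  saving: {savings:.0%}")
```

Because both costs scale linearly with capacity in this model, the percentage saving stays the same at any size while the dollar gap grows with the data set.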
The upshot is that tape is not dying, shrinking, or retreating. I am going to let Crossroads Systems have the last word on that. This data protection and archiving vendor made a video that summarizes every disk-only vendor’s tape pronouncement across the decades.
Christine Taylor is a writer and industry-watcher.