The National Center for Supercomputing Applications (NCSA) Blue Waters supercomputer uses Spectra 380B tape libraries as an active repository. The library supports read/write rates of up to 2.2 PB per hour and can store 380 PB.
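To put those two figures side by side, here is a rough back-of-the-envelope sketch of how long it would take to stream the entire library once. It assumes the quoted peak rate is sustained the whole time, which real workloads rarely manage, and the function name is purely illustrative.

```python
def hours_to_stream(capacity_pb: float, rate_pb_per_hour: float) -> float:
    """Hours needed to move capacity_pb at a constant rate_pb_per_hour."""
    if rate_pb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return capacity_pb / rate_pb_per_hour


# Figures quoted in the article: 380 PB capacity, 2.2 PB/hour peak rate.
# Assumption: the peak rate holds for the entire transfer.
hours = hours_to_stream(380, 2.2)
print(f"{hours:.0f} hours, or about {hours / 24:.1f} days")
```

Under that assumption the answer is roughly 173 hours, or a little over a week, to read or write the full 380 PB.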
The National Institutes of Health (NIH) uses Oracle tape libraries for active archiving in its data center and for long-term retention off-site. The huge datasets are available for ongoing access and analysis by health researchers all over the world. Major League Baseball's production and broadcast arm also uses Oracle StorageTek libraries, adding 25 to 30TB of video data per day onto tape.
Some disk- and appliance-based cloud services providers try to position tape and cloud against each other, as in “storing data in the cloud is faster and cheaper than tape.” This is a false argument, since tape is in the cloud. Cloud data centers often own massive tape libraries for long-term, economical storage. Glacier may be the exception: Amazon swears that it does not use tape, although it coyly refuses to say what it does use. However, many major cloud providers, including Google, rely on tape for its capacity advantage.
Examples of scientific users who combine cloud and tape include CERN, the Argonne National Laboratory, and NASA. The Discovery Channel is a top example in the broadcasting field. In education, the University of Southern California uses its tape-based USC Digital Repository to store digital holdings in the cloud. USC partners with Levels Beyond (www.levelsbeyond.com) to index, distribute, and monetize those holdings.
Tape serves as nearly infinite and highly economical storage for large unstructured data. Even supercomputer vendor Cray uses tape for cold storage in its four-tier big data storage archive. For big data analysis, active archiving and large volume cartridges for huge data sets are key.
HP’s StoreEver ESL G3 enterprise line stores up to 75PB in a single system. Quantum's largest enterprise Scalar model, the i6000, also expands to 75PB. Last year Oracle introduced a StorageTek tape drive capable of storing 8.5TB of raw data with speeds of 252MB/sec. This year IBM and Fujifilm announced a prototype cartridge capable of storing 85.9 billion bits of data per square inch, or up to 154TB of uncompressed data. IBM is also partnering with Sony, which has announced a prototype tape media with a maximum data density of 148 gigabits per square inch, or 185TB on one cartridge.
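Capacity and speed figures like these are easier to judge together. The sketch below estimates how long it takes to fill one cartridge at a constant streaming rate. It assumes decimal units (1TB = 1,000,000MB) and that the drive sustains its quoted speed for the whole write; the function name is made up for illustration.

```python
MB_PER_TB = 1_000_000  # assumption: decimal (SI) units, as vendors usually quote


def hours_to_fill(capacity_tb: float, speed_mb_per_sec: float) -> float:
    """Hours to write capacity_tb at a constant speed_mb_per_sec."""
    if speed_mb_per_sec <= 0:
        raise ValueError("speed must be positive")
    seconds = capacity_tb * MB_PER_TB / speed_mb_per_sec
    return seconds / 3600


# Oracle StorageTek figures quoted above: 8.5TB raw at 252MB/sec.
print(f"{hours_to_fill(8.5, 252):.1f} hours to fill one cartridge")
```

With the quoted numbers, a full cartridge takes a bit over nine hours of uninterrupted streaming, which is why large libraries run many drives in parallel.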
· “Flape.” This unfortunate term combines "flash" and "tape." Although IBM doesn’t use the term (small blame to them), it does make a case for combining its FlashSystem V840 with tape. The all-flash system has enough capacity and performance for a Tier 0 and Tier 1 production system. It intelligently migrates inactive data directly to a secondary tier, which could be disk or tape. IBM suggests that the tier be tape for economical long-term data retention.
· The rise of LTFS. LTFS is an immense subject on its own. Briefly, the LTFS tape file system stores data on tape media with its metadata, which enables users to access files on tape without needing a proprietary backup application or specific version. This solves the ongoing IT problem of having to search through backup catalogs to locate and restore files from backup tape. IBM is developing a tape system that integrates LTFS and GPFS, IBM’s disk cluster file system. The new system will make tape look like disk to servers and will create a common namespace across disk and tape for globally managed storage.
· Technical advances. LTO-6 is going strong, and each new LTO (Linear Tape-Open) generation makes big advances in density. LTO-7 is on the near horizon, and LTO-9 and LTO-10 are on the roadmap. Vendor-proprietary tape cartridges are also steadily advancing. IBM, Oracle, Quantum, Spectra Logic and others are innovating on a number of fronts, such as increasing capacity, reliability and durability, improving media lifecycle management, and enabling more rapid data access. Suppliers are also reducing power consumption and improving cooling technologies, making tape libraries more economical to operate.
Economies of Scale
Economies of tape are economies of scale. The larger the scale, the greater tape's financial advantage over disk. Disk gets expensive quickly, and since 80% or more of typical stored data is inactive, the company is paying to keep disks spinning for a large proportion of files that nobody is using. And as data and datasets keep growing, tape is tailor-made for retaining data cost-effectively over long periods of time.
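The argument can be sketched as a simple two-tier cost comparison. The 80% inactive fraction comes from the paragraph above; the per-terabyte prices are hypothetical placeholders chosen only to make the arithmetic visible, not real quotes, and the function names are illustrative.

```python
def all_disk_cost(total_tb: float, disk_cost_per_tb: float) -> float:
    """Cost of keeping every terabyte on disk."""
    return total_tb * disk_cost_per_tb


def tiered_cost(total_tb: float, inactive_fraction: float,
                disk_cost_per_tb: float, tape_cost_per_tb: float) -> float:
    """Cost when inactive data moves to tape and active data stays on disk."""
    if not 0 <= inactive_fraction <= 1:
        raise ValueError("inactive_fraction must be between 0 and 1")
    inactive_tb = total_tb * inactive_fraction
    active_tb = total_tb - inactive_tb
    return active_tb * disk_cost_per_tb + inactive_tb * tape_cost_per_tb


# HYPOTHETICAL prices in arbitrary currency units per TB; replace with real quotes.
DISK_COST_PER_TB = 100.0
TAPE_COST_PER_TB = 10.0

for total_tb in (100, 1_000, 10_000):
    disk_only = all_disk_cost(total_tb, DISK_COST_PER_TB)
    tiered = tiered_cost(total_tb, 0.8, DISK_COST_PER_TB, TAPE_COST_PER_TB)
    print(f"{total_tb:>6} TB  disk-only={disk_only:>10.0f}  "
          f"tiered={tiered:>10.0f}  saved={disk_only - tiered:>10.0f}")
```

This linear model leaves out library hardware, drives, and staffing, so it understates how costs behave in practice. What it does show is that the absolute gap between the two approaches widens as total capacity grows.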
The upshot is that tape is neither dying, shrinking, nor retreating. I am going to let Crossroads Systems have the last word on that. This data protection and archiving vendor made a video that summarizes every disk-only vendor's pronouncement on tape across the decades.
Christine Taylor is a writer and industry-watcher.
Photo courtesy of Shutterstock.