In the consumer world, tape formats have all but vanished from the scene, replaced by CDs, DVDs and other digital disk-based systems. Will tape in storage suffer the same fate? Possibly not, but with each passing year, tape loses a little more ground to the encroachment of disk.
“Large customers are permanently replacing hundreds of TB of tape library capacity with new massive disk-based systems,” says Steve Duplessie, founder and senior analyst at Enterprise Strategy Group. “One big oil company, for instance, found it cheaper to place 2.6 PB of data on an Engenio disk system than to try to restore all that data from tape.”
Not only is the pace of the advance accelerating, but users are now faced with an ever-broadening choice of technologies for backup, archiving and other storage activities. Virtual Tape Library (VTL), nearline disk, massive array of idle disks (MAID), content addressed storage (CAS), replication, and continuous data protection (CDP) are some of the more popular options.
VTL can be either software or an appliance that is designed to make disks appear like a tape library. This technology has been around for a couple of years and is now gaining some serious traction with a wealth of vendors involved, among them Sepaton, FalconStor, Quantum, ADIC, Diligent and Overland.
One of the big pluses of VTL is that companies can implement it using existing tape backup software and older infrastructure elements. This means that the last decade’s investment in backup technology is preserved and aging skill sets don’t have to be replaced or upgraded. In addition, data can be retrieved from disk faster than from tape.
On the downside, however, once disk capacity fills up, it can mean the additional step of backing up the VTL to tape. And there may be other issues to resolve.
“Even if there is enough disk space available, companies can find themselves continuing to pay for robotic tape licenses that they no longer use,” says W. Curtis Preston, vice president of data protection services at GlassHouse Technologies. “Vendors want you to believe it is easy to implement VTL, but to really get the full benefits you have to redesign your system.”
VTL is now starting to incorporate other advanced features. FalconStor, for instance, has added CDP features to its VTL appliance. According to Wendy Petty, vice president of sales at FalconStor, users can now back up to VTL and also enjoy the instant recovery features of CDP.
Nearline disk is simply a disk-based interim caching point before the information goes to tape. This opens the door to using cheaper ATA disks. SATA, for example, is popular as a second tier of disk storage that keeps copies of mission-critical or production data near at hand.
“The downside with nearline disk storage is that ATA disks are more prone to failure, but the upside more than compensates,” said Mike Karp, senior analyst at Enterprise Management Associates. “I like nearline SATA as it is inherently cheaper, works well enough and the higher failure rates can be addressed with RAID.”
Not everyone, however, uses these disks purely as cache. More and more vendors are issuing disk backup systems that dispense with tape entirely. In addition to VTL, companies such as Avamar Technologies, FilesX and Unitrends have released direct disk-based backup targets. Unitrends Data Protection Units are one example.
“Disks can store more than 1 TB of compressed data each and have a very long shelf life compared to tape, which has a significant and proven failure rate,” says Mark Phillippi, director of product management at Unitrends. “D2D speed has become a necessity in order to shorten the backup window.”
MAID gives you a huge bank of disks. Most of the disks sit idle, only spinning when the data on that specific disk is being accessed. This keeps the power requirements way down and improves overall cost.
COPAN Systems' Revolution 200T, for instance, has 896 drives (250GB each), for a total of 224TB. With that amount of storage, you can use commodity heating and cooling to keep costs low. According to COPAN chief architect Chris Santilli, users can access up to 25 percent of their data at any one time compared to about 1 percent for tape. MAID, he says, has a VTL-like personality: a VTL platform sits on top of the COPAN hardware.
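The capacity figures quoted for the Revolution 200T can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes decimal units (1TB = 1,000GB) and reads "25 percent of their data" as 25 percent of the drives spinning at once, which is an interpretation rather than something COPAN states here.

```python
# Figures taken from the article.
DRIVES = 896
GB_PER_DRIVE = 250
ACTIVE_FRACTION = 0.25  # share of data accessible at any one time

# Assumption: decimal units, 1 TB = 1,000 GB.
total_tb = DRIVES * GB_PER_DRIVE / 1000

# Assumption: accessible data maps one-to-one onto spinning drives.
active_drives = int(DRIVES * ACTIVE_FRACTION)
active_tb = total_tb * ACTIVE_FRACTION

print(f"Total capacity: {total_tb:.0f} TB")
print(f"Drives spinning at once: {active_drives}")
print(f"Capacity online at once: {active_tb:.0f} TB")
```

Under those assumptions, roughly a quarter of the array is online at any moment while the remaining drives stay powered down.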
“Even though the disks power off when not used, they are monitored every 30 days to ensure they are functioning well, and RAID takes care of any failures,” says Santilli. “With tape, you might not use a cartridge for several years, and then it can fail when you attempt a recovery.”
He recommends MAID as a means of fast access to data that is written once and read occasionally. But with the price starting at $120,000, MAID is targeted at larger enterprise environments.
“When you look at the big picture, companies like COPAN give you huge capacity at a price that makes it economic never to have to go to tape,” says Duplessie.
CAS and COS
CAS adds a means of accessing specific information rapidly, using content meta data, rather than a file system path or block address, to locate the data. The king of the CAS castle is EMC Centera, but IBM, HP, Kasten Chase, StorageTek, Archivas and others also offer this technology.
“The value of CAS is that you can use meta data tags to understand what data is associated with what business process,” says Karp. “But it is a challenging concept. As no two vendors define it the same way, the various implementations are non-compatible.”
According to Michael Jochimsen, EMC’s global appliance manager, a CAS device tracks the content so that the application doesn’t have to be concerned about where it is. Centera adds other features such as a guarantee that the data is authentic and hasn’t been altered. If you are storing 20 e-mails with the same 2MB attachment, the software stores only one copy of the attachment, along with pointers to it from the various messages. Centera also has a time expiration feature for managing the information lifecycle.
“Centera’s sweet spot is for functions such as medical imaging, large files, e-mail archiving or having fast access to a large amount of content,” says Jochimsen. “It is not intended for transaction or database environments.”
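The single-copy behavior described for the 20 e-mails with one shared attachment can be illustrated with a toy content-addressed store. This is a minimal sketch, not Centera's implementation: the `ContentStore` class, its method names and the choice of SHA-256 as the content address are all assumptions made for illustration.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: objects are keyed by a hash of their bytes."""

    def __init__(self):
        self._objects = {}  # content address -> stored bytes

    def put(self, data: bytes) -> str:
        # Assumption: SHA-256 of the content serves as its address.
        address = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same address, so it is kept once.
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hashing on read detects content that no longer matches its address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data

    def object_count(self) -> int:
        return len(self._objects)


store = ContentStore()
attachment = b"x" * (2 * 1024 * 1024)  # stand-in for a 2MB attachment

# Each message keeps only a pointer (the address) to the shared attachment.
messages = [
    {"subject": f"message {i}", "attachment": store.put(attachment)}
    for i in range(20)
]

print(store.object_count())  # number of distinct objects actually stored
```

Because the address is derived from the content itself, the application only holds the address and never needs to know where the bytes live, and the same check that locates an object also shows whether it has been altered.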
While not quite CAS, Data Domain and others offer what is being termed “capacity optimized storage” (COS). Bart Bartlett, director of marketing at Data Domain, says that the compression and filtering technology built into its Data Domain DD400 series reduces backups by a factor of 20.
“We look at the data stream, and if we’ve already seen the data, we don’t then back it up once again,” says Bartlett. “We look at the data at a sub-block level and can compress an incremental backup to about one sixth of its normal size.”
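The idea Bartlett describes, skipping data that has already been seen, can be sketched as chunk-level deduplication. The sketch below is an illustration only and does not represent Data Domain's actual technique: the fixed 4KB chunk size, the SHA-256 fingerprints and the `backup` and `restore` helpers are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size, chosen only for illustration


def backup(stream: bytes, seen: dict):
    """Store only unseen chunks; return (recipe, bytes newly stored)."""
    recipe = []
    new_bytes = 0
    for offset in range(0, len(stream), CHUNK_SIZE):
        chunk = stream[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in seen:
            # First time this chunk appears: keep its bytes.
            seen[digest] = chunk
            new_bytes += len(chunk)
        # Either way, the recipe records a pointer to the chunk.
        recipe.append(digest)
    return recipe, new_bytes


def restore(recipe, seen: dict) -> bytes:
    """Rebuild a stream from its list of chunk fingerprints."""
    return b"".join(seen[digest] for digest in recipe)


seen_chunks = {}

# First backup: four distinct chunks.
first_stream = b"".join(bytes([i]) * CHUNK_SIZE for i in range(4))
first_recipe, first_new = backup(first_stream, seen_chunks)

# Second backup: same data with only the third chunk changed.
second_stream = (
    first_stream[:2 * CHUNK_SIZE]
    + bytes([9]) * CHUNK_SIZE
    + first_stream[3 * CHUNK_SIZE:]
)
second_recipe, second_new = backup(second_stream, seen_chunks)

print(first_new, second_new)  # bytes stored by each backup
```

In this toy run the second backup stores only the one changed chunk, while both backups remain fully restorable from their recipes. Real products typically combine this kind of filtering with compression, which the sketch leaves out.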
SEPATON has added another twist to the content equation. The SEPATON S2100-ES is a VTL appliance with a base price of around $58,000 for a 3TB VTL. To overcome the sequential layout of traditional backup programs, the company offers a content aware plug-in for each backup vendor. This extracts meta data and uses pointers to give random access to the backed up data.
“This gives the customer the option of how fast they want to migrate from their current backup software to a more modern method,” says Miklos Sandorfi, CTO of SEPATON. “If you have people trained on traditional backup, you have no need to change anything while gaining functionality and performance with VTL.”
The list of tape alternatives appears to grow with each passing week. While the above gives some of the more popular alternatives, it hardly touches on CDP and entirely skips areas such as replication and snapshot.
GlassHouse’s Preston points out that replication and snapshot are not enough if used alone. “Replication is a one way street, as you can’t replicate backup,” he says. “You have to combine replication and snapshot to have a decent solution.”