Online backup vendors have got it all wrong.
Last week’s article showing that online backup providers vastly prefer disk over tape made me wonder if we had been transported back to 1999, when profits didn’t matter. Based on acquisition costs and power usage alone, these vendors could be saving a fortune and providing a service that’s nearly as fast, and potentially more reliable, by using tape libraries instead of disk.
Those of you convinced that SATA is bringing down the cost of disk should read an article I did last year on The Real Cost of Storage. If you want performance and reliability, disk isn’t necessarily cheap.
The online backup community seems to be marching steadily down the disk backup path. I don’t believe this is the right decision for home backups, small businesses or even larger organizations, because disk is far more expensive than tape in so many ways, and tape also has a big “green” advantage. MAID is being used by some online backup vendors to reduce power costs, but tape is still cheaper than MAID, even after including the power savings of MAID.
We’ll start by looking at a variety of technologies from different vendors for comparison. The vendors were not chosen based on any preference, just on what I could easily find pricing for on the internet. These might not be the absolute best prices, but the relative numbers are likely pretty good.
The base module of a large enterprise IBM tape library costs $38,365 and the expansion units cost $28,125 each. This allows for about 6,000 slots with a fair number of tape drives. Let’s say you get 20 LTO-4 drives at a cost of $16,710 per drive, and LTO-4 tapes are $140 per cartridge. The total cost for this full tape library would be $1,634,439 for the module, 15 expansion units, 20 drives and 6,000 tapes.
The total capacity without compression would be about 4.6 PB, and with compression 9.2 PB. The cost per PB is $357,049 without compression and $178,524 with compression. Compression for LTO-4 is two to one, according to the specification.
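As a quick check on these figures, here is a minimal Python sketch of the arithmetic. The prices and counts are the ones quoted above; the 800 GB native capacity per LTO-4 cartridge and the binary GB-to-PB conversion are assumptions, chosen because they reproduce the 4.6 PB figure. The list prices sum to $1,634,440, a dollar away from the quoted total, presumably because of rounding in the original prices.

```python
# Prices and counts as quoted in the article.
BASE_MODULE = 38_365
EXPANSION_UNIT = 28_125
DRIVE = 16_710
CARTRIDGE = 140

N_EXPANSION = 15
N_DRIVES = 20
N_TAPES = 6_000

LTO4_NATIVE_GB = 800      # assumption: LTO-4 native capacity per cartridge
GB_PER_PB = 1024 * 1024   # assumption: binary conversion, matches "about 4.6 PB"


def tape_library_cost():
    """Total acquisition cost of the fully populated library."""
    return (BASE_MODULE
            + N_EXPANSION * EXPANSION_UNIT
            + N_DRIVES * DRIVE
            + N_TAPES * CARTRIDGE)


def tape_capacity_pb(compression=1.0):
    """Library capacity in PB for a given compression ratio."""
    return N_TAPES * LTO4_NATIVE_GB * compression / GB_PER_PB


total = tape_library_cost()
for label, ratio in (("native", 1.0), ("2:1 compressed", 2.0)):
    pb = tape_capacity_pb(ratio)
    print(f"{label}: {pb:.1f} PB, ${total / pb:,.0f} per PB")
```

Changing `N_TAPES` or `CARTRIDGE` shows how strongly the media cost dominates the total: the cartridges alone account for roughly half of it.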
Now let’s do the same thing for storage using standard RAID technology for SATA, since backup and restoration are far less a performance problem than the bottleneck posed by the internet.
For the disk controllers I chose the Sun StorageTek 6540, which is really an LSI product re-branded by a number of vendors. I configured this with 16 1TB drives per tray and 12 trays in the controller, which comes to 192TB at $663,695.
That makes the cost per PB $3,456,745 for the Sun 6540, compared to $357,049 for LTO-4 with no compression and $178,524 with compression.
That makes disk almost 10 times more expensive than tape without compression. Of course, there are some caveats:
- The Sun 6540 has about 1800 MB/sec of bandwidth, while 20 tape drives running uncompressed deliver about 2400 MB/sec (20 × 120 MB/sec).
- I did not include the cost of the hierarchical storage management software (HSM), which will not be cheap, but far less than the tape robot, nor did I include the cost of a small amount of highly reliable disk cache for the HSM to manage as a buffer.
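The disk-versus-tape comparison can be reproduced with a short sketch. The system price, tray and drive counts, bandwidth figures and tape costs per PB are taken from the text; the decimal TB-to-PB conversion is an assumption that matches the $3,456,745 figure.

```python
DISK_SYSTEM_COST = 663_695   # Sun 6540 as configured in the article
TRAYS = 12
DRIVES_PER_TRAY = 16
DRIVE_TB = 1
TB_PER_PB = 1000             # assumption: decimal conversion, matches $3,456,745

TAPE_COST_PER_PB = 357_049             # LTO-4, no compression
TAPE_COST_PER_PB_COMPRESSED = 178_524  # LTO-4, 2:1 compression


def disk_cost_per_pb():
    """Cost per PB of raw disk capacity for the configured system."""
    raw_tb = TRAYS * DRIVES_PER_TRAY * DRIVE_TB
    return DISK_SYSTEM_COST / (raw_tb / TB_PER_PB)


disk = disk_cost_per_pb()
ratio_native = disk / TAPE_COST_PER_PB
ratio_compressed = disk / TAPE_COST_PER_PB_COMPRESSED
print(f"disk: ${disk:,.0f} per PB")
print(f"disk vs. tape: {ratio_native:.1f}x native, {ratio_compressed:.1f}x compressed")

# Aggregate bandwidth from the first caveat.
tape_mb_per_s = 20 * 120
disk_mb_per_s = 1800
print(f"bandwidth: tape {tape_mb_per_s} MB/sec, disk {disk_mb_per_s} MB/sec")
```

Note that this compares raw disk capacity, before any RAID parity or hot spares, so the real gap in usable capacity would be somewhat wider.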
Let’s add in another cost on the disk side that needs to be considered, and that is power. Let’s say I wanted 4.2PB of uncompressed storage for my online backup company. With a full tape library, the power draw would be minuscule. With spinning SATA drives, the story is far different. 4.2PB of usable disk built from 1TB SATA drives in RAID-6 8+2 for reliability would be 5,250 drives without hot spares. I am also adding 2 percent for hot spares to be safe, which yields 5,355 drives and 335 disk trays with 16 drives per tray. SATA drives with a SAS interface, used for reliability, draw 13 watts per drive. With a standard SATA interface, it would be 11.6 watts per drive.
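The drive and tray counts follow from simple RAID arithmetic, sketched below. The assumptions here are 1 TB drives, 4.2 PB of usable capacity (4,200 TB, the figure that yields 5,250 drives under 8+2), and a spare allowance of 2 percent, which is the value that reproduces 5,355 drives.

```python
import math

USABLE_TB = 4_200        # assumption: 4.2 PB usable, decimal TB
DRIVE_TB = 1
DATA_DRIVES = 8          # RAID-6 8+2: eight data drives...
PARITY_DRIVES = 2        # ...plus two parity drives per group
SPARE_PERCENT = 2        # assumption: reproduces the 5,355 total
DRIVES_PER_TRAY = 16


def raid_drive_count(usable_tb, data, parity, drive_tb=1):
    """Drives needed to provide usable_tb with data+parity RAID groups."""
    groups = math.ceil(usable_tb / (data * drive_tb))
    return groups * (data + parity)


raid_drives = raid_drive_count(USABLE_TB, DATA_DRIVES, PARITY_DRIVES, DRIVE_TB)
spares = math.ceil(raid_drives * SPARE_PERCENT / 100)
total_drives = raid_drives + spares
trays = math.ceil(total_drives / DRIVES_PER_TRAY)

print(f"RAID drives: {raid_drives}, spares: {spares}, "
      f"total: {total_drives}, trays: {trays}")
```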
The total power draw is: 5,355 drives × 13 watts per drive + 335 trays × 375 watts per tray, or about 195 kW. Let’s take a power cost of $0.10 per kWh.
This doesn’t include the cost of removing the heat (BTUs) generated; a good estimate I have found is that cooling brings the total to 1.45 times the power cost. So the yearly cost for power and cooling is $247,994 based on current power costs, which don’t appear likely to go down anytime soon. Also, disk drives do not compress data, while tape drives do so automatically in hardware. This might not be a big deal for home internet backup, since much of the data may be in pre-compressed formats such as jpg and mp3. The fairest comparison is therefore uncompressed tape to uncompressed disk, even though there will likely be some compression.
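The yearly figure can be reproduced as follows. This is a sketch under two assumptions: the quoted price is per kilowatt-hour, and the 1.45 multiplier is applied to the electricity cost to cover cooling. Together these give the $247,994 in the text.

```python
DRIVES = 5_355
TRAYS = 335
WATTS_PER_DRIVE = 13        # SATA drive with SAS interface
WATTS_PER_TRAY = 375
PRICE_PER_KWH = 0.10        # assumption: dollars per kilowatt-hour
COOLING_MULTIPLIER = 1.45   # assumption: total cost = 1.45 x power cost
HOURS_PER_YEAR = 24 * 365


def load_kw(drives, trays):
    """Continuous electrical load of drives plus trays, in kW."""
    return (drives * WATTS_PER_DRIVE + trays * WATTS_PER_TRAY) / 1000


def yearly_cost(kw):
    """Yearly cost of running a constant load, including cooling."""
    return kw * HOURS_PER_YEAR * PRICE_PER_KWH * COOLING_MULTIPLIER


kw = load_kw(DRIVES, TRAYS)
print(f"load: {kw:.2f} kW, yearly cost: ${yearly_cost(kw):,.0f}")
```

Swapping `WATTS_PER_DRIVE` to 11.6 models the standard SATA interface; the tray overhead is the larger share of the load either way.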
Most of us are connected to the internet over cable modems or DSL connections from our homes or businesses. I would say the performance for real sustained data movement ranges from 128 Kbit/sec to about 3 Mbits/sec. That means that an 8MB file takes anywhere from about 512 seconds on the lower end to 21 seconds on the upper end of that range. I doubt that most of us regularly get 3 Mbits/sec downloads, and at least for cable, never for uploads.
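The transfer times quoted here come from a simple size-over-bandwidth calculation. The sketch below assumes binary units (1 MB = 1024 KB, 1 Mbit = 1024 Kbit), which is what yields the 512-second figure.

```python
def transfer_seconds(size_mb, link_kbit_per_s):
    """Seconds to move size_mb megabytes over a link of the given speed."""
    size_kbit = size_mb * 1024 * 8   # MB -> KB -> Kbit (binary units assumed)
    return size_kbit / link_kbit_per_s


slow = transfer_seconds(8, 128)        # 128 Kbit/sec link
fast = transfer_seconds(8, 3 * 1024)   # 3 Mbit/sec link
print(f"8 MB file: {slow:.0f} s at 128 Kbit/sec, {fast:.0f} s at 3 Mbit/sec")
```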
Tape pick, load and position time totals around 69 seconds. Obviously, with a disk cache for the upload, you really don’t care about this, since the data is cached on disk and then written to tape. The real issue is on the downloads.
It would make sense to have two modes for this type of service: restoration of a few files at most, or if there was a catastrophe, restoration of everything. If I had a few files, random files that were not created at the same time, then I would potentially have a delay for each file if they were not all queued up at once. But if they were all queued up at once, I could potentially just have a single 69-second delay if tape drives were available and then have all the files streaming in. I would wait for 69 seconds or so for the tape to start transferring data and then I would be streaming data. If I was restoring a large amount of data, then the HSM software could start pre-staging the files from tape far faster than I could receive the data on the fixed machine at home or work. I would have an initial delay of about 69 seconds, but the time to restore the files, given internet bandwidth constraints, would be orders of magnitude greater than the time to read the data from tape to the HSM disk cache. That, to me at least, is a “no-brainer.”
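To put rough numbers on this argument, here is a minimal sketch comparing the time to stage data from tape with the time to download it. The 69-second latency, 120 MB/sec drive rate and 3 Mbit/sec link come from the article; the 50 GB restore size and ten-file example are hypothetical, and the model ignores drive contention and assumes a single tape.

```python
TAPE_LATENCY_S = 69      # pick, load and position time
TAPE_MB_PER_S = 120      # one LTO-4 drive, uncompressed
LINK_MBIT_PER_S = 3      # optimistic home download speed


def stage_seconds(size_mb):
    """Time for the HSM to read size_mb from tape into its disk cache."""
    return TAPE_LATENCY_S + size_mb / TAPE_MB_PER_S


def download_seconds(size_mb):
    """Time for the customer to pull size_mb over the internet link."""
    return size_mb * 8 / LINK_MBIT_PER_S


def tape_wait_seconds(n_files, queued_together):
    """Tape latency seen when restoring a few scattered files."""
    return TAPE_LATENCY_S if queued_together else n_files * TAPE_LATENCY_S


restore_mb = 50 * 1024   # hypothetical 50 GB full restore
stage = stage_seconds(restore_mb)
download = download_seconds(restore_mb)
print(f"stage from tape: {stage / 60:.0f} min, download: {download / 3600:.0f} h")
print(f"10 files, one at a time: {tape_wait_seconds(10, False)} s of tape latency")
print(f"10 files, queued at once: {tape_wait_seconds(10, True)} s of tape latency")
```

Under these assumptions the tape side finishes in minutes while the download takes more than a day, so the initial tape delay is lost in the noise for a large restore.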
MAID latency is far less than tape, of course. MAID cost for power is also far less than that of always-spinning disk, but disk drives are still more expensive than tape on a per-byte basis. You still have the RAID issue with MAID, but instead of 8+2 you have 3+1, so the percentage of space wasted on parity is greater. If you have latency problems, MAID is the way to go, but is there really a latency problem when restoring big files over the internet?
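The parity overhead of the two RAID layouts mentioned here is easy to compare. The sketch below simply computes the fraction of raw capacity consumed by parity and the raw capacity needed per usable terabyte; the function names are illustrative only.

```python
def parity_fraction(data, parity):
    """Fraction of raw capacity consumed by parity in a data+parity group."""
    return parity / (data + parity)


def raw_per_usable(data, parity):
    """Raw capacity that must be purchased per unit of usable capacity."""
    return (data + parity) / data


for name, data, parity in (("8+2", 8, 2), ("3+1", 3, 1)):
    print(f"{name}: {parity_fraction(data, parity):.0%} parity, "
          f"{raw_per_usable(data, parity):.2f} TB raw per usable TB")
```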
Since disk costs more per byte than tape, and latency for large restorations can be hidden by HSM pre-staging files, does using spinning disk for online backup make sense? If it took you an extra minute or so to restore that picture of your mother from Aunt Sandy’s 60th birthday that you accidentally deleted or some Barry White song that got corrupted somehow, does it really matter? Green issues aside, does using power make sense for this type of application? Why spend extra money for power and disk drives when tape is up to the job?
I don’t see the justification, unless the problem is a lack of technological expertise. I am constantly told that HSMs are hard to use, but this has never been my experience. HSMs have been around since the 1970s and are tried and tested. Perhaps the problem is that storage technology is considered too complex to get people to implement HSM. Whatever the hurdles are, to me this looks like an issue that online backup providers need to revisit. If nothing else, the door is open for some enterprising online backup vendor to compete on price using a good old-fashioned tape architecture.
Henry Newman, a regular Enterprise Storage Forum contributor, is an industry consultant with 27 years of experience in high-performance computing and storage.