Economic Realities of Archive Storage - Page 2



Economic Realities

To help explain the trade-offs, let's begin by examining various storage solutions and how often they might encounter an error.

One of the most prevalent risks is hitting what is called the "hard error rate." This is the number of bits that can be read, on average, before a sector becomes unreadable. The hard error rate varies a great deal across the storage options.

Henry Newman has written a wonderful article with a table that lists the hard error rates for various media and translates them into petabytes (PB), that is, how many petabytes can be read from the storage media before hitting a read error. Henry's first table is reproduced below:

 Device   Hard error rate (bits read per unreadable sector)   Equivalent in bytes   PB equivalent  
 SATA Consumer   10^14   1.25E+13   0.01  
 SATA Enterprise   10^15   1.25E+14   0.11  
 Enterprise SAS/FC   10^16   1.25E+15   1.11  
 LTO and Enterprise SAS SSDs   10^17   1.25E+16   11.10  
 Enterprise Tape   10^19   1.25E+18   1,110.22  
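
The conversion behind the table can be checked with a few lines of Python. This is only a sketch: it assumes the rates are powers of ten (10^14 bits and so on) and that a "PB" here is 2^50 bytes, which is the interpretation that reproduces the figures in the last column. The function and variable names are mine, not Henry's.

```python
# Hard error rates from the table, in bits read per unreadable sector.
# Assumption: the table's rates are powers of ten (10^14, 10^15, ...).
HARD_ERROR_BITS = {
    "SATA Consumer": 10**14,
    "SATA Enterprise": 10**15,
    "Enterprise SAS/FC": 10**16,
    "LTO and Enterprise SAS SSDs": 10**17,
    "Enterprise Tape": 10**19,
}

# Assumption: "PB" means 2**50 bytes; this matches the table's PB column.
BYTES_PER_PB = 2**50


def bits_to_bytes(bits):
    """Convert a bit count to bytes (8 bits per byte)."""
    return bits / 8


def bits_to_pb(bits):
    """Convert a bit count to petabytes (2**50 bytes each)."""
    return bits_to_bytes(bits) / BYTES_PER_PB


if __name__ == "__main__":
    for device, bits in HARD_ERROR_BITS.items():
        print(f"{device:30s} {bits_to_bytes(bits):.2E} bytes  "
              f"{bits_to_pb(bits):,.2f} PB")
```

Note that with decimal petabytes (10^15 bytes) the last column would read 0.0125, 0.125, and so on, so the binary definition matters when comparing against vendor capacities.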

An alternative way to think about this data is to consider how long it would take me to encounter an unreadable sector if I read the data at the full rate of the device.

Henry's second table showed the amount of time (in hours) it takes to encounter a read error when you read from one or more devices at maximum speed (he varied the number of devices from 1 to 200). Table 2 below reproduces his data and adds two more columns for 250 devices and 300 devices.

 Hours to reach hard error rate at sustained data read rate  
 Device Type   1 Device   10 Devices   50 Devices   100 Devices   200 Devices   250 Devices   300 Devices  
 Consumer SATA   50.9   5.1   1.0   0.5   0.3   0.2   0.17  
 Enterprise SATA   301.0   30.1   6.0   3.0   1.5   1.2   1.0  
 Enterprise SAS/FC 3.5 inch   2,759.5   275.9   55.2   27.6   13.8   11.0   9.2  
 Enterprise SAS/FC 2.5 inch   1,965.2   196.5   39.3   19.7   9.8   7.9   6.6  
 LTO-5   23,652.6   2,365.3   473.1   236.5   118.3   94.6   78.8  
 Some Enterprise SAS SSDs   7,884.2   788.4   157.7   78.8   39.4   31.5   26.3  
 Enterprise Tape   1,379,737.1   137,973.7   27,594.7   13,797.4   6,898.7   5,518.9   4,599.1  
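
The multi-device columns follow from the single-device column by simple division. The sketch below assumes that every device is read in parallel at its full sustained rate, so the aggregate read rate scales linearly with the device count; that assumption is mine, but it is consistent with how the values in the table fall off. The function name is hypothetical.

```python
# Single-device hours to reach the hard error rate, taken from Table 2.
SINGLE_DEVICE_HOURS = {
    "Consumer SATA": 50.9,
    "Enterprise SATA": 301.0,
    "Enterprise SAS/FC 3.5 inch": 2759.5,
    "Enterprise SAS/FC 2.5 inch": 1965.2,
    "LTO-5": 23652.6,
    "Some Enterprise SAS SSDs": 7884.2,
    "Enterprise Tape": 1379737.1,
}

DEVICE_COUNTS = (1, 10, 50, 100, 200, 250, 300)


def hours_to_hard_error(single_device_hours, device_count):
    """Hours until the hard error rate is reached with N devices.

    Assumption: all devices are read in parallel at full speed, so the
    time shrinks in proportion to the number of devices.
    """
    if device_count < 1:
        raise ValueError("device_count must be at least 1")
    return single_device_hours / device_count


if __name__ == "__main__":
    for device, hours in SINGLE_DEVICE_HOURS.items():
        row = [hours_to_hard_error(hours, n) for n in DEVICE_COUNTS]
        print(f"{device:28s}", "  ".join(f"{h:>12,.1f}" for h in row))
```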

These two tables illustrate that if you use enterprise SATA or SAS disks, the upper bound on the amount of data you can read before you encounter an unreadable sector is around 1 PB and the lower bound is about 110 TB (consumer SATA drops to roughly 11 TB). Even with this data, it is very difficult to estimate when you will actually encounter a hard error because the answer depends so heavily on the configuration.

Encountering a hard error usually means that the RAID controller flags the drive as bad and then starts a rebuild. To get access to the data you tried to read, you need to wait for the rebuild to finish. The amount of time to complete a rebuild can be fairly long, but again depends on the configuration.

During this time, all of the remaining drives have to be read, increasing the probability of encountering another hard read error and possibly losing the RAID group. Consequently, you really need to have more than one copy of the data in the archive.
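
To get a feel for the rebuild risk, here is a rough sketch. It assumes a deliberately simple model in which every bit read fails independently with probability 1 divided by the hard error rate, and it uses a hypothetical RAID group of eight 4 TB drives where the seven survivors must be read in full. None of these configuration numbers come from Henry's article; they are only for illustration.

```python
import math


def prob_hard_error(bytes_read, bits_per_error):
    """Probability of at least one hard error while reading bytes_read.

    Assumption: each bit fails independently with probability
    1 / bits_per_error. Real drives do not behave this neatly.
    """
    bits_read = bytes_read * 8
    # 1 - (1 - p)**n, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_error))


# Hypothetical RAID group: 8 drives of 4 TB each, one has failed,
# so the rebuild must read the 7 remaining drives end to end.
DRIVE_BYTES = 4 * 10**12
SURVIVING_DRIVES = 7
REBUILD_BYTES = DRIVE_BYTES * SURVIVING_DRIVES

if __name__ == "__main__":
    for label, rate in [("SATA Consumer", 10**14),
                        ("SATA Enterprise", 10**15),
                        ("Enterprise SAS/FC", 10**16)]:
        p = prob_hard_error(REBUILD_BYTES, rate)
        print(f"{label:20s} chance of a second hard error: {p:.1%}")
```

Under these assumptions, the chance of a second hard error during the rebuild is high for consumer SATA and falls by roughly an order of magnitude with each step up in drive class, which is the reasoning behind keeping more than one copy.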

The number of copies also remains an open question, as Henry's article points out. You need at least two copies and may even need three (I haven't heard of more than three copies yet unless it is a globally distributed archive). This means you need two to three times the capacity of the archive in storage hardware alone. If you want a 1 PB archive, you will need 2-3 PB of storage capacity.
