Efficient storage management includes migrating aging data through progressively less-expensive storage tiers. When data ends its migration at the cold storage stage, you can keep it for long periods of time at very low cost.
Cloud-based data storage generally falls into these four storage classes or tiers:
- Hot storage is primary storage for frequently accessed production data.
- Warm storage stores slightly aging but still active data. It costs less because the underlying storage systems don’t have the high performance and availability requirements, but it keeps data quickly accessible.
- Cool storage houses nearline data, which is less frequently accessed data that needs to stay accessible without a restore process.
- Cold storage is a backup and archival tier that stores data very cheaply for long periods of time. Restore expectations are few and far between. Security, durability and low cost characterize this tier.
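The migration through these tiers can be sketched as a simple age-based rule. This is only an illustration: the thresholds (30, 90 and 365 days since last access) and the function name `pick_tier` are hypothetical and do not come from any provider's policy.

```python
from datetime import date

# Hypothetical thresholds: maximum days since last access for each tier.
# Real lifecycle policies vary by organization and provider.
TIER_THRESHOLDS = [(30, "hot"), (90, "warm"), (365, "cool")]


def pick_tier(last_access: date, today: date) -> str:
    """Return the storage tier for data last accessed on `last_access`."""
    age_days = (today - last_access).days
    for max_age, tier in TIER_THRESHOLDS:
        if age_days < max_age:
            return tier
    # Anything older than the last threshold ends its migration in cold storage.
    return "cold"


if __name__ == "__main__":
    today = date(2018, 6, 1)
    for last in (date(2018, 5, 25), date(2018, 4, 1), date(2018, 1, 1), date(2016, 1, 1)):
        print(last, "->", pick_tier(last, today))
```

In practice, a rule like this would usually also weigh retention requirements and retrieval costs, not just age.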
Cold Storage Usage Examples
The biggest single reason for using cold storage is saving money by reducing use of hot, warm and cool storage tiers. Cold storage provides efficient, highly scalable capacity at a lower cost than any other storage tier.
For example, the healthcare industry produces massive amounts of medical images with retention requirements measured in decades. The financial industry also has steep retention requirements, in some cases up to 30 years. Many financial institutions have stored this data in tape vaults for many years, but restoring massive data sets from tape is expensive. Cold storage in the cloud retains data for long periods, and restoring the data does not require the original tape drives.
Litigation and regulatory investigations are also cold storage use cases. For example, a retail chain might store massive amounts of backup data in the cloud. One day the company is sued by a customer who slipped and fell in a store seven months ago. The business will need to search through its backups for relevant data, collect it, analyze it and provide it to the reviewers within a few weeks. This is far simpler to do on cold storage in the cloud than from massive tape collections.
A third scenario is preserving raw data for analytics and secondary applications. Massive data sets are very expensive to keep on hot or warm storage systems. Cold storage tiers keep the raw data available for occasional access at a very low cost.
Cold Storage and the Public Cloud
For many companies, cold storage in the cloud offers distinct advantages over on-premises nearline storage or tape vaulting. The public clouds are ramping up their cold storage in response. Amazon Glacier and the new Google Cloud Storage Coldline are dedicated to long-term cold storage. Azure uses its Cool Blob Storage to serve both cool and cold tiers.
The three services have a lot in common. Storage pricing is very similar. Amazon and Google both charge $0.007 (0.7 cents) per gigabyte stored per month. Azure prices by geographic region, with price points ranging between $0.01 per gigabyte and $0.024 per gigabyte for cool and hot blobs. (Cool blobs are priced at the lower end of the scale.) Data access and recovery are more expensive than simple storage, which protects the public clouds against customers using cold storage as a cheap active data tier.
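To make the pricing concrete, here is a rough sketch of the monthly storage bill for a 100 TB archive. It takes the Amazon and Google rate as $0.007 per gigabyte per month and uses the top of the Azure range ($0.024) as a stand-in for a hot tier. The archive size and the function name `monthly_storage_cost` are assumptions for illustration, and access and retrieval charges are left out.

```python
def monthly_storage_cost(gigabytes: float, price_per_gb_month: float) -> float:
    """Storage-only cost for one month; excludes access and retrieval fees."""
    return gigabytes * price_per_gb_month


# Assumed archive size: 100 TB, counted as 100,000 GB (decimal units).
ARCHIVE_GB = 100 * 1000

COLD_RATE = 0.007  # dollars per GB per month, rate quoted for Glacier and Coldline
HOT_RATE = 0.024   # dollars per GB per month, top of the quoted Azure range

cold_cost = monthly_storage_cost(ARCHIVE_GB, COLD_RATE)
hot_cost = monthly_storage_cost(ARCHIVE_GB, HOT_RATE)

if __name__ == "__main__":
    print(f"Cold tier: ${cold_cost:,.2f} per month")
    print(f"Hot tier:  ${hot_cost:,.2f} per month")
    print(f"Monthly difference: ${hot_cost - cold_cost:,.2f}")
```

At these rates the cold tier comes to under a third of the hot-tier storage bill, although frequent restores would narrow that gap.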
Durability is critical for all three services. Both Glacier and Coldline rate their durability at 11 nines (99.999999999 percent). Both services achieve this durability level by redundantly storing data across multiple domains, storage systems, and disks. Azure goes beyond 11 nines by guaranteeing 0 percent data loss for both hot and cool storage blobs.
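A quick calculation shows what 11 nines means in practice. The sketch below assumes the figure is an annual, per-object durability and that losses are independent, which is a simplification. The object count and function names are illustrative.

```python
# 11 nines of durability leaves a 1-in-100-billion chance of losing a given
# object in a year: 1 - 0.99999999999 = 1e-11 (written directly to avoid
# floating-point rounding in the subtraction).
ANNUAL_LOSS_PROBABILITY = 1e-11


def expected_annual_losses(object_count: int) -> float:
    """Expected number of objects lost per year, assuming independent losses."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_lost_object(object_count: int) -> float:
    """Average number of years between single-object losses."""
    return 1.0 / expected_annual_losses(object_count)


if __name__ == "__main__":
    objects = 10_000_000  # assumed archive of ten million objects
    print(f"Expected losses per year: {expected_annual_losses(objects):.6f}")
    print(f"Roughly one object lost every {years_per_lost_object(objects):,.0f} years")
```

Under these assumptions, an archive of ten million objects would lose a single object about once every 10,000 years on average.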
Recovery service levels differ somewhat among the three. For example, Amazon Glacier offers different service levels for restore times that range from minutes to hours, while Google Coldline and Azure Cool Blob Storage offer fast recovery in milliseconds. Not everyone needs to recover cold data that quickly, but if you do, for example to access a backup data set on short notice, the much shorter access time could prove very handy.