Today, what reassures us in our private lives can challenge us in our enterprise lives.
In our private lives, we're relieved to know that new government regulations mandate security for information about us. For example, think about HIPAA (Health Insurance Portability and Accountability Act) and how it requires the medical clinic where your doctor practices to keep a tight rein on who sees your medical records, even as the clinic communicates with your nationwide pharmacy and your health insurance company.
As contributors to the enterprise, however, we might be in charge of meeting those same government regulations. For some of us, it's not enough to run a series of snapshot backups anymore. We need to keep an exact copy of specific data, ensure only those authorized have access to it, provide fast access where required, prevent proliferation of extra copies, and destroy all copies on a mandated schedule.
As another real-world example, consider check imaging for your bank, where the data about your checks must be accurate, unchangeable, and secure for seven years, then guaranteed destroyed. In other words, the data must be authentic for the duration of its specific life expectancy and then be gone.
Information lifecycle management (ILM) was born from this context.
So Much Data, So Little Time
We're generating, using, and storing astonishing amounts of data, amounts that are growing at exponential rates. The challenge to enterprise data storage managers is to provide authorized users fast access to the ever-growing data, to protect it from unauthorized access, to destroy it on a mandated schedule, and to do all this without spending the enterprise's entire budget.
This challenge is especially daunting for managers of data that remains static once it has been generated, but that is also required to be stored, with a clean audit trail, for a specific amount of time, and then guaranteed destroyed. Typical applications that generate large volumes of this type of data are document imaging, medical imaging, check imaging, and email archiving.
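The requirements above — store once, audit every access, destroy on schedule — can be sketched as a simple record type. This is a minimal illustration, not any vendor's implementation; the class and field names are hypothetical:

```python
from datetime import date, timedelta

class RetainedRecord:
    """Hypothetical WORM-style record: written once, readable until
    its retention period ends, then due for guaranteed destruction."""

    def __init__(self, record_id: str, created: date, retention_years: int):
        self.record_id = record_id
        self.created = created
        # Approximate the retention window (365 days per year).
        self.destroy_after = created + timedelta(days=365 * retention_years)
        self.audit_trail = [(created, "created")]

    def log_access(self, when: date, who: str) -> None:
        # Every read is appended to the audit trail.
        self.audit_trail.append((when, f"read by {who}"))

    def is_due_for_destruction(self, today: date) -> bool:
        return today >= self.destroy_after

# A check image retained for seven years:
img = RetainedRecord("check-0001", date(2018, 1, 15), retention_years=7)
img.log_access(date(2018, 2, 1), "auditor")
print(img.is_due_for_destruction(date(2024, 12, 31)))  # still inside the window
print(img.is_due_for_destruction(date(2025, 1, 15)))   # past seven years
```

A production system would also have to prevent updates to the stored bytes and prove the destruction actually happened; the sketch only captures the lifecycle dates and the audit trail.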
When data must be analyzed and studied, it is often critical that its original form be preserved for future comparison and reference. At other times, the data consists of vast image files or other large files that are accessed frequently in the first month after generation, but rarely again afterward.
And then there's some data that must not be stored. Sarbanes-Oxley tells information managers that keeping everything generated is not an option; some data must not be retained at all. The IT manager must sort, store, and retire files according to their contents.
Conventional backup approaches capture data in a set process. For data that does not and should not change, the system administrator could be performing a full backup of the same data every day for the retention period.
For example, think again of your check images, copied daily for seven years. With tapes being recycled as part of a regular backup process, your bank may lose the audit trail, or the original images themselves may be lost. Neither is a good scenario for check images.
And conventional backup doesn't limit or control copies. For example, if the IT staff doesn't get the tape back from the warehouse, they use a new one and wind up with another copy of the data. At the end of the retention period, such as seven years for check images, it becomes impossible to guarantee all the copies have been rounded up and destroyed.
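The waste described above is easy to quantify. A back-of-the-envelope sketch, assuming a hypothetical 1 TB static data set given a full backup every day for the seven-year retention period (all figures are illustrative assumptions, not measurements):

```python
# Back-of-the-envelope cost of daily full backups of static data.
dataset_tb = 1.0          # assumed size of the static data set, in TB
retention_years = 7       # e.g., the check-image retention period
backups_per_year = 365    # one full backup per day

total_copies = retention_years * backups_per_year
total_stored_tb = total_copies * dataset_tb

print(total_copies)      # 2555 redundant copies of unchanging data
print(total_stored_tb)   # 2555.0 TB consumed to protect 1 TB of content
```

Every one of those copies is also a copy that must be found and destroyed at the end of the retention period, which is the audit problem the article describes.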
The enterprise's very real budget constraints also weigh on this challenge. If money were no object, it would be fairly easy to keep everything online on disk, providing near-immediate access to the data, and daily snapshots would guard against catastrophic loss.
If all this makes you squirm, reality figures large in your life.