Over Memorial Day weekend, I experienced what every user of a computer system fears the most: a hard drive crash. For the next few hours, I hoped that it was some sort of OS error and I did not have to worry about data restoration. I turned out to be wrong and needed a complete restore.
Not to worry, my system administrator said. All you have to do is install your recovery CD and our backup/restore software and, like magic, the system will be restored.
So I ran to the office and then back home to reload Windows and start the process of recovering my system. After 36 hours of downloading at cable modem speeds, the system said it was ready to reboot and I would get all my data back. A new hard drive would arrive the next day, so I thought I would be able to reload again with the CD-ROMs that were on order and be all set. No loss of productivity for this consultant.
My hopes were high — until I got the message “Cannot Restore System Please Reinstall.” I called our system administrator at home (even though it was 4:45 A.M. — after all, there’s nothing more worthless than a consultant without data), and he said he did not understand why it did not work, since he had tested the process a few months earlier as part of the final decision to buy the package. He said he would get back to me ASAP, but in the meantime I could get access to my data via a Web interface.
What do all of these problems with a small company have to do with storage users? This is not an isolated problem — it happens all the time to all sorts of users, and while you usually can recover your data after some effort, it always takes way more time than expected. That’s why it’s important to make sure that your backup system works before you need it.
Where We Went Wrong
The problem we found even in my small company is that testing restoration of data is difficult and costly. It is usually done once and then forgotten.
In our case, we were evaluating different backup/restoration options for employees who travel. We did some significant backup and restore testing, but when we installed the final version of the software, we did not test it again. It appears that a simple parameter was not set correctly, so we could not do an automatic restoration. We could get our physical data back, but we could not restore the machine state. In my case, we kept beating our heads against the wall trying to restore the machine state, but it wasn’t going to happen. It took more than two weeks to get answers from the company handling our backup/restore environment. Fortunately, once the new disk drive showed up, I restored my system and my data myself.
So what did I learn from this experience, from both a policy and a professional point of view?
I already knew the following:
What I learned was:
Recommendations
While it would be nice to blame vendors for everything, we have to take some responsibility ourselves. So here is a checklist of items to consider for backup/restore environments and why they should be considered:
Sooner or later, your hardware or software is going to break down and you are going to need to restore your data. You could send your bad hard drive to one of the data recovery services that take the drive apart and read the data block by block. This works pretty well, but it is very expensive. Backups are your most effective method of ensuring that your data is not lost, but unless you test your restore process, you do not know whether your backups are worth anything or whether you can meet your restoration requirements.
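One inexpensive way to test a restore is to recover the data into a scratch directory and compare it, file by file, against the original. The following is a minimal sketch of that idea, not a description of any particular backup product: the function names are hypothetical, and it assumes both the source tree and the restored copy are reachable as ordinary directories. It checks file contents only, so it says nothing about whether the machine state can be restored, which was the part that failed in my case.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its content digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(source: Path, restored: Path) -> dict[str, list[str]]:
    """Compare a restored tree against its source (hypothetical helper).

    Returns the relative paths that are missing from the restore, present
    only in the restore, or present in both with different contents.
    """
    src = tree_digests(source)
    dst = tree_digests(restored)
    return {
        "missing": sorted(src.keys() - dst.keys()),
        "extra": sorted(dst.keys() - src.keys()),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }
```

Calling `verify_restore(Path("/data"), Path("/scratch/restore-test"))` (both paths are placeholders) returns three lists. A restore test passes only when "missing" and "changed" are empty, and running it on a schedule rather than once is what keeps the result meaningful.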
I was down for the better part of a week, and for a consultant that can be a lifetime. Imagine being a tax accountant whose system crashes on April 10 and who loses a week, or any other business hit by a failure at a time-critical moment. The restoration process and procedures must be tested no matter what the cost, since the alternative could threaten the survival of your business.
Henry Newman, a regular Enterprise Storage Forum contributor, is an industry consultant with 25 years of experience in high-performance computing and storage.
Henry Newman has been a contributor to TechnologyAdvice websites for more than 20 years. His career in high-performance computing, storage and security dates to the early 1980s, when Cray was the name of a supercomputing company rather than an entry in Urban Dictionary. After nearly four decades of architecting IT systems, he recently retired as CTO of a storage company’s Federal group, but he rather quickly lost a bet that he wouldn't be able to stay retired by taking a consulting gig in his first month of retirement.