A disaster recovery plan is an insurance policy, of sorts. Your business needs a DR plan because a well-implemented disaster recovery plan will make your IT infrastructure whole when disaster strikes.
More than an offsite data center and a collection of tools for data recovery and getting your systems back up and running, disaster recovery—often shortened to DR—also encompasses the policies and procedures that your organization's IT workers should follow to successfully get your business back on track.
As any seasoned IT pro will tell you, disasters can take many forms. And they don't necessarily have to rise to the level of a data center-rattling earthquake or the storm of the century.
Sure, nature is responsible for its share of hurricanes, blizzards, floods, wildfires and countless other ways to interrupt a company's IT operations. But in terms of disaster recovery, people and all their foibles can fall into the same category.
Human error, improper configurations and cyber-attacks can all cause servers and other IT equipment to fail. Sometimes a disaster can be traced back to a faulty server rack, a buggy application or other mishaps.
When it comes time to craft a disaster recovery plan, IT personnel must document its scope and objectives. These vary with the severity of a disaster and from one organization to the next, even among those operating in the same industry. Still, the documentation should be clear on what it covers, from a modest fleet of desktop systems to massive data storage archives, and on how the steps it describes help meet the organization's data recovery objectives and other goals.
Effective disaster planning includes tying together many elements of the storage ecosystem.
Although disaster recovery and business continuity are sometimes used interchangeably—and they are indeed related—they serve very different purposes.
Disaster recovery is a subset of business continuity. Whereas disaster recovery is generally focused on a company's IT operations, business continuity involves the entire business or at least those functions that are critical to its ongoing operations.
A business continuity plan includes policies, procedures and contingencies that can be used to continue conducting business in the event of a disaster or other disruption. It takes more than a company's data and IT systems into account, reaching other areas that are typically outside an IT department's purview, like office space, suppliers, employees and industrial equipment.
Considering the integral role of IT in modern enterprises, disaster recovery can be considered a vital component of a business continuity plan.
Your disaster recovery plan will be heavily influenced by the IT systems and services that your business relies on. Although there is no one-size-fits-all approach, here are some factors to consider.
- Virtualization Disaster Recovery
One of the benefits of virtualization is that it can eliminate the need to recreate a physical server when something goes wrong. Placing a virtual server on reserve capacity or in the cloud is a very real possibility, making it trivially easy in some circumstances to meet your organization's recovery time objectives (RTOs).
Take stock of the virtualization platforms (VMware, Microsoft Hyper-V, Oracle VM, Citrix XenServer, etc.) used in your environment, along with the backup and recovery tools used by each, and draw up a plan to get virtual workloads up and running again.
- Network Disaster Recovery
Servers aren't the only part of an organization's IT infrastructure that may be affected by a disaster. Networks can also meet an untimely demise, which in turn can lead to failures in business applications and services that depend on reliable network connectivity.
A network disaster recovery plan often includes procedures on contacting the proper IT personnel, acquiring replacement networking equipment from vendors and other actions required to restore connectivity.
- Cloud-based Disaster Recovery
One of the most compelling reasons to include the cloud in your disaster recovery planning is the ability to use a cloud provider's data center as a recovery site without investing in additional facilities, systems and personnel. It also grants users access to cutting-edge IT capabilities, a consequence of a competitive cloud market in which AWS, Microsoft Azure, Google Cloud and others attempt to one-up each other.
There are several factors to consider before making the jump to disaster recovery as a service (DRaaS), including bandwidth, cloud storage costs, security and regulatory compliance, to name a few. As with any IT endeavor, identify the backup and recovery challenges that a third-party cloud provider may help solve along with the impact on your IT processes and budget before incorporating the cloud into your disaster recovery plan.
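To make the bandwidth factor concrete, here is a rough back-of-the-envelope sketch of how long it might take to pull a given amount of data back from a cloud provider. The function name `restore_hours` and the `efficiency` parameter are hypothetical, and the figures are illustrative only. Real transfers are also affected by latency, protocol overhead and provider throttling, which this estimate ignores.

```python
def restore_hours(data_tb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Estimate hours needed to transfer data_tb terabytes over a link.

    efficiency is a hypothetical fudge factor (0 < efficiency <= 1) for the
    share of nominal bandwidth that is actually usable.
    """
    if data_tb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data_tb must be >= 0, link_mbps > 0, 0 < efficiency <= 1")
    bits = data_tb * 8e12  # decimal terabytes: 1 TB = 10**12 bytes = 8 * 10**12 bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Illustrative scenario: 10 TB over a 1 Gbps link at full nominal speed.
    print(f"{restore_hours(10, 1000):.1f} hours")
```

Comparing an estimate like this against your recovery time objectives is one quick way to see whether a cloud recovery site is realistic for a given data set.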
- Data Center Disaster Recovery
A data center disaster recovery plan extends well beyond the IT systems housed in a computing facility. It involves the building itself, utility providers, backup power, physical security, fire suppression, HVAC (heating, ventilation and air conditioning), support personnel, and much more.
Preparing for a data center outage, outright damage to a data center building, intruders and other risks is essential and often requires the input of your company's IT teams, facilities management personnel and physical security experts.
Now it's time to get started creating a disaster recovery strategy.
- Complete a risk assessment
A key step is to create a risk assessment that details the likelihood of each kind of disaster and the risk it poses to your organization. This will help you prioritize your efforts. Although you may rarely face a hurricane or feel the earth tremble, there's a good chance that you'll experience a server failure or a cyber-attacker targeting your network. Plan accordingly.
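One simple way to turn a risk assessment into a priority list is to rate each risk for likelihood and impact and then multiply the two. The sketch below assumes a 1-to-5 scale for both, which is a common convention but not the only one. The `Risk` class, the `prioritize` function and all of the ratings are hypothetical examples, not values from any real assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def score(self) -> int:
        # Likelihood-times-impact scoring; other weighting schemes exist.
        return self.likelihood * self.impact


def prioritize(risks: list[Risk]) -> list[Risk]:
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Illustrative ratings only; real numbers come from your own assessment.
RISKS = [
    Risk("hurricane", likelihood=1, impact=5),
    Risk("server failure", likelihood=4, impact=3),
    Risk("cyber-attack", likelihood=4, impact=4),
]

if __name__ == "__main__":
    for risk in prioritize(RISKS):
        print(f"{risk.name}: {risk.score}")
```

With these made-up ratings, the rare but severe hurricane ranks below the everyday server failure, which mirrors the advice above to plan for what is most likely to happen.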
- Collect data and organize a plan
Gather the information needed to create a disaster recovery plan. This may include an inventory of your servers and storage systems, network diagrams, data center blueprints and floorplans, key personnel, emergency contact numbers, backup and recovery procedures and workflows, third-party services, support information and more, depending on your specific setup.
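If the information you gather is tracked in a structured form, a short script can flag what is still missing before anyone starts writing. The sketch below is one possible approach. The section names in `REQUIRED_SECTIONS` and the `missing_sections` helper are hypothetical and should be adapted to your own setup.

```python
# Hypothetical checklist based on the kinds of information listed above.
REQUIRED_SECTIONS = [
    "server_inventory",
    "storage_inventory",
    "network_diagrams",
    "key_personnel",
    "emergency_contacts",
    "backup_procedures",
]


def missing_sections(collected: dict[str, list[str]]) -> list[str]:
    """Return the required sections that are absent or still empty."""
    return [name for name in REQUIRED_SECTIONS if not collected.get(name)]


if __name__ == "__main__":
    # Illustrative data: only two sections have been filled in so far.
    collected = {
        "server_inventory": ["db01", "web01"],
        "emergency_contacts": ["on-call phone list"],
        "network_diagrams": [],
    }
    for name in missing_sections(collected):
        print(f"Still needed: {name}")
```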
Now comes the most crucial part: documenting it.
Assuming you haven't hired a disaster recovery consultant and are using in-house personnel to write a disaster recovery plan, it may be helpful to search out a disaster recovery template from an authoritative source. Even if you deviate from the template, it will acquaint you with how to use a methodical approach to problem solving, write for an intended skill level, avoid glaring omissions and provide an actionable, step-by-step guide of your own.
- Test your disaster recovery plan
After that's done, it's time for disaster recovery testing.
Like fire drills, where employees file out of an office building as if it were actually on fire, disaster recovery testing simulates an IT mishap. However, pulling the plug on a critical server or cranking up the heat in a data center is generally ill-advised. Luckily, there are other ways to determine if your plan will work.
Disaster recovery testing can involve tabletop tests where recovery procedures are discussed and evaluated without physically taking the actions described in the document. Businesses can also conduct hands-on technical tests where participants are tasked with restoring a system, helping them gauge their preparedness.
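One way to gauge preparedness after a hands-on test is to compare how long each restore actually took with the recovery time objective for that system. The following is a minimal sketch under that assumption. The `rto_misses` function, the system names and the timings are all hypothetical.

```python
from datetime import timedelta


def rto_misses(
    measured: dict[str, timedelta], targets: dict[str, timedelta]
) -> dict[str, timedelta]:
    """Return, per system, how far a measured restore time exceeded its RTO.

    Systems that met their target, or that have no target defined, are omitted.
    """
    return {
        system: took - targets[system]
        for system, took in measured.items()
        if system in targets and took > targets[system]
    }


# Illustrative drill results and targets; not real measurements.
DRILL_RESULTS = {"mail": timedelta(hours=5), "crm": timedelta(hours=1)}
RTO_TARGETS = {"mail": timedelta(hours=4), "crm": timedelta(hours=2)}

if __name__ == "__main__":
    for system, overrun in rto_misses(DRILL_RESULTS, RTO_TARGETS).items():
        print(f"{system} missed its RTO by {overrun}")
```

Any system that shows up in the result is a candidate for a revised procedure, more practice or a more realistic objective.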
Finally, your disaster recovery plan should be a "living document," of sorts.
Routinely update it to account for changes to your infrastructure, technology updates, mergers and acquisitions and the many other factors that affect your IT environment. Be sure to update your testing procedures after significant changes.
And don't forget your most valuable resource: people.
Identify the employees who will be put in charge when a crisis erupts and match skillsets to the affected systems and technologies. Remember to keep your employee information current. The best-laid plans will fall apart if your workers are left scrambling to find someone who can help.
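Matching skillsets to systems can be checked mechanically once both are written down. The sketch below assumes each system needs a single named skill and each employee has a set of skills, which is a simplification. The `uncovered_systems` function and every name in the example data are hypothetical.

```python
def uncovered_systems(
    required: dict[str, str], staff: dict[str, set[str]]
) -> list[str]:
    """Return systems whose required skill nobody on staff currently has."""
    available = set().union(*staff.values())
    return sorted(
        system for system, skill in required.items() if skill not in available
    )


# Illustrative data only: system -> required skill, employee -> skills held.
REQUIRED_SKILLS = {
    "virtual servers": "hypervisor administration",
    "core network": "network engineering",
    "backup archive": "tape restore",
}
STAFF_SKILLS = {
    "Alex": {"hypervisor administration", "storage"},
    "Sam": {"network engineering"},
}

if __name__ == "__main__":
    for system in uncovered_systems(REQUIRED_SKILLS, STAFF_SKILLS):
        print(f"No one is currently assigned who can recover: {system}")
```

Rerunning a check like this whenever people join, leave or change roles is one way to keep that employee information current.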