Going the Distance for Disaster Recovery


As a subset of more comprehensive business continuance strategies, disaster recovery (DR) focuses on data integrity and availability when an outage occurs. The potential sources of disruption or outage directly shape our view of what steps must be taken to resume business operations.

Years ago, for example, local tape backup with off-site vaulting was considered adequate protection against outages. If a storage system failed, tapes would simply be reloaded for data restoration to disk. For higher availability, it was also common practice to mirror data between two local disk systems. If the primary failed, operations would be shifted to the secondary array.

These tactics assumed that a disruption would occur from fairly innocuous sources: a primary array taken off-line for maintenance, an unexpected hardware failure, operator error, or other local occurrence. No one was thinking that entire metropolitan areas or regions would suffer disruption due to massive power failures or terrorist attacks. Unfortunately, that age of innocence is now ancient history, and the hard new reality is forcing a fundamental reexamination of disaster recovery strategies.

The sphere of potential disruption has now expanded well beyond the local data center, beyond metropolitan boundaries, and beyond entire regional geographies. It is no longer sufficient to rely on local tape backup, local disk mirroring, site-to-site replication within a city, or data replication within a region.

Although many financial institutions in New York City still rely on recovery sites in New Jersey, the terrorist attacks of 9/11, the anthrax scare that followed, and the more recent Northeast power blackout demonstrated that 20 miles across the Hudson River does not ensure the safety of corporate data. Companies must now think about securing their data assets far from their primary production centers, spanning hundreds or thousands of miles to safer havens.

The Need for Long-Distance DR Strategies

Previously, the inherent distance limitations of Fibre Channel created a barrier to long-distance data replication for disaster recovery. Fibre Channel fabrics connected by dark fiber and DWDM (dense wavelength division multiplexing) can only span metropolitan distances, typically well under 100 miles. This made it difficult to conceive of more comprehensive DR solutions that would cross regional or national boundaries.

New IP SAN technology, however, has abolished those limits, with demonstrated connectivity over thousands of miles. In addition, some IP SAN products provide multi-point routing capability, so that multiple regional data centers can be integrated in a single DR configuration. These new solutions open the door to enterprise-wide storage strategies, including global connectivity for multi-national companies that must secure highly dispersed data assets.

Removing obstacles to infrastructure expansion, however, is only part of the equation. DR applications such as synchronous data replication are sensitive to latency, which is tied directly to distance. Speed-of-light propagation dictates about one millisecond of latency for every 100 miles. A thousand-mile span between primary and DR sites, for example, would inject roughly 10 milliseconds of latency each way, or 20 milliseconds round trip, before any equipment or protocol delay is counted.
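
The arithmetic can be sketched in a few lines of Python. This is only a rough illustration built on the rule of thumb quoted here (about one millisecond per 100 miles); the function names are hypothetical, and the serialized-write estimate assumes that each synchronous write waits one full round trip before the next is issued, which real replication products may not do. Real links also add switching and protocol delay on top of propagation.

```python
# Rule of thumb from the article: roughly 1 ms of one-way latency per 100 miles.
MS_PER_100_MILES = 1.0


def one_way_latency_ms(distance_miles: float) -> float:
    """Estimate one-way propagation latency in milliseconds."""
    return distance_miles / 100.0 * MS_PER_100_MILES


def round_trip_latency_ms(distance_miles: float) -> float:
    """Estimate round-trip propagation latency in milliseconds."""
    return 2 * one_way_latency_ms(distance_miles)


def max_serial_sync_writes_per_sec(distance_miles: float) -> float:
    """Upper bound on strictly serialized synchronous writes per second.

    Assumption (not from the article): each write must wait one full
    round trip for the remote acknowledgment before the next one starts.
    """
    return 1000.0 / round_trip_latency_ms(distance_miles)


if __name__ == "__main__":
    for miles in (20, 200, 1000):
        print(
            f"{miles:>5} miles: "
            f"{one_way_latency_ms(miles):.1f} ms one way, "
            f"{round_trip_latency_ms(miles):.1f} ms round trip, "
            f"at most {max_serial_sync_writes_per_sec(miles):.0f} serialized writes/s"
        )
```

Under these assumptions, stretching a synchronous link from 200 to 1,000 miles cuts the ceiling on strictly serialized writes by a factor of five, which is why distance recommendations for synchronous replication are conservative.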

For synchronous applications, vendors typically recommend a maximum of 150-200 miles between sites. In practice, customers have pushed synchronous data replication to more than twice that distance, despite the lack of vendor support for such configurations.

Asynchronous data replication, in contrast, is highly tolerant of latency and can be driven across thousands of miles. The tradeoff between distance and data integrity in this case is the possibility that transactions committed at the production site, but not yet copied to the remote site, may be lost if a failure occurs. For highly mission-critical applications, synchronous replication is preferred, even at the sacrifice of distance. One compromise is to implement synchronous data replication to a regional facility, followed by asynchronous replication between the regional facility and a more distant DR site.
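
The difference in exposure between the two modes can be illustrated with a toy model. The sketch below is not any vendor's replication engine; the class `ReplicatedVolume` and its methods are hypothetical names, and the model ignores latency, ordering guarantees, and link failures. It only shows which writes would be missing at the DR site if the primary failed at a given moment.

```python
from collections import deque


class ReplicatedVolume:
    """Toy model of a primary volume replicating writes to a DR site."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = []          # writes committed at the production site
        self.remote = []           # writes committed at the DR site
        self.in_flight = deque()   # async writes queued but not yet shipped

    def write(self, record):
        """Commit a write locally and replicate it according to the mode."""
        self.primary.append(record)
        if self.synchronous:
            # Synchronous: the write completes only once the remote copy exists.
            self.remote.append(record)
        else:
            # Asynchronous: acknowledge at once, ship to the DR site later.
            self.in_flight.append(record)

    def drain(self, count=None):
        """Ship up to `count` queued writes to the DR site (all if None)."""
        pending = len(self.in_flight)
        n = pending if count is None else min(count, pending)
        for _ in range(n):
            self.remote.append(self.in_flight.popleft())

    def lost_on_primary_failure(self):
        """Writes acknowledged at the primary but absent at the DR site."""
        return list(self.in_flight)


if __name__ == "__main__":
    sync_vol = ReplicatedVolume(synchronous=True)
    async_vol = ReplicatedVolume(synchronous=False)
    for txn in range(5):
        sync_vol.write(txn)
        async_vol.write(txn)
    async_vol.drain(3)  # only the first three writes reached the DR site
    print("sync lost: ", sync_vol.lost_on_primary_failure())
    print("async lost:", async_vol.lost_on_primary_failure())
```

In the cascaded compromise described above, the regional hop would behave like the synchronous volume and the long-distance hop like the asynchronous one, so the exposure is limited to whatever is still queued at the regional facility.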

