Your main data processing facility suffers a catastrophe, but it's not a total disaster because your corporate data is safely backed up in a cloud storage facility on the other side of the continent. But here's the problem: how are you going to restore terabytes of information from the cloud to your local storage systems? Moving vast amounts of data long distances over TCP/IP networks takes time, and every minute that your systems are down costs you money.
Using techniques like caching and application protocol optimization, WAN optimization controllers (WOCs) can dramatically improve the performance of applications over WANs. But when it comes to pulling huge volumes of data over a WAN from one storage system to another, caching and application protocol optimization can't help significantly. Something more specialized is required.
NetEx, a Minneapolis-based company that was spun out of StorageTek in 1999, believes it has the answer. Its HyperIP solution is derived from code originally developed by IBM (NYSE: IBM) and Network Systems Corporation for moving large amounts of data over satellite links.
"We've modified this code to run with TCP-based storage apps," said Robert MacIntyre, a NetEx vice president. "Unlike conventional WOCs, we don't look at application protocols, and we don't do caching. We concentrate on mitigating the problems posed by long-distance WAN links, which are packet loss and latency."
HyperIP is available as an appliance, or more commonly as a virtual appliance that runs on a server with VMware's (NYSE: VMW) ESX hypervisor. One of these servers is needed at each end of a WAN connection, connected to the local area network. Storage-based apps like an EMC (NYSE: EMC) array or a NAS head "would point traffic at the HyperIP server, and that takes the traffic, sends it across the WAN to another HyperIP server using a HyperIP connection, and then passes it on to the LAN as TCP traffic at the other end," said MacIntyre.
The company claims that its technology also boosts performance in Data Domain (NASDAQ: DDUP) replication and deduplication environments, and HyperIP has been qualified with a number of other replication solutions, including those from EMC, VMware, Dell's (NASDAQ: DELL) EqualLogic line, FalconStor (NASDAQ: FALC), DataCore, NetApp (NASDAQ: NTAP) and Veeam, among others.
How does it work? Essentially, HyperIP uses four key techniques:
- Local TCP acknowledgements
- Efficient bandwidth utilization
- Mitigating the effect of packet loss
- Mitigating the effect of latency
Local TCP Acknowledgements
When a storage application sends data over the WAN, the local HyperIP server (rather than a remote server at the other end of the WAN) generates TCP acknowledgements to ensure that the TCP window stays open and the data keeps flowing. This is similar to the way many WOCs accelerate common protocols, effectively shielding the source server from any packet loss or latency problems that may occur on the WAN.
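A rough way to see why local acknowledgements matter: a sender that can keep only one window of unacknowledged data in flight cannot send faster than the window size divided by the time it waits for an acknowledgement. The Python sketch below applies that general rule. The 65,535-byte window, the 1 ms LAN round trip and the reading of the article's 60 ms latency figure as a round-trip time are illustrative assumptions, not NetEx measurements.

```python
def window_limited_rate_bps(window_bytes: int, ack_rtt_s: float) -> float:
    """Upper bound on send rate when one window is acknowledged per round trip."""
    return window_bytes * 8 / ack_rtt_s


WINDOW_BYTES = 65_535  # assumed: classic TCP window without window scaling
WAN_RTT_S = 0.060      # assumed: acknowledgements return from the far end of the WAN
LAN_RTT_S = 0.001      # assumed: acknowledgements return from a local HyperIP server

remote_ack_bps = window_limited_rate_bps(WINDOW_BYTES, WAN_RTT_S)
local_ack_bps = window_limited_rate_bps(WINDOW_BYTES, LAN_RTT_S)

print(f"Remote acknowledgements: {remote_ack_bps / 1e6:.1f} Mb/s ceiling")
print(f"Local acknowledgements:  {local_ack_bps / 1e6:.1f} Mb/s ceiling")
```

In this simplified model the source server is no longer paced by WAN round trips. The WAN link still has to carry the data, which is where the remaining techniques come in.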
Efficient Bandwidth Utilization
TCP applications sharing limited bandwidth run as fast as they can until packet loss occurs, at which point performance begins to degrade. When some applications stop sending data, there is some delay before other applications take up the extra capacity that has become available.
HyperIP overcomes this by managing packet streams over the WAN, aggregating TCP packet streams from the LAN and sending them over the WAN using the HyperIP protocol. When new applications start, the traffic they generate is inserted into the HyperIP connection without causing congestion, and when the traffic stops, the newly freed-up bandwidth is immediately reclaimed for other applications to use.
HyperIP also uses block data compression (as opposed to individual packet compression) to reduce the number of packets being sent. Compression is automatically deactivated for data that is not compressible enough and for which the compression overhead outweighs the benefits of the compression itself.
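The idea of switching compression off for data that does not compress well can be sketched in a few lines of Python. This is only an illustration: zlib stands in for whatever algorithm HyperIP actually uses, and the function names and the 10 percent savings threshold are hypothetical, since the article does not describe them.

```python
import os
import zlib
from typing import Tuple


def encode_block(block: bytes, min_saving: float = 0.10) -> Tuple[bool, bytes]:
    """Compress a block, but send it raw if compression saves too little.

    min_saving is a hypothetical threshold: the fraction of the original
    size that compression must remove to be considered worthwhile.
    """
    compressed = zlib.compress(block)
    if len(compressed) <= len(block) * (1 - min_saving):
        return True, compressed
    return False, block


def decode_block(is_compressed: bool, payload: bytes) -> bytes:
    """Reverse encode_block using the flag sent alongside the payload."""
    return zlib.decompress(payload) if is_compressed else payload


# A block built from many small, repetitive writes compresses well...
text_block = b"backup record 0001;" * 2000
# ...while random bytes (similar to encrypted or already-compressed data) do not.
random_block = os.urandom(4096)

for name, block in (("repetitive", text_block), ("random", random_block)):
    flag, payload = encode_block(block)
    print(f"{name}: compressed={flag}, {len(block)} -> {len(payload)} bytes")
```

Compressing a large block rather than each packet separately gives the compressor more repeated content to work with, which is why the sketch operates on whole blocks.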
Mitigating the Effect of Packet Loss
TCP always interprets packet loss as a sign of congestion and ultimately slows transmission, even though on long WAN connections packet loss may in fact be caused by data corruption. HyperIP is designed to react differently to packet loss to take this into account, resulting in transmission rates being maintained close to the capacity of the WAN link, even when packet loss is common.
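To get a feel for how hard packet loss hits ordinary TCP on a long link, the widely cited Mathis approximation estimates steady-state throughput of a Reno-style TCP flow as (MSS / RTT) × (1.22 / √p), where p is the packet loss rate. The Python sketch below uses it purely as an illustration of standard TCP behavior. It says nothing about HyperIP's own algorithm, and the 1,460-byte segment size and 60 ms round-trip time are assumptions.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Approximate steady-state throughput of a single Reno-style TCP flow."""
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)


MSS_BYTES = 1460  # assumed: typical segment size on an Ethernet path
RTT_S = 0.060     # assumed: round-trip time on a long WAN link

for loss_rate in (0.0001, 0.001, 0.01):
    rate = mathis_throughput_bps(MSS_BYTES, RTT_S, loss_rate)
    print(f"loss {loss_rate:.2%}: about {rate / 1e6:.1f} Mb/s per flow")
```

Under these assumptions a single flow manages only a few megabits per second once loss reaches 0.1 percent, regardless of how much bandwidth the link offers.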
Mitigating the Effect of Latency
Latency on a long WAN link can be very high (perhaps 60ms for a 3,000-mile link). To use the link to its full capacity using TCP, it is necessary to have large TCP window sizes to "fill the pipe," which involves TCP stack tuning that affects all application connections. With a HyperIP connection, HyperIP takes control of window size adjustments. By monitoring round-trip times, bandwidth capacity and transmission rates, it can calculate the capacity of the connection and adjust the HyperIP window size accordingly to ensure the connection is utilized as fully as possible.
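The window needed to "fill the pipe" is the bandwidth-delay product: link bandwidth multiplied by round-trip time. The short Python calculation below shows the arithmetic for the two link speeds mentioned in this article. It assumes the 60 ms figure is a round-trip time and uses the nominal T1 rate of 1.544 Mb/s; it is a back-of-the-envelope sketch, not a description of how HyperIP sizes its own window.

```python
def bandwidth_delay_product_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a link of this speed and RTT full."""
    return bandwidth_bps * rtt_s / 8


RTT_S = 0.060  # assumed: the 60 ms latency is treated as round-trip time

links_bps = {
    "T1 (1.544 Mb/s)": 1.544e6,
    "800 Mb/s": 800e6,
}

for name, bandwidth_bps in links_bps.items():
    window = bandwidth_delay_product_bytes(bandwidth_bps, RTT_S)
    print(f"{name}: window of about {window:,.0f} bytes needed")
```

A TCP window without the window scaling option tops out at 65,535 bytes, which covers the T1 case but falls far short of the roughly 6 MB needed at 800 Mb/s. That gap is why stack tuning, or a protocol that manages its own window, is needed on fast long-distance links.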
That's the theory, but how well does it work in practice? HyperIP generally improves data rates by a factor of between two and 30, claims MacIntyre. "Typically with TCP, you get 20 percent to 30 percent data throughput because a lot of the capacity is used up by protocol traffic. With HyperIP we get up to 90 percent throughput," he said. The biggest improvements, of up to 30 times more throughput, are achievable when packet loss mitigation becomes significant, he said.
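To put those utilization figures in terms of recovery time, the Python sketch below estimates how long a restore would take at different levels of link utilization. The 1 TB data set and the 100 Mb/s link are illustrative assumptions; the 25 percent and 90 percent figures are taken from MacIntyre's claims above and have not been independently measured.

```python
def transfer_time_hours(data_bytes: float, link_bps: float, utilization: float) -> float:
    """Hours needed to move data_bytes over a link at the given utilization (0 to 1)."""
    effective_bps = link_bps * utilization
    return data_bytes * 8 / effective_bps / 3600


DATA_BYTES = 1e12  # assumed: 1 TB to restore
LINK_BPS = 100e6   # assumed: 100 Mb/s WAN link

for label, utilization in (("plain TCP (claimed)", 0.25), ("HyperIP (claimed)", 0.90)):
    hours = transfer_time_hours(DATA_BYTES, LINK_BPS, utilization)
    print(f"{label}: about {hours:.1f} hours at {utilization:.0%} utilization")
```

On these assumptions the restore drops from roughly 89 hours to roughly 25 hours, which is the kind of difference that matters when every minute of downtime costs money.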
NetEx prices HyperIP according to the bandwidth of the link that it controls. For example, HyperIP software to govern a T1 link is currently priced at $1,700 per copy (and a minimum of two copies is needed). HyperIP hardware appliances are more expensive. As customers' requirements grow, they can pay to unlock the software to govern higher-bandwidth links, up to a maximum of 800Mb/s.
For emergency data recovery situations, when a company needs to restore data from another site or a cloud storage facility in as short a time as possible, the company offers a Recovery on Demand feature. This allows customers to unlock the software so it can be used with any available bandwidth for a period of ten days at no extra charge. After that period, the HyperIP software is once again locked to the bandwidth that has been paid for.
With cloud storage becoming more popular (or at least more talked about) by the day, the need to move large amounts of data over long distances is likely to become increasingly common. And that means that demand for HyperIP, or similar solutions, is likely to see strong growth over the coming months and years.
Article courtesy of Enterprise Networking Planet