Once you've decided how you want to replicate data, you must decide how to transport the data to the other site, either via WAN and IP or dark fiber and a Fibre Channel connection.
Most dark fiber solutions use a Fibre Channel connection from the back end of the RAID controller to a Fibre Channel switch on the host side, then to another Fibre Channel switch on the mirror side, and then to the RAID. The switches are there because Fibre Channel needs buffer credits to keep the channel operating efficiently, and switches have more buffer credits. Each buffer credit basically lets the sender have one more frame in flight before it must wait for an acknowledgement. (See Resolving Finger-Pointing in Storage.)
Given the latency for acknowledgements at the Fibre Channel layer, you need about 120 buffer credits to keep the channel full over 50 km with 2Gb Fibre Channel. These are rules of thumb, and the real numbers depend on the measured latency in your Fibre Channel network. Since most Fibre Channel switch vendors have at most 256 buffer credits per port on their switches, this is a real problem if you want or need to mirror over long distances. Some new vendors such as Celion have placed thousands of buffer credits on ports to allow high-performance transmissions at huge distances (greater than 3,000 km).
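The arithmetic behind that rule of thumb can be sketched in a few lines of Python. The first two functions simply scale the 120-credits-per-50-km figure quoted above. The third estimates credits from a measured round-trip latency, treating each credit as one frame in flight. Its defaults (a full-size 2,148-byte frame and a nominal 200 MB/s for 2Gb Fibre Channel) and all function names are assumptions for illustration, not vendor figures.

```python
import math

# Rule of thumb quoted in the text: about 120 credits for 50 km at 2Gb FC.
RULE_CREDITS = 120
RULE_KM = 50


def credits_for_distance(distance_km):
    """Scale the rule of thumb linearly with distance."""
    return math.ceil(distance_km * RULE_CREDITS / RULE_KM)


def max_distance_km(port_credits):
    """Distance a port's credits can cover under the same rule of thumb."""
    return port_credits * RULE_KM / RULE_CREDITS


def credits_from_latency(round_trip_us, frame_bytes=2148, data_rate_mb_s=200.0):
    """Credits needed to keep sending during one measured round trip.

    1 MB/s equals 1 byte per microsecond, so frame_bytes / data_rate_mb_s
    is the time in microseconds needed to put one frame on the wire.
    The frame size and data rate defaults are assumptions.
    """
    frame_time_us = frame_bytes / data_rate_mb_s
    return math.ceil(round_trip_us / frame_time_us)
```

Under the rule of thumb, a 256-credit port covers roughly 107 km, and a 3,000 km link would need about 7,200 credits. The latency-based estimate gives lower counts for full-size frames; smaller frames or extra latency in the switches push the number up, which is why measuring your own network matters.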
So dark fiber is an option, but you must ensure that your hardware matches the distance of your architecture. The biggest advantage of dark fiber and SCSI over Fibre Channel is the lack of protocol translation. Since you are effectively running the native protocol that the RAID is running, the latency and overhead of translation are not a concern, as they can be with a TCP/IP-based solution.
TCP/IP has a similar hardware solution. From the back end of the RAID, you run a Fibre Channel connection into a hardware Fibre Channel-to-IP converter. The converter connects to the WAN, and a matching IP-to-Fibre Channel converter on the other end connects back to the RAID at the mirror site.
One of the big advantages here is that most of these converters can implement data compression. The Fibre Channel method cannot. Also, many large WAN routers support encryption, which is required in many environments.
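To get a feel for how much compression can help, here is a rough sketch in Python. The WAN rate, the compression ratio, and the function name are illustrative assumptions rather than figures from any particular converter, and the estimate ignores protocol translation overhead. The 200 MB/s cap is the nominal data rate of a 2Gb Fibre Channel link feeding the converter.

```python
def effective_mb_s(wan_mbit_s, compression_ratio, fc_limit_mb_s=200.0):
    """Approximate replication throughput in MB/s as seen by the RAID.

    wan_mbit_s        -- raw WAN bandwidth in megabits per second (assumed)
    compression_ratio -- e.g. 2.0 for 2:1 compression (assumed, data dependent)
    fc_limit_mb_s     -- cap imposed by the Fibre Channel side of the converter
    """
    wan_mb_s = wan_mbit_s / 8.0  # megabits -> megabytes per second
    return min(wan_mb_s * compression_ratio, fc_limit_mb_s)
```

For example, a 155 Mbit/s WAN link with data that compresses 2:1 would deliver about 38.75 MB/s of replicated data under these assumptions, still well short of what a 2Gb Fibre Channel link over dark fiber can carry.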
The tradeoffs are pretty straightforward:
| |TCP/IP over WAN|Fibre Channel over dark fiber|
|---|---|---|
|Performance|Depends on data compression, but higher protocol translation overhead|2 Gb, if you have enough buffer credits|
|Latency|Yes, but much of this is handled by TCP/IP|Yes; distances over 100 km generally require special hardware, since switches do not have enough buffer credits|
Moore's Law Meets Network Limits
Replication of data is becoming the rule instead of the exception. Two factors complicate this, and both will likely get worse:
- Network performance is not increasing at the same rate as CPU performance, and these CPUs are generating more data.
- Network performance is not even increasing at the rate of disk density gains.
I believe the company formerly known as JNI made its 2Gb FC card available in September 2000. Since that time we have gone from 36GB FC drives to an announcement from HDS for 300GB FC drives. As you can see, the numbers are out of whack, and they are not getting any better.
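A quick calculation shows how far out of whack the numbers are. The sketch below, in Python, estimates how long it takes to copy one full drive across a single 2Gb Fibre Channel link. It assumes a nominal 200 MB/s for the link, decimal gigabytes, and no protocol overhead; the function name is made up for illustration.

```python
def full_copy_seconds(capacity_gb, link_mb_s=200.0):
    """Best-case time to move one full drive across the link.

    Assumes 1 GB = 1000 MB and a link that runs flat out at link_mb_s.
    """
    return capacity_gb * 1000.0 / link_mb_s


old_drive = full_copy_seconds(36)    # 36GB FC drive
new_drive = full_copy_seconds(300)   # 300GB FC drive
growth = new_drive / old_drive       # same ratio as the capacity growth
```

Under these assumptions a 36GB drive takes about 3 minutes to copy and a 300GB drive about 25 minutes, more than eight times as long over the same 2Gb link.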
What does all of this mean? Without careful planning, a good understanding of the requirements, and the right hardware and software, you can expect problems, if not outright failure. Any project of this magnitude requires an overall architect who has end-to-end responsibility; you cannot architect remote mirroring in pieces and expect it to work. Next time we will cover replication for HSM-based solutions.