Improving System Availability With SANs Page 2
For true fault tolerance, multiple paths must be connected to alternate locations within the SAN, or to different switches in a multi-switch fabric, or to different blades within a core fabric switch, or even to different switch modules of an integrated fabric. To provide full redundancy, some companies choose a dual-SAN configuration. Server-based path-failover software typically allows a dual-active configuration for dividing workload between multiple HBAs. The software monitors the "health" of available storage products, servers, and physical paths, and automatically reroutes data traffic to an alternate path when failures occur.
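The dual-active behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular path-failover product works: the `Path` and `MultipathSelector` names, the round-robin policy, and the HBA labels are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One physical route from a server HBA into the SAN (hypothetical model)."""
    name: str
    healthy: bool = True


class MultipathSelector:
    """Spreads I/O round-robin over healthy paths and skips failed ones."""

    def __init__(self, paths):
        self.paths = list(paths)
        self._next = 0  # index of the path to try first on the next request

    def _set_health(self, name, healthy):
        for path in self.paths:
            if path.name == name:
                path.healthy = healthy

    def mark_failed(self, name):
        # In a real product a health monitor would call this; here it is manual.
        self._set_health(name, False)

    def mark_restored(self, name):
        self._set_health(name, True)

    def select(self):
        """Return the next healthy path, rerouting around any failed path."""
        count = len(self.paths)
        for offset in range(count):
            index = (self._next + offset) % count
            if self.paths[index].healthy:
                self._next = (index + 1) % count
                return self.paths[index]
        raise RuntimeError("no healthy path to storage")


if __name__ == "__main__":
    selector = MultipathSelector([Path("hba0"), Path("hba1")])
    print([selector.select().name for _ in range(4)])  # workload is shared
    selector.mark_failed("hba0")                       # simulate a cable break
    print([selector.select().name for _ in range(4)])  # traffic moves to hba1
```

While both paths are healthy the workload alternates between the two HBAs; once one is marked failed, every request is routed over the surviving path until the failed one is restored.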
Many of today's storage devices feature multiple connections to the SAN, a critical requirement for fault tolerance in storage solutions. This guards against failures from a damaged cable, controller, or SAN component such as a Gigabit Interface Converter (GBIC) optical module. Mirroring storage subsystems on a peer-to-peer basis across the fabric also creates highly available storage connections for fault tolerance. Combining mirroring with switch-based routing algorithms (to route traffic around path breaks within the SAN fabric) creates a resilient, self-healing environment for the most demanding enterprise storage requirements. The mirrored subsystems provide an alternate access point to data regardless of path conditions.
Putting it All Together in the SAN Infrastructure
The SAN infrastructure itself is one of the storage network's most critical components for ensuring system availability. Fibre Channel switches are extremely reliable, especially when they feature hot-pluggable, redundant power supplies and cooling, plus hot-pluggable GBIC modules that enable single-port optics replacement without impacting other working devices.
The industry's top switches have "five nines" availability and include redundant components that further increase system uptime. Integrated fabric products go a step further by combining highly reliable switch modules, redundant-path architectures, and path-failure rerouting within the fabric. The integrated fabric can be incorporated into a larger SAN either as a core connectivity point or as an edge solution to address higher port-count requirements.
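To put "five nines" in concrete terms, the arithmetic can be sketched as follows. The function names are made up for this example, a year is taken as 365.25 days, and the redundancy formula assumes component failures are independent, which real installations only approximate.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year length


def downtime_minutes_per_year(availability):
    """Expected yearly downtime for a given availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def redundant_availability(*components):
    """Availability when any one of the redundant components is enough.

    Assumes failures are independent: the group is down only when
    every component is down at the same time.
    """
    all_down = 1.0
    for availability in components:
        all_down *= 1.0 - availability
    return 1.0 - all_down


def chained_availability(*components):
    """Availability when every component in the chain must be up."""
    all_up = 1.0
    for availability in components:
        all_up *= availability
    return all_up


if __name__ == "__main__":
    # "Five nines" is 99.999 percent, or roughly 5.26 minutes down per year.
    print(round(downtime_minutes_per_year(0.99999), 2))
    # Two redundant 99.9 percent components, either of which can carry the load.
    print(redundant_availability(0.999, 0.999))
    # Three five-nines devices in a row, all required for the path to work.
    print(chained_availability(0.99999, 0.99999, 0.99999))
```

Under these assumptions, redundancy multiplies the unavailability of each component together, while chaining devices in a single path multiplies their availabilities, which is why a path built only from highly available parts is still less available than any one of them.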
As scalability requirements grow, companies can incorporate higher-end 2 Gbps core products that improve scalability, management, and multiprotocol support (see Fig. 1). Core switches also protect investments by supporting future Fibre Channel technologies and alternative edge technologies. Plus, they add capabilities like frame filtering, which centralizes zoning to the logical unit (LUN) of the storage, improves storage resource sharing, and provides advanced performance monitoring and improved security.
Networking in the Fabric
Networks go beyond redundancy to provide a more resilient infrastructure than is possible with single-point products. With an infrastructure of switches, administrators can grow their network to meet high port-count needs. Networking a fabric of switches increases availability, design flexibility, and "pay-as-you-grow" scalability.
One of the easiest ways to increase availability in the SAN is to use a meshed-tree networking topology. In this topology, devices are connected to edge switches, which are then connected to central interconnecting switches, which are in turn connected to other parts of the SAN or other devices. The network can be scaled to provide higher bandwidth and redundant connectivity.
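One way to reason about such a topology is to model it as a graph and ask whether any single switch, if it failed, would cut a server off from its storage. The sketch below does that with a breadth-first search. The switch and device names and the specific link layout are invented for illustration and do not describe any particular product or reference design.

```python
from collections import deque


def build_links(edges):
    """Turn a list of (a, b) cable pairs into an undirected adjacency map."""
    links = {}
    for a, b in edges:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def reachable(links, start, removed=None):
    """Breadth-first search from start, treating one node as failed."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links.get(node, ()):
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def single_points_of_failure(links, server, storage, switches):
    """Switches whose failure alone disconnects the server from storage."""
    return [s for s in switches
            if storage not in reachable(links, server, removed=s)]


# Hypothetical meshed tree: dual-attached devices, every edge switch
# linked to both core switches.
SWITCHES = ["edge1", "edge2", "core1", "core2", "edge3", "edge4"]
MESHED = [
    ("server", "edge1"), ("server", "edge2"),
    ("edge1", "core1"), ("edge1", "core2"),
    ("edge2", "core1"), ("edge2", "core2"),
    ("core1", "edge3"), ("core1", "edge4"),
    ("core2", "edge3"), ("core2", "edge4"),
    ("edge3", "storage"), ("edge4", "storage"),
]

if __name__ == "__main__":
    meshed = build_links(MESHED)
    print(single_points_of_failure(meshed, "server", "storage", SWITCHES))
    # Same fabric, but the server has only one connection.
    single = build_links([e for e in MESHED if e != ("server", "edge2")])
    print(single_points_of_failure(single, "server", "storage", SWITCHES))
```

In the fully meshed layout no single switch is critical; removing the server's second connection makes its one remaining edge switch a single point of failure, even though the rest of the fabric is still redundant.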
Although high-availability devices within an enterprise SAN contribute to the overall availability of the entire system, they do not guarantee it. True high availability can only be achieved through an end-to-end system composed of highly available components and devices, as well as fault-tolerant capabilities. Dual attachment of servers and storage devices to a single fabric enables workload sharing while avoiding system disruption from any single failure. Because the switches run their own independent firmware and don't share memory, the risk of a single switch impacting the entire network is reduced.
To help prevent localized failures from impacting the entire fabric, SAN sections can be isolated through the use of zoning, in which defined zones limit access between devices within the SAN fabric. This is especially effective as companies build larger SANs with heterogeneous operating systems and storage systems. Companies can specify different availability criteria at the connection, node, and network level to address the potential impact of certain types of outages. Zoning limits the types of device interactions that might cause failures. Hardware zoning provides the most secure method, especially when hardware is available across the enterprise fabric. Software zoning provides a more flexible but less secure approach.
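The access rule behind zoning can be expressed very simply: two devices may interact only if at least one zone contains both of them. The following sketch models that rule only. The zone and device names are hypothetical, and it does not represent how any switch vendor stores or enforces zones.

```python
# Hypothetical zone set: each zone is a named group of fabric members.
ZONES = {
    "zone_windows": {"win_server", "array_a_port0"},
    "zone_unix": {"unix_server", "array_a_port1", "tape_library"},
}


def can_communicate(zones, first, second):
    """True if some zone contains both devices."""
    return any(first in members and second in members
               for members in zones.values())


def visible_devices(zones, device):
    """Every other device that shares at least one zone with this one."""
    visible = set()
    for members in zones.values():
        if device in members:
            visible |= members
    visible.discard(device)
    return visible


if __name__ == "__main__":
    print(can_communicate(ZONES, "win_server", "array_a_port0"))
    print(can_communicate(ZONES, "win_server", "tape_library"))
    print(sorted(visible_devices(ZONES, "unix_server")))
```

Here the Windows server can reach its own array port but cannot see the tape library, which belongs only to the UNIX zone, so a misbehaving host in one zone has no path to devices in the other.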
The Key to High System Availability
Achieving higher availability through redundancy and fault tolerance begins with a thorough understanding of specific system uptime requirements and the design of a solution that addresses them. Complete system outages can only be avoided by eliminating all potential single points of failure through redundancy of components, devices, connections, and paths. Multiple connectivity paths, clustering techniques, and dual fabrics all contribute to a fault-tolerant solution, and by physically separating devices, administrators can protect against localized physical disasters. Additionally, networks of fabric switches and core switches are less vulnerable to localized disasters that might impact the fault tolerance of the entire system. Together, these measures help organizations become more efficient and reliable and achieve true high system availability.
About the Author: Derek Granath is director of product marketing at Brocade and is responsible for product management of the Brocade switching platforms. During his career at Brocade, Granath has been instrumental in defining and launching several new product lines for the entry-level and enterprise market segments. Granath holds a BS in Electrical Engineering from Stanford University and an MBA from Santa Clara University.