Resilient storage infrastructures are not exclusive to large Fibre Channel storage area network (SAN) environments.
Any storage network (iSCSI or NAS), or for that matter, any storage infrastructure of any size, including direct-attached storage (DAS), can be made resilient.
If your business depends upon applications that require data and storage to be available and accessible in a timely and coherent manner, then you are a candidate for a resilient storage infrastructure. The reasons for establishing one vary with the type and size of your environment, but they come down to supporting continued accessibility or the ability to recover data to keep your business running.
Some of the drivers that are creating a need for resilient storage infrastructures include:
- More data being generated and stored longer;
- Increasing reliance upon data being available and preserved;
- Awareness of internal and external threats to data;
- Copies of data continue to increase, as do logs and journals; and
- Regulatory compliance and data protection demands.
Issues that affect resiliency include single points of failure, configuration and human errors, technology failure, and seasonal and peak workloads. Design factors to consider include on-going maintenance, upgrades, technology replacements, and interdependences of hardware, software, networks and firmware revision levels. The applicable threats and service-level requirements for your storage availability will also affect the design of your data storage infrastructure, along with growth and workload plans.
I/O performance bottlenecks are another area to look at — they can occur in the network, application, server or storage system. Some questions to ask: What's normal for your storage and data infrastructure environment? Do you know what your configuration is? Do you know how your resources are being used, by whom and when? Do you know where performance bottlenecks exist in your environment and what their effect is? You can learn more about I/O performance bottlenecks in the StorageIO group white paper Data Center I/O Performance Issues and Impacts.
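One concrete, purely illustrative way to frame the "what's normal" question is to compare current measurements against a known baseline and flag components that deviate. The sketch below does this for latency; the component names, numbers and the 2x threshold are assumptions for illustration, not output from any real monitoring tool:

```python
def find_bottlenecks(baseline_ms, current_ms, threshold=2.0):
    """Return components whose current latency exceeds threshold x their baseline.

    baseline_ms / current_ms: dicts mapping a component name to average
    latency in milliseconds. Components with no recorded baseline are skipped.
    """
    return sorted(
        component
        for component, latency in current_ms.items()
        if latency > threshold * baseline_ms.get(component, float("inf"))
    )

# Hypothetical baseline and current measurements for illustration only.
baseline = {"network": 1.0, "server": 0.5, "storage": 5.0}
current = {"network": 1.2, "server": 0.6, "storage": 14.0}

# Storage latency is roughly 2.8x its baseline, so it is flagged.
print(find_bottlenecks(baseline, current))
```

Knowing your baseline is the prerequisite: without it, a comparison like this has nothing to flag against.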
The following performance-enhancing techniques support resilient storage infrastructures:
- Caching and acceleration (bandwidth, I/O or latency) devices;
- RAID and controller optimization for performance;
- Rapid drive rebuild and copy along with dual parity (RAID 6);
- Faster storage and network interfaces, transports and protocols;
- Clustered storage (scale capacity, performance, RAS) for block and file; and
- Tiered storage, including tiered access and tiered data protection.
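Dual parity is what lets RAID 6 survive two simultaneous drive failures. The minimal sketch below (single bytes stand in for whole blocks) illustrates the underlying math only, not any production implementation: the P parity is a simple XOR across the stripe, while the Q parity is a Reed-Solomon syndrome over GF(2^8), and a single lost block can be rebuilt from P alone.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using the 0x11d polynomial common to RAID 6."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1D
        b >>= 1
    return product

def dual_parity(data):
    """Compute P (XOR) and Q (Reed-Solomon) parity for one stripe of data bytes."""
    p, q = 0, 0
    g = 1  # 2**i in GF(2^8), starting at i = 0
    for d in data:
        p ^= d
        q ^= gf_mul(d, g)
        g = gf_mul(g, 2)
    return p, q

def rebuild_one(data_with_hole, p):
    """Recover a single lost data block (marked None) from survivors plus P."""
    lost = p
    for d in data_with_hole:
        if d is not None:
            lost ^= d
    return lost

# One stripe of hypothetical data bytes.
stripe = [0x11, 0x22, 0x33, 0x44]
p, q = dual_parity(stripe)

# Lose one block; XOR of P with the survivors recovers it.
assert rebuild_one([0x11, None, 0x33, 0x44], p) == 0x22
```

Recovering from two lost blocks additionally requires solving with Q, which is where the Galois-field arithmetic earns its keep; rapid rebuild matters because the window between the first and second failure is when the stripe is exposed.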
Tiered access (how you get at and access data) includes interfaces (SAS, SCSI, SATA, Fibre Channel, Ethernet and InfiniBand) and protocols (FCP, FICON and iSCSI), block or file. Tiered protection refers to how the data will be protected, including clustering for HA, RAID for accessibility, security, backup, snapshots, and local and remote mirroring or replication. Tiered media refers to how the data is stored, such as disk or tape, RAID or JBOD, and Fibre Channel, SAS, SATA or Fibre Channel near-line (also known as FATA) disk drives.
Don’t Miss Our Free Webcast
Building Resilient, Scalable, Flexible Data Storage Infrastructures
September 18, 2006 (2 p.m. EDT, 11 a.m. PDT) Speaker: Greg Schulz, Founder and Sr. Analyst, StorageIO
As storage networks mature, discussions have shifted from reasons to deploy, initial benefits, early-adoption success stories, and general architecture and technology debates to conversations about revised best practices and scaling to meet different needs. This Webcast looks at reasons for implementing a scalable networked storage environment (SAN, NAS, CAS or IP storage). It also examines various techniques, technologies and examples to implement storage networks on both a local and wide-area basis for different-sized environments. Attendees will learn what’s involved in designing, deploying and managing a resilient networked storage environment.
Register for the Webcast
Data protection techniques include disk-to-disk data protection (3DP), backup and copy, snapshot and point-in-time (PIT) copy, local and remote mirroring and replication, and archiving for data preservation. For remote data movement and access, eliminating single points of failure entails making sure that your redundant network paths actually take divergent paths.
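To make the snapshot and point-in-time (PIT) copy idea concrete, here is a minimal, hypothetical sketch of a copy-on-write snapshot in Python. The class and method names are invented for illustration and do not reflect any vendor's implementation; the key point is that a snapshot copies nothing up front and preserves an original block only when that block is first overwritten:

```python
class CowVolume:
    """Toy volume supporting copy-on-write point-in-time (PIT) snapshots."""

    def __init__(self, nblocks):
        self.blocks = ["" for _ in range(nblocks)]  # live data
        self.snapshots = []  # one dict per snapshot: block index -> preserved original

    def snapshot(self):
        """Create a PIT view and return its id. Nothing is copied yet."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, idx, data):
        """Before overwriting, preserve the original for any snapshot that still needs it."""
        for snap in self.snapshots:
            if idx not in snap:
                snap[idx] = self.blocks[idx]
        self.blocks[idx] = data

    def read_snapshot(self, snap_id, idx):
        """Read a block as it was at snapshot time: preserved copy if any, else live data."""
        return self.snapshots[snap_id].get(idx, self.blocks[idx])

vol = CowVolume(4)
vol.write(0, "v1")
s0 = vol.snapshot()
vol.write(0, "v2")  # copy-on-write preserves "v1" for snapshot s0
assert vol.read_snapshot(s0, 0) == "v1"
assert vol.blocks[0] == "v2"
```

The same copy-on-write principle is what makes frequent PIT copies affordable: the cost is proportional to how much data changes after the snapshot, not to the size of the volume.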
You might think that using two separate bandwidth providers or carriers would ensure separate network paths. However, you may be surprised that in some cases, the same physical underlying network may be used by multiple carriers. Some technologies and techniques for spanning distances to move and access data include wide area data services (WADS), also known as wide area file services (WAFS), storage over IP, including iSCSI, iFCP, FCIP and NAS, as well as bandwidth services based upon SONET/SDH, OCx, MPLS and other network transports.
Management tools that support resilient storage infrastructures include change control, configuration coverage analyzers and verifiers, policy managers, data protection management (DPM), and event correlation analysis tools. For example, using configuration verification tools such as those from Onaro and EMC, you can gain insight into how your environment is configured, and hopefully identify potential faults before they become reality. DPM tools (not to be confused with Microsoft’s Data Protection Manager data protection tool) provide insight into how data protection tasks are performing and how resources are being used.
Scaling with stability means that as your storage infrastructure grows, you do not lose any stability in terms of performance, reliability or management due to unnecessary complexity. While your management activities will increase as your infrastructure grows, they should remain in proportion to the size of your environment and to the level of service delivered.
Some of these techniques to boost availability should be obvious, including having multiple network and storage interfaces for accessing storage resources. However, others may not be so obvious. Separate storage interfaces that converge into a single SAN switch, or into separate but interconnected SAN switches, represent a potential point of failure; two separate, isolated fabrics with their own name spaces, zones and associated configuration items do not.
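The fabric-convergence pitfall can be checked mechanically: if two supposedly redundant paths share any component, that component is a candidate single point of failure. A minimal sketch, with hypothetical device names standing in for a real path inventory:

```python
def shared_components(path_a, path_b):
    """Return components common to both paths: candidate single points of failure."""
    return set(path_a) & set(path_b)

# Hypothetical paths from a server's HBAs to a storage array's ports.
path1 = ["hba0", "switch_A", "array_port0"]
path2 = ["hba1", "switch_A", "array_port1"]  # both paths converge on switch_A

print(shared_components(path1, path2))  # switch_A is a single point of failure
```

The same check applies to the wide-area case above: two carriers whose circuits ride the same physical fiber route would show up as a shared component, if your inventory goes deep enough to record it.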
Keep in mind bandwidth requirements and needs for normal and peak seasonal application workloads. Design for performance and growth and eliminate complexity to reduce overhead and costs. The caveats of consolidation to watch out for include reduced performance due to increased capacity utilization as well as single points of failure.
To learn more about making your data storage infrastructure resilient and scaling with stability, stop by our free Sept. 18 webcast, Building Resilient, Scalable, Flexible Data Storage Infrastructures, and check out my book, "Resilient Storage Networks — Designing Flexible Scalable Data Infrastructures" (Elsevier), to learn about various design options and data protection techniques. Drop me a note at firstname.lastname@example.org with your questions or comments.
Greg Schulz is founder and senior analyst of the StorageIO group and author of "Resilient Storage Networks" (Elsevier). For more storage features, visit Enterprise Storage Forum Special Reports.