Book Excerpt: Building SANs with Brocade Fabric Switches Page 6


By Josh Judd, Chris Beauchamp, Alex Neefus and Benjamin F. Kuo

How Much Downtime Is Acceptable to Production Components During Implementation?

It will likely be necessary to shut down some existing production devices during implementation, to ensure a safe transition onto the SAN. For example, you might have to shut down a host to install an HBA. Determine how much downtime is acceptable for each host, and at what times this can occur. Generally, you should try to schedule more downtime than you think you need, so that any unforeseen issues that arise during the implementation can be handled within the downtime window.

How Much Downtime Is Acceptable for Routine Maintenance? How Much Downtime Is Acceptable for Upgrades and Architectural Changes?

These two questions are intimately related, because, to an end user, there is really no difference between downtime to a production system for maintenance, and downtime for an upgrade. Once systems are in production, you will want to keep them running as much as possible.

Many upgrades can be accomplished with zero downtime by using a double- or triple-redundant fabric architecture. Even so, no matter how well you plan the upgrade and maintenance processes beforehand, you will occasionally need to shut down specific hosts. For example, you might want to upgrade an HBA driver, which would typically require a reboot.

Note: Wherever possible, a redundant fabric architecture should be used. This will ensure the best performance and reliability, and will simplify maintenance tasks. In a redundant fabric architecture, every host has at least two paths to every storage device it connects to, and these paths traverse two completely unconnected fabrics. While it might appear on the surface to be more expensive, if hosts are to be dual-attached anyway, it is actually less expensive to attach them to two separate fabrics than to use one larger fabric, or a director-class switch. This does not even include the downtime ROI calculation, which, in high-availability environments, will usually overshadow the entire cost of the SAN. More details about redundant and resilient fabrics are provided in Chapter 7.
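The redundancy rule in the note (every host reaches each of its storage devices through at least two fabrics) can be checked mechanically against an attachment list. The following is a minimal sketch, not a Brocade tool: the host, fabric, and array names are hypothetical, and it assumes the fabrics listed really are physically unconnected, which the script cannot verify.

```python
from collections import defaultdict

# Hypothetical attachment records: (host, fabric, storage device reached via that fabric).
PATHS = [
    ("db01", "fabric_a", "array1"),
    ("db01", "fabric_b", "array1"),
    ("web01", "fabric_a", "array1"),  # single-attached: should be flagged
]


def pairs_lacking_redundancy(paths):
    """Return sorted (host, storage) pairs reachable through fewer than two fabrics."""
    fabrics_seen = defaultdict(set)
    for host, fabric, storage in paths:
        fabrics_seen[(host, storage)].add(fabric)
    return sorted(pair for pair, fabrics in fabrics_seen.items() if len(fabrics) < 2)


if __name__ == "__main__":
    for host, storage in pairs_lacking_redundancy(PATHS):
        print(f"{host} has only one fabric path to {storage}")
```

Any pair the check reports is a host that would lose access to that storage device whenever its single fabric is taken down for maintenance.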

You should therefore determine in advance when you will be able to schedule downtime for every host and storage array, and for the fabric itself. You might not have to use every scheduled outage, but having them available to you when you do need them is essential.

One way to do this is to make a list of applications and services provided by the hosts on the SAN, and determine an owner for each. Take your list of SAN devices and map these devices to the applications and services they affect. This will provide a mapping of application/service owners, who are typically responsible for scheduling downtime, to devices that typically require downtime. Have each owner approve the downtime calendar for each device that affects his or her service.

The mapping of owners to devices should be kept up to date as changes in personnel, applications, and/or SAN infrastructure occur.
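As an illustration of the owner-to-device mapping described above, here is a minimal sketch that derives, for each device, the owners who must approve its downtime calendar. The application names, owner names, and device names are all hypothetical; in practice this data would come from your own inventory.

```python
# Hypothetical inventory: application/service -> owner.
APP_OWNERS = {"customer_db": "alice", "email": "bob"}

# Hypothetical inventory: SAN device -> applications/services it affects.
DEVICE_APPS = {
    "array1": ["customer_db", "email"],
    "db01": ["customer_db"],
    "switch_a1": ["customer_db", "email"],
}


def approvers_by_device(device_apps, app_owners):
    """Map each device to the sorted list of owners who must approve its downtime."""
    approvers = {}
    for device, apps in device_apps.items():
        missing = [app for app in apps if app not in app_owners]
        if missing:
            # An application without an owner means nobody can approve the outage.
            raise KeyError(f"{device}: no owner recorded for {missing}")
        approvers[device] = sorted({app_owners[app] for app in apps})
    return approvers


if __name__ == "__main__":
    for device, owners in approvers_by_device(DEVICE_APPS, APP_OWNERS).items():
        print(f"{device}: approval needed from {', '.join(owners)}")
```

Raising an error for an application with no recorded owner is a deliberate choice in this sketch: it surfaces stale entries when personnel or applications change.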

When Do You Need Each Piece of the Solution to Be Complete?

Once you have a table detailing which of the initiators communicate with which targets, you can begin to create a timeline for the project. Other members of the core team will tell you something like, "the customer database application must be online by mid-June." It is your task to define which SAN components you need to accomplish this, and to develop a timeline for adding these components that meets their requirements.
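One way to turn application deadlines into a component timeline is to give each SAN component the earliest deadline of any application that depends on it. The sketch below assumes that simple rule; the dates, application names, and component names are hypothetical, and it ignores lead times for ordering and testing, which a real plan would subtract.

```python
from datetime import date

# Hypothetical deadlines supplied by the core team.
APP_DEADLINES = {
    "customer_db": date(2024, 6, 15),
    "backup": date(2024, 8, 1),
}

# Hypothetical mapping: application -> SAN components it needs.
APP_COMPONENTS = {
    "customer_db": ["db01_hba", "fabric_a", "array1"],
    "backup": ["tape_lib", "fabric_a"],
}


def component_deadlines(app_deadlines, app_components):
    """Return component -> earliest deadline among the applications that need it."""
    needed_by = {}
    for app, components in app_components.items():
        deadline = app_deadlines[app]
        for component in components:
            if component not in needed_by or deadline < needed_by[component]:
                needed_by[component] = deadline
    return needed_by


if __name__ == "__main__":
    # Print components in the order they must be ready.
    schedule = component_deadlines(APP_DEADLINES, APP_COMPONENTS)
    for component, deadline in sorted(schedule.items(), key=lambda kv: kv[1]):
        print(f"{deadline.isoformat()}  {component}")
```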

This is a high-level list of some of the questions that should appear on a SAN design interview form:

  • What overall business problem are you trying to solve?

  • What are the business requirements of the solution?

  • What is known about the nodes that will attach to the SAN?

  • Which SAN-enabled application do you have in mind?

  • Which components of the solution already exist?

  • Which components are already in production?

  • Which elements of the solution need to be prototyped and tested?

  • What equipment will be available for testing?

  • How and when are backups to be done?

  • What will the traffic patterns in the solution be?

  • What do we know about current performance characteristics?

  • What do we know about future performance characteristics?

  • How much downtime is acceptable to production components during implementation?

  • How much downtime is acceptable for routine maintenance?

  • How much downtime is acceptable for upgrades and architectural changes?

  • When do you need each piece of the solution to be complete?

Conduct a Physical Assessment

You should now have the location of every piece of hardware that currently exists. In addition, you should know where each piece of hardware in the eventual SAN will be located.

Look at each piece of hardware. Make sure that it does exist, and has all necessary pieces to function. This could include things like power cords, keyboard, mouse, monitor, Ethernet card, Ethernet cable, HBAs, and Fibre Channel cables. Note the physical dimensions of the hardware, and its power/cooling requirements. Does it rack mount? Does it have a network interface? How many Fibre Channel interfaces does it have? How much does it weigh? You should already have this information from the interview process, but you should verify that the information you were given is correct.

Go to each location where SAN equipment or nodes will be installed, and again check to see that your information was correct. Notice how the equipment will fit into the space available. Notice how the equipment will enter the building. You should also have a meeting with the person in charge of the facility to discuss power, cooling, and equipment locations.
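The figures gathered during the assessment can be totaled per location before the facility meeting. This is a minimal sketch under assumed units (rack units, watts, kilograms); the record fields, equipment names, and numbers are hypothetical and are not taken from any vendor specification.

```python
from dataclasses import dataclass


@dataclass
class Equipment:
    name: str
    location: str      # rack or room where the item will be installed
    rack_units: int    # 0 for items that do not rack mount
    watts: float       # power draw as verified during the assessment
    weight_kg: float
    fc_ports: int      # number of Fibre Channel interfaces


def totals_by_location(items):
    """Sum rack space, power, weight, and Fibre Channel ports for each location."""
    totals = {}
    for item in items:
        t = totals.setdefault(
            item.location,
            {"rack_units": 0, "watts": 0.0, "weight_kg": 0.0, "fc_ports": 0},
        )
        t["rack_units"] += item.rack_units
        t["watts"] += item.watts
        t["weight_kg"] += item.weight_kg
        t["fc_ports"] += item.fc_ports
    return totals


if __name__ == "__main__":
    # Hypothetical inventory entries.
    inventory = [
        Equipment("switch_a1", "rack_12", 1, 300.0, 10.0, 16),
        Equipment("array1", "rack_12", 4, 450.0, 40.0, 4),
        Equipment("db01", "rack_14", 2, 500.0, 25.0, 2),
    ]
    for location, t in totals_by_location(inventory).items():
        print(location, t)
```

The per-location totals give the facility manager concrete numbers for the power, cooling, and floor-loading discussion.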

    Building SANs with Brocade Fabric Switches

    Josh Judd is a Senior SAN Architect with Brocade Communications Systems, Inc. In addition to writing technical literature, he provides senior-level strategic support for major OEMs and end-users of Brocade storage network products worldwide. Chris Beauchamp (Brocade CFP, CSD) is a Senior SAN Architect for Brocade Communication Systems, Inc. Chris focuses on SAN design and architecture, with an emphasis on scalability and troubleshooting. Alex Neefus is the Lead Interoperability Test Engineer at Lamprey Networks, Inc. Alex has worked on developing testing tools for the SANmark program hosted by the FCIA. Benjamin F. Kuo is a Software Development Manager at TROIKA Networks. Headquartered in Westlake Village, CA, TROIKA Networks is a developer of Fibre Channel Host Bus Adapters, dynamic multipathing, and management software for Storage Area Networks.
