Weigh the Factors In Your SAN-UPS Juggling Act
Effective next-generation enterprises must address the known powering needs and problems of current and past storage area networks (SANs). Despite revolutionary changes in SAN hardware and products, the design of the Uninterruptible Power Supply (UPS) infrastructure for enterprises has changed very little since 1965. Although SAN hardware has always required electrical power, the way that SAN systems are deployed today has created new power-related problems that were not foreseen when the powering principles for the modern data center were developed over 30 years ago.
With the preceding in mind, Part II continues the UPS for SAN hardware theme by discussing a categorized and prioritized collection of SAN hardware powering needs and problems; avoiding costs from over-sizing a UPS system; UPS lifecycle cost imperatives; UPS rack powering options; power monitoring software; UPS adaptability/scalability, availability, manageability, and serviceability imperatives; and, next generation UPS systems for SAN hardware. Let's look at SAN hardware powering needs and problems first.
SAN Hardware Powering Needs And Problems
The recent surge in SAN usage has been accompanied by an equally large demand for high-quality UPS power to feed the evolving SAN hardware infrastructure. SAN hardware power consumption is now growing by hundreds of megawatts per week, taxing the already stressed electrical grid. The requirement for a continuous source of high-quality power forces critical SAN hardware infrastructures to rely on the quality of their internal UPS systems. The caliber of these systems is often a defining factor for users, who increasingly feel the ill effects of power-related problems that can cost millions of dollars per incident.
As SANs continue to evolve, their criticality and reliability will take on new importance. UPS quality is the number one issue affecting SAN hardware reliability today. The critical UPS requirements of typical SAN enterprises are increasing almost exponentially, along with the consequences of a power interruption.
The Enterprise UPS Imperatives
Because many critical enterprises require more power and higher levels of quality and reliability, a UPS is necessary to ensure a stable power environment. With that in mind, let's now look at the following core enterprise UPS imperatives:
- Lifecycle Costs.
- Adaptability Or Scalability.
- Maintenance Or Serviceability.
UPS Lifecycle Costs
Lifecycle cost is the most important of the UPS imperatives. The power requirement of a typical enterprise installation is usually planned to increase linearly from the actual startup requirement and to reach the design power capacity halfway through the expected lifecycle.
Avoiding Costs From Over-Sizing A UPS System
The lifecycle costs associated with over-sizing a UPS system can be separated into two parts: the capital costs and the operating costs.
The excess capacity translates directly to excess capital costs. In addition to the costs associated with the UPS system, excess capital costs include infrastructure such as raised floors, as well as cooling system infrastructure.
A typical 100kW enterprise installation costs on the order of $500,000, or $5 per watt. This analysis indicates that on the order of 70%, or $350,000, of this investment is wasted. In the early years, this waste is even greater. When the time cost of money is figured in, the typical loss due to over-sizing nearly equals 100% of the entire capital cost of the SAN hardware! That is, the interest alone on the original capital is almost capable of paying for the actual capital requirement.
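The capital figures above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers quoted in the text; the function name is hypothetical and chosen for illustration.

```python
# Sketch of the capital-waste arithmetic, using only figures quoted in the text.
# The function name is hypothetical.

def oversizing_capital_waste(capital_cost, wasted_fraction):
    """Return the capital tied up in capacity that is never used."""
    return capital_cost * wasted_fraction

capacity_kw = 100         # installation size, from the text
capital_cost = 500_000    # dollars, from the text
wasted_fraction = 0.70    # share of the investment that is wasted, from the text

# Unit cost: total capital divided by capacity expressed in watts.
cost_per_watt = capital_cost / (capacity_kw * 1000)
wasted = oversizing_capital_waste(capital_cost, wasted_fraction)

print(f"Cost per watt:  ${cost_per_watt:.2f}")
print(f"Wasted capital: ${wasted:,.0f}")
```

The same function can be reused with a different wasted fraction to see how quickly the stranded capital shrinks as the system is sized closer to the actual load.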
The excess lifecycle costs associated with over-sizing also include the expenses of operating the facility. These costs include maintenance contracts, consumables, and electricity. Maintenance costs are typically slightly less than the capital cost over the lifetime of the SAN hardware, when the equipment is maintained per the manufacturer's instructions. Since over-sizing gives rise to under-utilized equipment that must be maintained, a large fraction of the maintenance costs is wasted. In the case of the 100kW SAN hardware example, this wasted cost is on the order of $250,000 over the system lifetime.
Excess electricity costs are significant when the SAN hardware UPS system is oversized. The idling loss of a SAN hardware UPS system is on the order of 4% of the power rating. When cooling costs are factored in, this becomes 8%. For a 100kW SAN hardware installation that is oversized to typical values, with a nameplate rating in excess of the design rating, the wasted electricity over the 10-year system lifetime is on the order of 600,000 kWh, equating to on the order of $30,000.
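A rough sketch of how an estimate of this size can arise is shown below. It assumes, for illustration only, that the 8% idling-plus-cooling loss is charged against the 70% of capacity that goes unused; the text does not state the exact method. The electricity price is the one implied by the text's own figures ($30,000 for 600,000 kWh). The function name is hypothetical.

```python
# Sketch of the wasted-electricity estimate. The attribution of losses to the
# unused 70% of capacity is an assumption, not the article's stated method.

HOURS_PER_YEAR = 365 * 24

def wasted_electricity_kwh(nameplate_kw, loss_fraction, excess_fraction, years):
    """Energy lost to idling and cooling on the unused share of capacity."""
    wasted_kw = nameplate_kw * loss_fraction * excess_fraction
    return wasted_kw * HOURS_PER_YEAR * years

kwh = wasted_electricity_kwh(
    nameplate_kw=100,      # from the text
    loss_fraction=0.08,    # idling loss plus cooling, from the text
    excess_fraction=0.70,  # assumed: the unused share of capacity
    years=10,              # system lifetime, from the text
)

# Price implied by the text's figures: $30,000 / 600,000 kWh.
price_per_kwh = 30_000 / 600_000
cost = kwh * price_per_kwh

print(f"Wasted energy: {kwh:,.0f} kWh")
print(f"Wasted cost:   ${cost:,.0f}")
```

Under these assumptions the result lands in the same order of magnitude as the figures quoted above. A nameplate rating above 100kW, as the text describes, would push the estimate higher.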
The total excess costs over the lifetime of the SAN hardware will on average be around 70% of the system cost. This represents an entitlement that could theoretically be recovered if the SAN hardware infrastructure could adapt and change to meet the actual requirement.
For many enterprises, the waste of capital and expense dollars becomes a lost opportunity cost, which can be many times larger than the out-of-pocket cost. For example, Internet hosting enterprises have failed when the unutilized capital tied up in one installation prevented its deployment in another opportunity.
It is very costly to increase UPS capacity partway through the SAN hardware lifecycle. The work associated with increasing SAN hardware UPS capacity during the lifecycle creates a large and unacceptable risk of creating downtime.
All of the engineering and planning for the ultimate SAN hardware UPS capacity must be done up-front. The load requirement of the SAN hardware will increase, but this increase cannot be reliably predicted. The result of the preceding assumptions is that SAN hardware is planned, engineered, and built out up-front to meet an unknown need, and the UPS capacity of the SAN hardware is planned conservatively, to the high side of any reasonable growth scenario.
UPS Adaptability Or Scalability
The solutions required for UPS adaptability or scalability have much in common with those required for controlling lifecycle costs. In particular, pre-engineered, standardized, and modular solutions are needed.
Many issues of UPS adaptability or scalability relate to the architecture of the UPS distribution system to the rack. A brief discussion of this subject follows.
UPS Rack Powering Options
As SAN hardware is changed, the UPS requirement, which also includes the voltage requirement, the redundancy requirement, and the connector requirement, is often changed as well. Also, as rack enclosures have become the standard means for housing and organizing computing and communication systems, the UPS distribution system for the rack enclosure must adapt to these changing requirements.
The UPS requirements of modern computing equipment vary as a function of time, depending on the computational load. The implementation of power management technologies in processors, servers, and nearly all SAN hardware has introduced a substantial variation in power consumption in response to the computing load. This variation can be as high as 200% of the baseline power consumption of the SAN hardware. The UPS distribution system design for a rack enclosure must accommodate this variation.
The most common approach today is to design, engineer, and install UPS solutions specific to a rack enclosure. Should the requirements for that rack enclosure change, an alternative UPS solution must be designed, engineered, and installed. While this approach can accommodate any unique UPS requirement, it involves significant planning and engineering. Rack enclosures are usually fed from a common power distribution panel within the enterprise. In most instances, this panel cannot be de-energized in order to adapt a rack enclosure's UPS distribution system (for example, to install another breaker). The resulting "hot work" not only introduces a very serious safety hazard, but also carries a high risk of creating a fault in the circuit being worked on or of dislodging or faulting adjacent wiring circuits. Such errors result in undesirable downtime.
Ideally, the rack enclosure UPS system would be adaptable to any realistically possible combination of equipment, on demand, without the need to perform any work that would be a hazard to safety or that might adversely affect system availability.
In addition to the capability of the adaptable rack enclosure UPS system to respond quickly and economically to change, there are cycle time and cost advantages associated with the initial installation of the system, including a dramatic simplification to the up-front engineering and installation work associated with SAN hardware design. Furthermore, the ability to adapt the rack enclosure UPS system can allow the system to be "right sized" to the actual load requirement and grow with expanding needs. The economic benefits of rightsizing can be well over 50% of the lifecycle cost of SAN hardware as previously discussed.
Human error is commonly the dominant problem relating to UPS availability. Over 50% of all load drop events in SAN hardware are caused by human error. IT managers have expressed frustration at the wide variety of the types of human errors, and the number of unique types of human errors, which appear almost impossible to anticipate.
Nevertheless, a common denominator is the fact that humans take actions based on their own mental model of how the UPS system behaves, and very often their understanding of the system is wrong. These human errors occur during operation of the UPS system, but they also occur during design and installation. Standardization, automation, and simplification are required to overcome these problems.
The solutions required for UPS manageability are extremely expensive to design, install, and test in uniquely engineered systems. This clearly suggests the need for pre-engineered, pre-tested, and standardized management tools.
In other words, after you address the electrical concerns, consider the optional hardware available and the features that the vendor's management software provides before you buy. Vendors include American Power Conversion (APC), Clary, Falcon Electric, Liebert (Emerson), MGE UPS Systems, ONEAC, OPTI-UPS, Powerware, Tripp Lite, and Tsi Power, among others. For example, if the UPS must protect a group of servers, the management software's ability to vary the amount of time it takes to close each server's applications and shut down when power fails might be an essential feature. You might also want the UPS's power monitoring software to alert the administrator and network users about an impending shutdown.
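To make the staggered-shutdown idea concrete, here is a minimal sketch of how such a schedule could be computed. It does not use any vendor's software or API; the function, the server names, and the timing values are all hypothetical.

```python
# Sketch of a staggered shutdown schedule. Not based on any vendor's API;
# all names and values here are hypothetical.

def shutdown_schedule(runtime_s, servers, margin_s=60):
    """Return (server, start_delay_s) pairs, earliest start first.

    runtime_s: estimated battery runtime in seconds after power fails.
    servers:   dict mapping server name to the seconds it needs to close
               its applications and shut down.
    margin_s:  safety margin kept before the battery is exhausted.
    """
    deadline = runtime_s - margin_s
    schedule = []
    for name, needed_s in servers.items():
        start = deadline - needed_s
        if start < 0:
            raise ValueError(f"{name} needs more time than the battery allows")
        schedule.append((name, start))
    # Servers that need the most time must begin shutting down first.
    return sorted(schedule, key=lambda item: item[1])

plan = shutdown_schedule(600, {"db": 300, "mail": 120, "file": 60})
for name, start in plan:
    print(f"{name}: begin shutdown {start} s after power failure")
```

Each server starts as late as it safely can, so short outages interrupt as few machines as possible while a long outage still leaves every server enough time to shut down cleanly.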
Once you have ascertained your power needs and determined whether you want a centralized UPS for all your servers, or a UPS for each one, you must determine whose UPS to buy. In attempting to answer this question, you are likely to discover the fundamental truth about the UPS business: it's pretty much a commodity market. The hardware from the various vendors offers a very similar set of features, and the pricing is tightly bunched at key price points.
In a true commodity market, this scenario would suggest that if you don't buy on the basis of intangibles (such as company reputation or availability of 24-hour support), then you should buy strictly on the basis of price. However, UPS vendors can be distinguished readily by the quality of the software they include with their products.
The software generally provides certain basic features: it monitors the UPS, it monitors and records the quality of the incoming power, and it performs an orderly shutdown of a workstation or server if the UPS itself has to be shut down (during an extended power loss, for example). The software runs on the supported machine and communicates with the UPS via a DB-9 cord plugged into the serial port. On NT machines, the software runs as a service. As with many services, it can be set to start at boot-up or be activated manually. The service communicates with the UPS and collects the data. Another application then can be run to view and interpret the data.
Interpretation of the data can take two forms: the data can be presented visually by gauges or other similar devices; or, the data can trigger alarms when various monitored variables exceed user-established thresholds. In the latter case, most software supports paging the systems administrator, sending e-mail, or sounding an alarm.
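The alarm form of interpretation amounts to comparing each monitored value against a user-set range. The following is a minimal sketch of that check; the variable names and limits are hypothetical examples, not values from any particular product.

```python
# Sketch of threshold-based alarms. The monitored names and limits below are
# hypothetical examples, not taken from any vendor's software.

def check_thresholds(reading, limits):
    """Return an alarm message for each monitored value outside its range.

    reading: dict mapping a monitored variable to its current value.
    limits:  dict mapping a monitored variable to a (low, high) range.
    """
    alarms = []
    for name, (low, high) in limits.items():
        value = reading.get(name)
        if value is None:
            continue  # this variable was not reported in the reading
        if value < low or value > high:
            alarms.append(f"{name}={value} outside {low}-{high}")
    return alarms

limits = {"input_voltage": (108, 132), "battery_percent": (20, 100)}
reading = {"input_voltage": 101, "battery_percent": 85}

for message in check_thresholds(reading, limits):
    print("ALARM:", message)
```

In a real deployment the returned messages would be handed to whatever notification path the software supports, such as a page, an e-mail, or an audible alarm.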
UPS Maintenance Or Serviceability
It's important to ask questions about the vendor's UPS warranty and on-site service or maintenance options. Remember that in a few years you'll have to replace the batteries, and replacement is quite labor-intensive if you plan to purchase numerous units. Securing on-site UPS service options with the vendor before you purchase might free your IT staff from that labor in the future. Make sure you explore all the possible warranty alternatives with the vendor, so that you receive the option that is most convenient for you.
Next Generation UPS Systems For SAN Hardware
Finally, there are a number of changes required from current SAN hardware design practices. Many of these changes will require changes in the technology and design of UPS equipment, and how it is specified. Integration of the components of the UPS subsystem must move away from the current practice of unique system designs, and toward pre-engineered and even pre-manufactured solutions. Such solutions would ideally be modular and standardized, expandable at will, and would ship complete--but in parts that would rapidly plug together on site. Standardization will facilitate the learning process. By spreading the cost of developing high performance management systems across large numbers of standardized installations, advanced UPS management would be affordable to all customers.
Summary And Conclusions
UPSs with unity power factor reduce the uncertainty about overloading UPS systems. SAN hardware UPS requirements are changing, but traditional UPS designs haven't.
Over the past several years, electrical requirements of SAN hardware have changed dramatically. Unfortunately for consulting engineers, SAN hardware users and facility engineers, many UPS system designs have not kept pace with these changes.
This isn't simply a matter of wasting the company's money. It can be very dangerous if users and consultants don't understand power factor, because there is a risk of overloading the UPS. A load is more likely to exceed the UPS's kW rating than its kVA rating. When this happens, the UPS will need to transfer to bypass, which puts the SAN hardware load at risk.
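The distinction between the two ratings follows from the relationship kW = kVA x power factor. Below is a minimal sketch of the check; the ratings and load values are hypothetical and chosen only to illustrate the case where the kW limit is reached first.

```python
# Sketch of a kW versus kVA load check. The ratings and load values are
# hypothetical and chosen only to illustrate the point.

def ups_load_check(load_kva, load_pf, ups_kva, ups_pf):
    """Compare a load against both the kVA and the kW rating of a UPS."""
    load_kw = load_kva * load_pf   # real power drawn by the load
    ups_kw = ups_kva * ups_pf      # real power the UPS can deliver
    return {
        "load_kw": load_kw,
        "ups_kw": ups_kw,
        "kva_ok": load_kva <= ups_kva,
        "kw_ok": load_kw <= ups_kw,
    }

# A 100 kVA UPS rated at 0.8 power factor delivers only 80 kW.
legacy = ups_load_check(load_kva=90, load_pf=0.95, ups_kva=100, ups_pf=0.8)

# The same load on a unity power factor UPS of the same kVA rating.
unity = ups_load_check(load_kva=90, load_pf=0.95, ups_kva=100, ups_pf=1.0)

print("0.8 PF UPS:", legacy)
print("Unity PF UPS:", unity)
```

In the first case the load sits comfortably under the kVA rating yet exceeds the kW rating, which is the overload scenario described above. With a unity power factor UPS the two ratings coincide, so a load within the kVA rating is also within the kW rating.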
Furthermore, SAN hardware is routinely oversized to three times its required UPS capacity. Over-sizing drives excessive capital and maintenance expenses, which are a substantial fraction of the overall lifecycle cost. Most of this excess cost can be recovered by implementing a method and architecture that can adapt to changing requirements in a cost-effective manner while at the same time providing high availability.
In addition, individual rack enclosure UPS consumption in SAN hardware varies widely and is expected to grow in the next few years. Rack enclosure equipment is replaced five (5) or more times during the life of SAN hardware in a piecemeal manner. This situation requires a rack enclosure UPS distribution system that can cope with the changing requirements. These requirements can lead to a practical, adaptable rack enclosure UPS architecture.
Finally, a systematic analysis of customer problems relating to SAN hardware UPS systems provides a clear statement of direction for next generation SAN hardware. The most pressing problems that are not solved by current design practices and equipment have the common theme of the inability of the SAN hardware to adapt to change. Next generation SAN hardware UPS systems must be more adaptable to changing requirements, in order to improve both availability and cost effectiveness.
John Vacca is an information technology consultant and internationally known author based in Pomeroy, Ohio. Since 1982, John has authored 39 books and more than 470 articles in the areas of advanced storage, computer security and aerospace technology. John was also a configuration management specialist, computer specialist, and the computer security official for NASA's space station program (Freedom) and the International Space Station Program, from 1988 until his early retirement from NASA in 1995. John was also one of the security consultants for the MGM movie titled "AntiTrust," which was released on January 12, 2001. John can be reached on the Internet at email@example.com.