Weigh the Factors In Your SAN-UPS Juggling Act

Effective next-generation enterprises must address the known powering needs
and problems of current and past storage area networks (SANs). Despite
revolutionary changes in SAN hardware and products, the design of the
Uninterruptible Power Supply (UPS) infrastructure for enterprises has
changed very little since 1965. Although SAN hardware has always required
electrical power, the way that SAN systems are deployed today has created
new power-related problems that were not foreseen when the powering
principles for the modern data center were developed over 30 years ago.

With the preceding in mind, Part II continues the UPS for SAN hardware
theme by discussing a categorized and prioritized collection of SAN
hardware powering needs and problems; avoiding costs from over-sizing a UPS
system; UPS lifecycle cost imperatives; UPS rack powering options; power monitoring
software; UPS adaptability/scalability, availability,
manageability, and serviceability imperatives; and, next generation UPS
systems for SAN hardware. Let’s look at SAN hardware powering needs and
problems first.

SAN Hardware Powering Needs And Problems

The recent surge in SAN usage has been accompanied by an equally large
demand for high-quality UPS power to feed the evolving SAN hardware
infrastructure. SAN hardware power consumption is now growing by hundreds
of megawatts per week, taxing the already stressed electrical grid. The
requirement for a continuous source of high-quality power forces critical
SAN hardware infrastructures to rely on the quality of their internal UPS
systems. The caliber of these systems is often a defining factor for users,
who increasingly feel the ill effects of power-related problems at a cost
of millions of dollars per incident.

As SANs continue to mature, their criticality and reliability will take on
new importance. UPS quality is the number one issue affecting SAN hardware
reliability today. The critical UPS requirements of typical SAN enterprises
are increasing almost exponentially, along with the consequences of a power
interruption.

The Enterprise UPS Imperatives

Because many critical enterprises require more power and higher levels of
quality and reliability, a UPS is necessary to ensure a stable power
environment. With the preceding in mind, let’s now take a look at the
following enterprise UPS core imperatives:

Lifecycle Costs.
Adaptability Or Scalability.
Availability.
Manageability.
Maintenance Or Serviceability.

UPS Lifecycle Costs

UPS lifecycle costs are the most important consideration among the power
requirements. The power requirement placed on an enterprise’s UPS is
usually planned to increase linearly from the actual startup requirement
and to reach the design power capacity halfway through the expected
lifecycle.
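
The planned ramp described above can be sketched as a simple piecewise
function. This is only an illustration, not a sizing tool: the function
name and the 30kW startup figure are hypothetical, while the 100kW design
capacity and 10-year lifetime are borrowed from the example used later in
this article.

```python
def planned_load_kw(year: float, lifetime_years: float,
                    startup_kw: float, design_kw: float) -> float:
    """Planned load at a given year of the lifecycle.

    The load ramps linearly from the startup requirement and reaches the
    design capacity halfway through the lifecycle, then stays flat.
    """
    if not 0 <= year <= lifetime_years:
        raise ValueError("year must lie within the lifecycle")
    ramp_end = lifetime_years / 2
    if year >= ramp_end:
        return design_kw
    return startup_kw + (design_kw - startup_kw) * (year / ramp_end)


# Hypothetical plan: 30kW at startup, 100kW design capacity, 10-year life.
for year in (0, 2.5, 5, 10):
    print(year, planned_load_kw(year, 10, 30, 100))
```

Comparing such a planned curve with the load actually measured over time
is one way to see how much of the installed capacity goes unused.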

Avoiding Costs From Over-Sizing A UPS System

The lifecycle costs associated with over-sizing a UPS system can be
separated into two parts: capital costs and operating costs.

Capital Costs

The excess capacity translates directly to excess capital costs. In
addition to the costs associated with the UPS system, excess capital costs
include infrastructure such as raised floors, as well as cooling system
infrastructure.

The capital cost of a typical 100kW enterprise installation is usually on
the order of $500,000, or $5 per watt. This analysis indicates that on the
order of 70%, or $350,000, of this investment is wasted. In the early
years, this waste is even greater. When the time-cost of money is figured
in, the typical loss due to over-sizing nearly equals 100% of the entire
capital cost of the SAN hardware! That is, the interest alone on the
original capital is almost capable of paying for the actual capital
requirement.
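
The capital figures above can be checked with a few lines of arithmetic.
The sketch below uses only the numbers quoted in the text ($500,000, 100kW,
70%); the function name is hypothetical. Note that $500,000 spread over
100kW works out to $5,000 per kilowatt, that is, $5 per watt.

```python
def oversizing_capital_waste(capital_cost_usd: float, capacity_kw: float,
                             wasted_percent: float) -> tuple[float, float]:
    """Return (cost per watt, wasted capital) for an oversized installation."""
    cost_per_watt = capital_cost_usd / (capacity_kw * 1000)  # kW -> W
    wasted_usd = capital_cost_usd * wasted_percent / 100
    return cost_per_watt, wasted_usd


cost_per_watt, wasted_usd = oversizing_capital_waste(500_000, 100, 70)
print(f"${cost_per_watt:.2f} per watt, ${wasted_usd:,.0f} wasted")
```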

Operating Costs

The excess lifecycle costs associated with over-sizing also include the
expenses of operating the facility. These costs include maintenance
contracts, consumables, and electricity. Maintenance costs are typically
slightly less than the capital cost over the lifetime of the SAN hardware,
when the equipment is maintained per the manufacturer’s instructions. Since
over-sizing gives rise to under-utilized equipment that must be maintained,
a large fraction of the maintenance cost is wasted. In the case of the
100kW SAN hardware example, this wasted cost is on the order of $250,000
over the system lifetime.

Excess electricity costs are significant when SAN hardware is oversized.
The idling loss of a SAN hardware UPS system is on the order of 4% of the
power rating. When cooling costs are factored in, this becomes 8%. For a
100kW SAN hardware installation oversized to typical values, with a
nameplate rating in excess of the design rating, the wasted electricity
over the 10-year system lifetime is on the order of 600,000 kWh, equating
to on the order of $30,000.
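
As a back-of-the-envelope check, the electricity figures above imply an
energy price and an average continuous loss. Neither derived value is
stated in the text; the sketch below simply divides the quoted numbers
(600,000 kWh, $30,000, 10 years), and the function name is hypothetical.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours, ignoring leap days


def implied_electricity_figures(wasted_kwh: float, wasted_usd: float,
                                lifetime_years: float) -> tuple[float, float]:
    """Return (implied price per kWh, implied average wasted power in kW)."""
    price_per_kwh = wasted_usd / wasted_kwh
    average_wasted_kw = wasted_kwh / (lifetime_years * HOURS_PER_YEAR)
    return price_per_kwh, average_wasted_kw


price, avg_kw = implied_electricity_figures(600_000, 30_000, 10)
print(f"${price:.2f} per kWh, about {avg_kw:.1f} kW lost continuously")
```

The quoted totals correspond to about $0.05 per kWh and a little under 7kW
of continuous loss across the system lifetime.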

The total excess costs over the lifetime of the SAN hardware will on
average be around 70% of the system cost. This represents an entitlement
that could theoretically be recovered if the SAN hardware infrastructure
could adapt and change to meet the actual requirement.

For many enterprises, the waste of capital and expense dollars becomes a
lost opportunity cost, which can be many times larger than the out-of-pocket
cost. For example, Internet hosting enterprises have failed when the
unutilized capital tied up in one installation prevented its deployment in
another opportunity.

Over-sizing follows from a few common planning assumptions. Increasing UPS
capacity partway through the SAN hardware lifecycle is very costly, and the
work involved creates a large and unacceptable risk of downtime.

All of the engineering and planning for the ultimate SAN hardware UPS
capacity must be done up-front. The load requirement of the SAN hardware
will increase, but this increase cannot be reliably predicted. The result
of the preceding assumptions is that SAN hardware is planned, engineered,
and built out up-front to meet an unknown need, and the UPS capacity of the
SAN hardware is planned conservatively, to the high side of any reasonable
growth scenario.

UPS Adaptability Or Scalability

The solution requirements for UPS adaptability or scalability have much in
common with the solution requirements for lifecycle costs. In particular,
pre-engineered, standardized, and modular solutions are needed.

Many issues of UPS adaptability or scalability relate to the architecture
of the UPS distribution system to the rack. This subject is discussed
briefly next.

UPS Rack Powering Options

As SAN hardware is changed, the UPS requirement, which also includes the
voltage requirement, the redundancy requirement, and the connector
requirement, is often changed as well. Also, as rack enclosures have become
the standard means for housing and organizing computing and communication
systems, the UPS distribution system for the rack enclosure must adapt to
these changing requirements.

The UPS requirements of modern computing equipment vary as a function of
time, depending on the computational load. In particular, the
implementation of power management technologies in processors, servers, and
nearly all SAN hardware has introduced a substantial variation in power
consumption in response to the computing load. This variation can be as
high as 200% of the baseline power consumption of the SAN hardware. The UPS
distribution system design for a rack enclosure must accommodate this
variation.

The most common approach today is to design, engineer, and install UPS
solutions specific to a rack enclosure. Should the requirements for that
rack enclosure change, an alternative UPS solution must be designed,
engineered, and installed. While this approach can accommodate any unique
UPS requirement, it involves significant planning and engineering. Rack
enclosures are usually fed from a common power distribution panel within
the enterprise. In most instances, this panel cannot be de-energized in
order to adapt a rack enclosure’s UPS distribution system (for example, to
install another breaker). The resulting “hot work” not only introduces a
very serious safety hazard, but also carries a high risk of creating a
fault in the circuit being worked on or of dislodging or faulting adjacent
wiring circuits. Such errors result in undesirable downtime.

Ideally, the rack enclosure UPS system would be adaptable to any
realistically possible combination of equipment, on demand, without the
need to perform any work that would be a hazard to safety or that might
adversely affect system availability.

In addition to the capability of the adaptable rack enclosure UPS system to
respond quickly and economically to change, there are cycle time and cost
advantages associated with the initial installation of the system,
including a dramatic simplification to the up-front engineering and
installation work associated with SAN hardware design. Furthermore, the
ability to adapt the rack enclosure UPS system can allow the system to be
“right sized” to the actual load requirement and grow with expanding needs.
The economic benefits of rightsizing can be well over 50% of the lifecycle
cost of SAN hardware as previously discussed.

UPS Availability

Human error is commonly the dominant problem relating to UPS availability.
Over 50% of all load drop events in SAN hardware are caused by human error.
IT managers have expressed frustration at the sheer number and variety of
unique human errors, which appear almost impossible to anticipate.

Nevertheless, a common denominator is the fact that humans take actions
based on their own mental model of how the UPS system behaves, and very
often their understanding of the system is wrong. These human errors occur
during operation of the UPS system, but they also occur during design and
installation. Standardization, automation, and simplification are required
to overcome these problems.

UPS Manageability

UPS manageability features are extremely expensive to design, install, and
test in uniquely engineered systems. This clearly suggests the need for
pre-engineered, pre-tested, and standardized management tools.

In other words, after you address the electrical concerns, consider the
optional hardware available and the features that the vendor’s management
software provides before you buy. Vendors include American Power Conversion
(APC), Clary, Falcon Electric, Liebert (Emerson), MGE UPS Systems, ONEAC,
OPTI-UPS, Powerware, Tripp Lite, and Tsi Power, among others. For example,
if the UPS must protect a group of servers, the management software’s
ability to vary the amount of time it takes to close each server’s
applications and shut down when power fails might be an essential feature.
You might also want the UPS’s power monitoring software to alert the
administrator and network users about an impending shutdown.
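
To make the staggered-shutdown idea concrete, here is a minimal sketch of
how per-server shutdown start times might be derived from the remaining
battery runtime. It does not reflect any vendor’s actual software: the
Server class, the function name, the server names, and the 60-second safety
margin are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    shutdown_seconds: int  # time needed to close applications and halt


def shutdown_start_times(servers, runtime_seconds, margin_seconds=60):
    """Map each server to the latest time (seconds after power failure)
    at which its shutdown must begin to finish before the battery is
    exhausted, leaving a safety margin."""
    deadline = runtime_seconds - margin_seconds
    schedule = {}
    for server in servers:
        start = deadline - server.shutdown_seconds
        if start < 0:
            raise ValueError(
                f"{server.name} cannot shut down within the battery runtime")
        schedule[server.name] = start
    return schedule


# Hypothetical group: a database needing 5 minutes, a web server needing 1.
group = [Server("db", 300), Server("web", 60)]
print(shutdown_start_times(group, runtime_seconds=600))
```

Servers that need longer to close their applications start earlier, while
quick-to-stop servers can stay up for most of the battery runtime.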

Power-Monitoring Software

Once you have ascertained your power needs and determined whether you want
a centralized UPS for all your servers, or a UPS for each one, you must
determine whose UPS to buy. In attempting to answer this question, you are
likely to discover the fundamental truth about the UPS business: it’s
pretty much a commodity market. The hardware from the various vendors
offers a very similar set of features, and the pricing is tightly bunched
at key price points.

In a true commodity market, this scenario would suggest that if you don’t
buy on the basis of intangibles (such as company reputation or availability
of 24-hour support), then you should buy strictly on the basis of price.
However, UPS vendors can be distinguished readily by the quality of the
software they include with their products.

The software generally provides certain basic features: it monitors the
UPS, it monitors and records the quality of the incoming power, and it
performs an orderly shutdown of a workstation or server if the UPS itself
has to be shut down (during an extended power loss, for example). The
software runs on the supported machine and communicates with the UPS via a
DB-9 cord plugged into the serial port. On NT machines, the software runs
as a service. As with many services, it can be set to start at boot-up or
be activated manually. The service communicates with the UPS and collects
the data. Another application then can be run to view and interpret the
data.

Interpretation of the data can take two forms: the data can be presented
visually by gauges or other similar devices; or, the data can trigger
alarms when various monitored variables exceed user-established thresholds.
In the latter case, most software supports paging the systems
administrator, sending e-mail, or sounding an alarm.
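
The threshold-based alarm behavior described above can be sketched as
follows. This is an illustration only, not any vendor’s implementation: the
function name, the monitored variable names, and the limits are
hypothetical, and the notification step (page, e-mail, or audible alarm) is
left to the caller.

```python
def check_thresholds(readings, limits):
    """Return an alarm message for each reading outside its (low, high)
    user-established limits. Readings without limits are ignored."""
    alarms = []
    for name, value in readings.items():
        if name not in limits:
            continue
        low, high = limits[name]
        if value < low or value > high:
            alarms.append(f"{name}={value} outside [{low}, {high}]")
    return alarms


# Hypothetical readings and limits.
readings = {"input_voltage": 98.0, "battery_charge_pct": 85.0,
            "load_pct": 40.0}
limits = {"input_voltage": (105, 130), "battery_charge_pct": (20, 100),
          "load_pct": (0, 90)}
for message in check_thresholds(readings, limits):
    print("ALARM:", message)
```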

UPS Maintenance Or Serviceability

It’s important to ask questions about the vendor’s UPS warranty and on-site
service or maintenance options. Remember that in a few years, you’ll have
to replace the batteries, and replacement is quite labor-intensive if you
plan to purchase numerous units. Securing on-site UPS service options with
the vendor before you purchase might free your IT staff from that labor in
the future. Make sure you explore all the possible warranty alternatives
with the vendor, so that you receive the option that is most convenient for
you.

Next Generation UPS Systems For SAN Hardware

Finally, current SAN hardware design practices need to change in a number
of ways. Many of these changes will require changes in the technology and
design of UPS equipment, and in how it is specified. Integration of the
components of the UPS subsystem must move away from the current practice of
unique system designs, and toward pre-engineered and even pre-manufactured
solutions. Such solutions would ideally be modular and standardized,
expandable at will, and would ship complete, but in parts that would
rapidly plug together on site. Standardization will facilitate the learning
process. By spreading the cost of developing high performance management
systems across large numbers of standardized installations, advanced UPS
management would be affordable to all customers.

Summary And Conclusions

UPSs with unity power factor reduce the uncertainty of overloading UPS
systems. SAN hardware UPS requirements are changing; traditional UPS
designs have not.

Over the past several years, electrical requirements of SAN hardware have
changed dramatically. Unfortunately for consulting engineers, SAN hardware
users and facility engineers, many UPS system designs have not kept pace
with these changes.

This isn’t a simple matter of wasting the company’s money. It can be very
dangerous if users and consultants don’t understand power factor, because
there is a risk of overloading the UPS. A load is more likely to exceed the
kW rating than the kVA rating. When this happens, the UPS will need to
transfer to bypass, which puts the SAN hardware load at risk.
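
The kW-versus-kVA point can be illustrated with a short check. Real power
in kW equals apparent power in kVA multiplied by the power factor; the
ratings and load values below are hypothetical, as is the function name.

```python
def ups_overload(load_kva, load_power_factor, rated_kva, rated_kw):
    """Compare a load against both the kVA and the kW rating of a UPS."""
    load_kw = load_kva * load_power_factor  # real power = kVA x power factor
    return {
        "load_kw": load_kw,
        "exceeds_kva": load_kva > rated_kva,
        "exceeds_kw": load_kw > rated_kw,
    }


# Hypothetical UPS rated 100kVA / 80kW feeding a 90kVA load at 0.95 power
# factor: the load is within the kVA rating but above the kW rating.
print(ups_overload(90, 0.95, rated_kva=100, rated_kw=80))

# The same load on a hypothetical unity power factor UPS (100kVA / 100kW)
# stays within both ratings.
print(ups_overload(90, 0.95, rated_kva=100, rated_kw=100))
```

When the kW and kVA ratings are equal, as with a unity power factor UPS, a
load that fits within the kVA rating cannot exceed the kW rating.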

Furthermore, SAN hardware is routinely oversized to three times its
required UPS capacity. Over-sizing drives excessive capital and maintenance
expenses, which are a substantial fraction of the overall lifecycle cost.
Most of this excess cost can be recovered by implementing a method and
architecture that can adapt to changing requirements in a cost-effective
manner while at the same time providing high availability.

In addition, individual rack enclosure UPS consumption in SAN hardware
varies widely and is expected to grow in the next few years. Rack enclosure
equipment is replaced five or more times during the life of SAN hardware,
in a piecemeal manner. This situation requires a rack enclosure UPS
distribution system that can cope with the changing requirements. These
requirements point toward a practical, adaptable rack enclosure UPS
architecture.

Finally, a systematic analysis of customer problems relating to SAN
hardware UPS systems provides a clear statement of direction for next
generation SAN hardware. The most pressing problems that are not solved by
current design practices and equipment have the common theme of the
inability of the SAN hardware to adapt to change. Next generation SAN
hardware UPS systems must be more adaptable to changing requirements, in
order to improve both availability and cost effectiveness.

John Vacca is an information technology consultant and internationally
known author based in Pomeroy, Ohio. Since 1982, John has authored 39 books
and more than 470 articles in the areas of advanced storage, computer
security and aerospace technology. John was also a configuration management
specialist, computer specialist, and the computer security official for
NASA’s space station program (Freedom) and the International Space Station
Program, from 1988 until his early retirement from NASA in 1995. John was
also one of the security consultants for the MGM movie titled
“AntiTrust,” which was released on January 12, 2001. John can be reached on
the Internet at jvacca@hti.net.
