Planning and evaluating storage options is an ongoing challenge. A common IT maxim is that, no matter how much space is available, data will always grow to fill it. Applications like e-commerce are placing increasing demands on the availability, capacity and speed of storage technologies. In the past, the solution for increasing storage capacity was simply to add more storage devices. However, just adding more disk space using traditional technology can create bottlenecks that slow performance. In the 21st century, there are other options available.
Even though today's savvy IT managers can speak in acronyms, storage technologies can present a challenge. Competing technologies, such as DAS, NAS, and SAN using SCSI in a JBOD or RAID configuration, provide not only technical solutions but also a range of implementation and management considerations. Acronyms aside, all of these technologies address a common goal: the need for ever more, and ever faster, data storage. Beyond the technologies, "standards" evolve as each manufacturer provides extensions that may or may not lock IT managers into a single vendor. For those responsible for making storage technology purchasing decisions, it can be very confusing, and a bad decision can cost money and time and impact the performance of the entire network.
Detailed planning of storage enhancements gives IT management a chance to study the technologies and determine what provides the right scalability and flexibility for their network. However, such planning cannot take place without an understanding of the advantages and disadvantages of the underlying technologies. The planning process is made more difficult because even though standards do exist, they change constantly. In the network storage arena, for example, the National Committee for Information Technology Standards (NCITS), the Internet Engineering Task Force (IETF), the Storage Networking Industry Association (SNIA), and the Fibre Channel Industry Association (FCIA) all contribute to the standards. In addition, several vendors now offer certification and accreditation programs that let buyers identify compatible products and so avoid compatibility issues.
These accreditation programs provide the information on which decisions can be made. Unfortunately, the storage technology market is changing so quickly that it can be difficult to keep abreast of the competing claims, certifications and accreditations. This article is the first in a series that will look at how accreditation programs like those available from Brocade, IBM, and EMC work, and at how they can be valuable in helping IT managers make the right decisions when it comes to network storage purchasing. Before we start looking at the specific programs, however, a historical view of the storage arena and the related technologies will make it easier to understand how these certification programs came into being.
Storage System Standards
Back in the days of mainframe computing, disk subsystems required proprietary interfaces. Some vendors offered replacement disks and disk controllers, but the mainframe vendor set the standards. With the advent of microcomputers, however, the standards that define the interface between the PC and the hard drive have become more open. Today there are only two choices for storage system standards: AT Attachment (ATA), also known as Integrated Drive Electronics (IDE), and the Small Computer Systems Interface (SCSI). Of the two, for network storage applications, there is really only one choice, which is SCSI. However, to understand why SCSI is the preferred of the two, a look at ATA is warranted.
The AT Attachment standard was first ratified in 1994 as ATA-1, and there have since been five additional standards. Of these, only the two most recent are likely to be seen in a modern computer system. Unlike previous storage technologies that required a separate controller, IDE devices have the controller built right onto the drive. The use of ATA standards with ATA/IDE drives all but eliminates cross-vendor compatibility issues between drives.
Originally, the IDE/ATA standards specified a way to connect only disk drives to the microcomputer bus, but with the popularity of CD-ROM and the introduction of tape devices for PCs, the standard added a packet interface and data block handling to become the IDE/ATAPI standard. This functionality was introduced with the fourth ATA standard, ATA/ATAPI-4 (also known as ATA-33).
In spite of the compatibility advantages of IDE devices, and their relatively low cost, IDE is not the preferred technology in network storage solutions, and for good reasons. The first of these is speed - each of the ATA standards has operated at lower speeds than its SCSI contemporaries. Next on the negative list is the number of devices that can be supported. Unlike wide SCSI configurations, which can support up to 15 storage devices per bus, current IDE standards can only accept four. Lastly, there is the lack of support for external devices, which means that IDE is only suitable for DAS solutions.
SCSI offers distinct advantages over IDE in terms of both speed and flexibility. The SCSI standards, of which there are many, define driver software, commands, management functions, and physical connections to allow a series of peripherals to work together. Although these devices must still be located fairly close to the host device (no more than 25 meters), there is support for 7 or 15 storage devices per bus depending on the standard, and for both internal and external devices. A SCSI subsystem can easily be expanded by adding more devices. For these reasons and others, SCSI is the most popular storage standard.
Like other technologies, SCSI has evolved to support higher-capacity disks and higher transfer rates. Unfortunately, this creates multiple levels of specifications. In the case of SCSI, the result is an almost bewildering array of standards, including SCSI-1, SCSI-2, and the Fast, Wide, Ultra, Ultra2, and Ultra160 variants.
In addition, a SCSI specification that uses IP addressing to transfer data, known as iSCSI, is generating a great deal of interest. Although it shows promise and has been highly touted in the technical press, it remains a new implementation. As such, it will undergo several revisions before the "standard" becomes consistent.
Having now discussed the storage technologies used, we can examine how they are implemented. The most common storage configuration is simply that of having disks connected to a server. Each disk is effectively a stand-alone unit holding different data from any other connected disk. This configuration is often referred to as Just a Bunch of Disks (JBOD). Although it is a well-known approach, JBOD creates a single point of failure. If a disk crashes, all of the data on that drive will be unavailable. There are few environments where such an occurrence is acceptable. Effectively managing storage is about more than providing space for users; it's about providing a responsive system that is also secure and reliable. JBOD systems provide few tools or options that allow responsiveness or reliability to be improved. With JBOD you can configure servers with faster and larger disks, but the underlying problems of throughput and the single point of failure remain.
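The exposure compounds quickly as disks are added, which a toy calculation makes plain. The 2% annual failure rate below is an assumed, illustrative figure, not a vendor specification:

```python
# Toy illustration: if each disk in a JBOD independently has probability
# `p` of failing in a given year, the chance that at least one disk
# fails (taking its data with it) grows quickly with the disk count.
def any_failure(n_disks, p=0.02):
    """Probability that at least one of n independent disks fails."""
    return 1 - (1 - p) ** n_disks

for n in (1, 8, 24):
    print(n, round(any_failure(n), 3))
```

With one disk the exposure is 2 percent; spread the same assumption across 24 independent disks and the chance of losing some data in a year approaches 40 percent.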
To address the potential of a hard disk failure causing a loss of data, back in 1988, researchers at the University of California at Berkeley created a system called the Redundant Array of Inexpensive Disks (RAID), now more commonly rendered as Independent Disks, as an alternative to the single drive approach, which is referred to as a Single Large Expensive Disk (SLED). RAID combines multiple disk units into a single logical drive. This can increase performance by writing data onto several disks at the same time, though actual performance improvements depend on the RAID standard being used and the underlying storage technology. There are various RAID standards, referred to as levels, ranging from Level 0, which provides increased performance but no fault tolerance, to combination RAID levels which take the advantages of certain RAID levels and combine them with others to eliminate disadvantages. Here are the definitions of the RAID levels:
RAID Level 0 - Disk striping; data is spread across multiple disks for performance, with no fault tolerance.
RAID Level 1 - Disk mirroring; all data is duplicated on a second disk.
RAID Level 2 - Bit-level striping with Hamming-code error correction; rarely implemented.
RAID Level 3 - Byte-level striping with a dedicated parity disk.
RAID Level 4 - Block-level striping with a dedicated parity disk.
RAID Level 5 - Block-level striping with parity distributed across all disks in the array.
There are other RAID level combinations that are used, though these are the result of ingenuity and improvisation on the part of technicians rather than any ratified standards.
Depending on the configuration, RAID offers performance and fault-tolerance advantages, but it only addresses fault tolerance at the disk level; the equipment the array is attached to remains a single point of failure.
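The parity mechanism behind RAID's disk-level fault tolerance can be sketched in a few lines: a parity block is simply the XOR of the data blocks, so any one lost block can be rebuilt from the survivors. The disk contents below are, of course, stand-ins:

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, group) for group in zip(*blocks))

# Stand-in data blocks striped across three data disks.
data_disks = [b"AAAA", b"BBBB", b"CCCC"]

# A dedicated (RAID 3/4) or distributed (RAID 5) parity block is
# simply the XOR of the data blocks.
parity = xor_blocks(data_disks)

# Simulate losing disk 1: XOR-ing the surviving data blocks with the
# parity block reconstructs the missing data exactly.
rebuilt = xor_blocks([data_disks[0], data_disks[2], parity])
assert rebuilt == data_disks[1]
```

This is why a parity-based array survives the loss of any single disk, at the cost of one disk's worth of capacity.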
Whether JBOD or RAID, the next part of the storage puzzle is the way in which the storage subsystem is made available to clients. That can happen in one of three ways: DAS, NAS, or SAN.
Direct Attached Storage (DAS)
Traditionally, all mass storage connected directly to the server, hence the term Direct Attached Storage (DAS). DAS configurations can be either a JBOD (Just a Bunch of Disks) or a Redundant Array of Independent Disks (RAID). The key aspect of DAS is that the disk subsystem is either in the same physical box as the system processor, or in an external box connected directly to the system processor unit. DAS satisfies some applications, but because access to the disks must occur through the server processor, each disk access takes valuable processing power away from the server. Further, each user must access the storage through the single channel that attaches the disk(s) to the server. This means that simultaneous disk accesses must be queued, and users can be left waiting. Enhanced caching systems alleviate some of these delays, but network performance lags when the data stores get large or when read and write requests take a long time to complete.
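A toy first-in, first-out model illustrates why that single channel becomes a bottleneck. The request count, the 10ms service time, and the channel counts are arbitrary illustrative numbers:

```python
# Toy FIFO model of disk requests sharing one or more channels.
# Service time and request counts are arbitrary illustrative numbers.
def total_wait(num_requests, service_ms, channels=1):
    """Sum of request completion times, serving FIFO across channels."""
    busy_until = [0] * channels
    total = 0
    for _ in range(num_requests):
        ch = busy_until.index(min(busy_until))  # next channel to free up
        busy_until[ch] += service_ms
        total += busy_until[ch]
    return total

print(total_wait(8, 10, channels=1))  # 360 ms: requests stack up
print(total_wait(8, 10, channels=4))  # 120 ms: load is spread out
```

With one channel, every request waits behind all of its predecessors; adding independent paths to the data cuts the aggregate wait sharply, which is exactly the pressure that pushes sites beyond DAS.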
Network Attached Storage (NAS)
Network Attached Storage (NAS) represents a more recent advance in network storage, allowing devices to be connected directly to the network rather than to a server, as with DAS. NAS devices are typically self-contained, with an embedded OS providing the functionality to clients. Each NAS device is assigned an IP address, which clients can then use to access it directly, or through a server that acts as a gateway.
While NAS provides a high level of flexibility and availability, it only operates as well as the network and IP allow. Some applications that transfer large blocks of data can slow down the network, and there remains a single path in and out of the disk subsystem. Unlike DAS, however, NAS can be located anywhere on the network. In addition, data can be distributed across separate NAS devices placed close to the users who need it, avoiding traffic bottlenecks. Finally, NAS allows multiple operating platforms to access a single drive.
Although there are distinct advantages for NAS over DAS, particularly in terms of flexibility, there are additional concerns. If data is spread across multiple NAS drives, managers need to ensure that the information remains synchronized and reliable. In addition, as is true with most IP-based services, security becomes a concern. Placing storage media on a network potentially exposes that data to anyone who can access the network, and so requires an additional level of data encryption and security.
Storage Area Network (SAN)
Storage Area Networks (SANs) are highly touted as the solution to increasing demands for reliable and flexible network storage solutions. A SAN is effectively a mini network created between storage devices that allows the devices to communicate with each other without using the standard network infrastructure. In addition, SANs make it possible for multiple server systems to access the same storage devices, allowing the centralization of data.
In a SAN, disks attach to bridges and controllers that create dynamic links between the data disk and the server or workstation. Generally called Fibre Channel subsystems, the physical connections can be copper or fiber optic wiring.
The use of a bridge or switch to connect the SAN offers several benefits. In essence, any device can attach to a disk and operate as though it were on a private network. This lessens the impact that heavy transactions and backup activities can have on network performance. Data transmission rates are 100MB to 200MB per second, and most expect these speeds to keep climbing as faster Fibre Channel links arrive. In addition, SANs can be easily expanded.
Fibre Channel implements a layered architecture that controls protocols and protocol translation. Fibre Channel SAN standards exist, and most Fibre Channel vendors claim to want open systems. However, interoperability remains elusive. The current specification focuses on transporting data and the protocols associated with that function. As a result, each vendor can (and does) implement a different management and control system. Further, vendors frequently extend their products, and this can prevent customers from mixing products from separate vendors on the same Fibre Channel network.
What Runs With What?
In response to concerns over product and technology compatibility, several leading vendors support and participate in the accreditation programs discussed earlier. Under these programs, vendors can submit products for testing to determine compatibility with other vendors' products.
Competing vendors also develop proprietary standards that compete directly with the "standard" technologies. IBM, for example, developed SSA (Serial Storage Architecture), and several manufacturers offer products for that specification. IT managers face the difficult prospect of studying the competing standards and separating fact from fiction. In addition, new technologies will continue to enter the market to challenge established products.
No wonder IT managers scratch their heads.