Much has changed in the decade that Enterprise Storage Forum has been covering data storage technology, so in the next couple of articles, I’m going to offer an overview of the basics of storage networking to update a series that first ran nearly eight years ago (see Network Storage – The Basics).
Network storage is not all that different from networked systems: it is the ability to provide storage services over a network. Just as you can have as few as two machines hooked together, you can have a single server connected to a single storage device. And just as you can have hundreds or thousands of machines connected together, you can have hundreds or thousands of servers connected to hundreds or even thousands of storage systems, either locally or over wide area networks (WANs).
Modern networked storage has evolved from the mid-1990s, when storage was connected through Fibre Channel hubs using Fibre Channel arbitrated loop (FC-AL), to today, when Fibre Channel fabrics, iSCSI over 1GbE or 10GbE, network-attached storage (NAS), InfiniBand and FCoE can all be used for storage networking. I believe that over the next few years we are going to see Fibre Channel fade away as a technology in favor of Ethernet (see Falling 10GbE Prices Spell Doom for Fibre Channel).
Storage networking grew out of a new set of requirements for UNIX systems in the 1990s. Back then, storage was still relatively expensive compared to servers, and breaking the storage-in-a-server model (DAS) so that a pool of storage could be shared by a group of servers was important, as was reliability. Each server was then allocated its own storage from a large pool of RAID storage. Among the developments that made this possible were RAID controllers and, of course, Fibre Channel.
The obvious next step was sharing data as opposed to sharing storage. This data sharing requirement started to appear in the late 1990s with the introduction of shared file systems from a number of vendors. At first these were developed for specialized application requirements on a small number of servers, but by the middle part of this decade these file systems had become fairly general purpose and could support hundreds of servers. Sharing data today is as common as sharing storage was just five to seven years ago.
Storage Networking Definitions
There are some basic definitions that everyone should understand. Here are some of the most important, courtesy of SNIA, with some added commentary.
Direct Attached Storage (DAS)
A storage system with one or more dedicated storage devices that connect directly to one or more servers. Basically, you have a server directly connected to storage without going through a switch. The connection is point-to-point, with the cable going from the server directly to the storage.
Network Attached Storage (NAS)
1. As a storage system, a term used to refer to storage elements that connect to a network and provide file access services to computer systems. These elements generally consist of an engine that implements the file services and one or more devices on which data is stored.
2. As a network, a class of systems that provide file services to host computers using file access protocols such as NFS or CIFS. See storage area network below.
I believe NAS devices and NAS-based storage will become more common than they are today. Most NAS systems do not scale to the size and performance that SAN systems do, largely because the NFS and CIFS communication protocols were not designed to stream data at high rates. A new version of NFS called pNFS will likely become commonly available next year; we will touch on it more in the next article.
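To make the file-versus-block distinction concrete, here is a minimal Python sketch. The share and device paths are hypothetical, assumed only for illustration: a NAS client reaches its data through ordinary file calls against a mounted NFS or CIFS share, while a SAN LUN shows up to the host as a raw block device that the host's own file system (or application) must manage.

```python
# Minimal sketch of file-level (NAS) versus block-level (SAN) access.
# The paths below are hypothetical examples, not real configuration.

NAS_FILE = "/mnt/nfs_share/report.dat"   # file on a mounted NFS/CIFS share
SAN_LUN = "/dev/sdb"                     # SAN LUN presented as a block device

def read_from_nas(path: str) -> bytes:
    # The NAS engine owns the file system; the client just opens a path.
    with open(path, "rb") as f:
        return f.read()

def read_block_from_san(device: str, block: int, block_size: int = 512) -> bytes:
    # On a SAN LUN the host addresses raw blocks (or layers its own file system on top).
    with open(device, "rb") as dev:
        dev.seek(block * block_size)
        return dev.read(block_size)
```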
Fibre Channel over Ethernet (FCoE)
A technology that encapsulates Fibre Channel frames in Ethernet frames, allowing FC traffic to be transported over Ethernet networks, which have recently become cheaper than Fibre Channel.
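As a rough illustration of what the encapsulation means, the Python sketch below wraps an already-built Fibre Channel frame in a bare Ethernet frame carrying the FCoE EtherType (0x8906). It is deliberately simplified: a real FCoE frame also includes a version field, SOF/EOF delimiters and padding, which are omitted here.

```python
import struct

FCOE_ETHERTYPE = 0x8906  # EtherType assigned to FCoE

def encapsulate_fc_frame(dst_mac: bytes, src_mac: bytes, fc_frame: bytes) -> bytes:
    # Ethernet header: destination MAC, source MAC, EtherType, then the FC frame
    # as the payload. A real FCoE frame adds version, SOF/EOF delimiters and padding.
    eth_header = dst_mac + src_mac + struct.pack("!H", FCOE_ETHERTYPE)
    return eth_header + fc_frame
```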
Storage Area Network (SAN)
A storage area network built on Fibre Channel or iSCSI, a definition that will soon include FCoE as well.
SANs use block addressing, an algorithm for uniquely identifying blocks of data stored on disk or tape media by number and then translating these numbers into physical locations on the media.
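As a sketch of that translation step, the classic logical block address (LBA) to cylinder/head/sector mapping looks like the Python below. The geometry values are illustrative assumptions; real devices report their own geometry, and modern drives perform this mapping internally.

```python
def lba_to_chs(lba: int, heads_per_cylinder: int = 16, sectors_per_track: int = 63):
    # Translate a block number into a physical (cylinder, head, sector) location.
    cylinder = lba // (heads_per_cylinder * sectors_per_track)
    head = (lba // sectors_per_track) % heads_per_cylinder
    sector = (lba % sectors_per_track) + 1  # sectors are traditionally numbered from 1
    return cylinder, head, sector

print(lba_to_chs(2048))  # block 2048 -> (2, 0, 33) with the assumed geometry
```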
Small Computer System Interface (SCSI)
SCSI is a collection of ANSI standards and proposed standards that define I/O buses primarily intended for connecting storage subsystems or devices to hosts through host bus adapters (HBAs).
Serial Advanced Technology Attachment (SATA)
SATA is a version of the ATA interface that uses a serial connection architecture.
Serial Attached SCSI (SAS)
A SCSI interface standard that provides for attaching HBAs and RAID controllers to both SAS and SATA disk and tape drives, as well as other SAS devices.
INCITS Technical Committee T10 is responsible for the national (ANSI) and international (ISO) standards for SAS.
Changing Storage Protocols
Until recently, the host side created a SCSI packet with embedded data or commands and sent it to the FC HBA, which took the SCSI packet and sent it to the RAID controller using the Fibre Channel protocol. The RAID controller then wrote to the Fibre Channel disks, and the disk drives took the SCSI packet and translated it into reads and writes on the drive media.
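To make "the host side creates a SCSI packet" a little more concrete, here is a minimal Python sketch of building a SCSI READ(10) command descriptor block (CDB), the ten-byte command a host would hand to its HBA. The HBA or initiator then encapsulates it in Fibre Channel, iSCSI/TCP or FCoE; that transport step is not shown, and the function only illustrates the layout rather than performing any I/O.

```python
import struct

def build_read10_cdb(lba: int, num_blocks: int) -> bytes:
    # SCSI READ(10) CDB layout: opcode, flags, 32-bit LBA, group number,
    # 16-bit transfer length (in blocks), control byte -- 10 bytes in all.
    return struct.pack(
        "!BBIBHB",
        0x28,         # READ(10) opcode
        0x00,         # flags (protection/DPO/FUA bits cleared)
        lba,          # logical block address, big-endian
        0x00,         # group number
        num_blocks,   # transfer length in blocks
        0x00,         # control byte
    )

print(build_read10_cdb(lba=2048, num_blocks=8).hex())
```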
Today, the data protocol to the controller is still SCSI, but the encapsulation could be TCP/IP and Ethernet, InfiniBand or FCoE. Once the data gets to the controller, things are likely to change. The latest RAID controllers, and likely all controllers for the foreseeable future, will take the front-end SCSI protocol and low-level hardware interface and then use the SAS protocol to the disk trays; then, depending on the disk type (SAS or SATA), they will use the appropriate command set for the disk drive. SCSI is a subset of SAS, and SCSI is a superset of SATA. The extra commands that SAS has and SCSI does not are not important for the transport, but they are important for drive management of power and errors, so the host side does not need them; only the RAID controller or SAS HBA does. Below is the SCSI/SAS standard overview from the T10 Web site.
[Chart: T10 SCSI Architecture]
Obviously, it’s not the easiest chart to follow unless you have spent a fair amount of time in the industry and know all the jargon and understand many of the low-level protocol issues. The T13 group, which handles SATA interfaces and commands, has a similar framework.
Hopefully this article has given you an overview of the basics of storage networking, along with an idea of the changes that are happening now and that I think will happen over the next decade. The storage stack is complex because it really has not changed much over the last 20 years, even as there have been huge changes in our industry. The move from Fibre Channel and other connectivity options to 10GbE is designed to reduce complexity and break the model of keeping network administration separate from storage administration, both from a hardware perspective and from a personnel perspective. If you are a network administrator, I would strongly recommend learning something about storage, and if you are a SAN administrator, I would strongly recommend learning something about networking. In the next article in this series, I will look at what SANs look like today, cover some of the configuration and administration issues, consider what SANs will look like tomorrow, and offer some thoughts on how to move from today's world to the one we will face tomorrow.
Henry Newman, CTO of Instrumental Inc. and a regular Enterprise Storage Forum contributor, is an industry consultant with 28 years of experience in high-performance computing and storage.
See more articles by Henry Newman.