This month I'm going to review some personal experiences with the installation, configuration, debugging, and operation of heterogeneous hardware and software. Of course, because of non-disclosure agreements (NDAs) in place with various organizations, the company names will remain anonymous to protect both the guilty and innocent...and me.
Over the last few years I have been involved with multiple shared file system designs and installations involving Fibre Channel, HBAs, RAID, tapes, and switches. You would think that with all the interoperability testing going on today, everything would be "plug and play," as with Windows XP. In most cases, it works about as well as Windows 2000 plug and play and, in some cases, as well as Windows 98 (sort-of kind-of, with a number of gotchas).
Some people start off with an idea for an architecture based on a few PowerPoint slides from a vendor or two and add some of their own ideas based on internal requirements. Based on this, an architecture is defined, often with software and hardware from different vendors. PowerPoint engineering, as this type of engineering is often called, sometimes works and sometimes does not work very well. It then becomes someone's job to make the grandiose ideas work.
Not that I haven't created systems using this method. I call it the "Chinese Menu" method, where you get one or more choices per column. The difference is that I never expect these systems to work without a great deal of hard work, and we always plan extensive integration time to ensure we can get them to operate as expected. And sometimes even the best architectures cannot be made to work without changes from the hardware or software vendor.
Step One: Hardware
In complex architectures, picking hardware that works together is not always as easy as it seems. Servers, HBAs, FC switches, RAIDs, and tapes each come with a matrix of which other vendors' products the vendor certifies as interoperable.
You might have a server that works with a certain HBA, and that HBA is certified with a specific FC switch and RAID, but your tape drives are certified with a different FC switch and/or different HBA. Fibre Channel tapes generally add more complexity to choosing hardware, given that tape error recovery is different and more complicated than RAID or disk error recovery. SCSI-connected tapes add additional issues if they are connected via a Fibre Channel-to-SCSI converter.
Finding error issues with HBAs is never any fun at all. The meanings of the errors that are tracked are often not well defined. Some error messages are sent to system logs, but others are often tracked only within the HBA driver. Sometimes a GUI can be used to browse these error conditions. I have seen driver configuration files used to set the level of error tracking within the HBA driver and to direct the output to either the system log or the HBA GUI.
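For the errors that do make it into the system log, even a simple scan for known trouble patterns can save time. The sketch below is a minimal, hypothetical example: the sample log lines and the error patterns are invented for illustration, since real HBA driver messages vary by vendor and by the driver's configured error-tracking level.

```python
import re

# Hypothetical syslog excerpt; real HBA driver messages differ by vendor
# and driver settings (loop-down, link-reset, SCSI timeout events, etc.).
SAMPLE_LOG = """\
Mar 12 04:01:17 srv1 kernel: hba0: LOOP DOWN detected on port 0
Mar 12 04:01:19 srv1 kernel: scsi(0:0:1): SCSI timeout, resetting device
Mar 12 04:02:03 srv1 sshd[812]: Accepted publickey for admin
Mar 12 04:02:44 srv1 kernel: hba0: LIP reset occurred on port 0
"""

# Patterns worth flagging; adjust for your driver's actual wording.
HBA_ERROR_PATTERNS = [r"LOOP DOWN", r"LIP reset", r"SCSI timeout"]

def hba_errors(log_text):
    """Return the log lines that match any known HBA error pattern."""
    combined = re.compile("|".join(HBA_ERROR_PATTERNS))
    return [line for line in log_text.splitlines() if combined.search(line)]

for line in hba_errors(SAMPLE_LOG):
    print(line)
```

The point is not the script itself but the habit: know which strings your driver emits for link and device errors before you need them at 3 a.m.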
The number of FC switch vendors with large port counts is relatively small. These switches are often called director-class switches. Brocade, Inrange, McDATA, and now Cisco are the only vendors I am aware of that currently support over 64 ports. Most of these switches support most of the HBAs and most of the RAID devices.
Tapes are an issue for some switches, but the key word of caution is the word “most.” You will need to look at the switch vendor’s interoperability matrix for the switch, HBA, RAID, and tape devices to find out what works with what. Then comes the really fun issue of ensuring compatibility with driver and firmware releases for the HBAs, tapes, and RAIDs. Just because it worked with driver and firmware release XYZ does not mean the switch vendor will support a different driver and firmware release that might be required by a RAID or tape vendor.
Tracking switch errors is even more fun than tracking HBA errors. In most cases, to get the level of detail required for debugging, you need to connect to the switch via a vendor-supplied GUI. Some vendors provide SNMP export, but to get the really detailed information it is necessary to log in to the switch with the GUI. Issues such as CRC errors and low-level Fibre Channel errors must be monitored.
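Whatever interface you use to pull the counters, what matters for debugging is not the absolute error count but whether it is still climbing. The sketch below compares two samples of per-port CRC counters taken some time apart; the port names and counter values are invented for illustration, and real per-vendor MIB objects or GUI counter names will differ.

```python
# Two samples of per-port CRC error counters, e.g. collected a minute
# apart from the switch's SNMP export or GUI. Names and values here are
# made up; the actual counters depend on the switch vendor.
sample_t0 = {"port1": 12, "port2": 0, "port3": 4507}
sample_t1 = {"port1": 12, "port2": 0, "port3": 4986}

def rising_counters(before, after, threshold=0):
    """Return ports whose error counters grew by more than `threshold`."""
    return {port: after[port] - before[port]
            for port in before
            if after[port] - before[port] > threshold}

# A large historical count on a quiet port is old news; a counter that is
# still incrementing points at a live cabling, SFP, or device problem.
print(rising_counters(sample_t0, sample_t1))
```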
The number of RAID vendors continues to grow, and with that comes interoperability issues. Some vendors have a huge investment in large interoperability labs that test operating systems, servers, HBAs, and switches. Most of the major vendors support a large matrix of the above hardware. Of course, Linux has become an issue for many of these vendors, and the flavor of Linux supported for both the RAID and the RAID interface software (GUI) in some cases is an important consideration for both your site and the vendor testing.
RAID errors usually fall into three categories:
- Fibre Channel and/or SCSI errors between the RAID controller and switch
- RAID hard drive errors that might result in write reconstruct
- RAID backend Fibre Channel errors
The RAID errors between the switch and RAID controller are generally passed back to the server via error control within the HBA, but that doesn't mean that you always get everything you want. Whatever type of error you are receiving, more than likely you will need to view these errors via the management console or GUI provided by the RAID vendor. Server vendors that also sell storage sometimes pass the errors back to the system log, as they can integrate everything given that they control the OS.
Everything said above about RAID interoperability goes double or triple for tape. Making sure that everything works end-to-end is often very difficult. Issues with tape firmware, HBA firmware, and HBA settings are huge. Setting up these tests is also very time consuming, and error injection is just plain hard. Buying HBAs from the tape vendor is a good idea, both to have a finger to point and to make sure that you install the driver and firmware versions that they support.
As with RAID, multiple types of errors can occur, including:
- Tape drive errors at the SCSI or Fibre Channel layer
- Media errors within the tape itself
In most cases, media errors are passed back to the application using the tape drive and written to the system log file. As with RAID, tape drive errors are usually also passed back to the server side and written to the system log. Most drive vendors have SCSI pass-through commands that can be issued to get drive statistics and error conditions. These commands can retrieve or set information within the tape drive itself; they are "passed through" the system to the drive and do not write any data to the media.
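The pass-through mechanism is OS-specific (Linux's SG_IO ioctl, for instance), but the command descriptor block (CDB) you send is defined by the SCSI standard. As a concrete sketch, here is how the 10-byte LOG SENSE CDB, which asks a drive for a page of statistics, is laid out; actually issuing it to a drive would require an OS pass-through interface and is not shown.

```python
def log_sense_cdb(page_code, alloc_len=512, pc=1):
    """Build the 10-byte SCSI LOG SENSE CDB (opcode 0x4D).

    pc=1 requests current cumulative values; the page code selects
    which statistics page the drive should return.
    """
    cdb = bytearray(10)
    cdb[0] = 0x4D                      # LOG SENSE opcode
    cdb[2] = (pc << 6) | (page_code & 0x3F)
    cdb[7] = (alloc_len >> 8) & 0xFF   # allocation length, big-endian
    cdb[8] = alloc_len & 0xFF
    return bytes(cdb)

# 0x2E is the TapeAlert log page defined for tape drives; vendors also
# define their own pages for drive-specific statistics.
print(log_sense_cdb(0x2E).hex())
```

Tools such as the sg3_utils package wrap exactly this kind of CDB construction so you rarely build one by hand, but knowing the layout helps when you are reading a vendor's documentation or a bus trace.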
Debugging in the Real World
I was recently working with a customer on a problem where we were getting corrupted data and zero length files in a heterogeneous shared file system environment with dual HBAs, dual switches, RAID, and Fibre Channel tape. The first step was to figure out what was happening where and when, and correlate the log files.
This became a problem, though, because this was a new system and the NTP (Network Time Protocol) daemons had not been set up to run on the servers, IP switches, Fibre Channel switches, and RAID. The first step to debugging the problem was to match up the log times based on actual time and figure out what was happening and when it was happening. (Step 1a was getting the customer to ensure that NTP was running properly for future debugging.)
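Matching up the logs by hand amounts to estimating each box's clock offset from a reference and shifting its timestamps before sorting. A minimal sketch of that bookkeeping, with invented offsets and event lines for illustration:

```python
from datetime import datetime, timedelta

# Without NTP, each box's clock drifts; estimate each clock's offset from
# a reference host first (these offsets and events are invented).
offsets = {"server": timedelta(0), "fc_switch": timedelta(seconds=-42)}

raw_events = [
    ("server",    "2003-03-12 04:01:19", "SCSI timeout on tape drive"),
    ("fc_switch", "2003-03-12 04:00:36", "CRC error burst on port 3"),
]

def correlate(events, offsets):
    """Shift each event to reference time, then sort chronologically."""
    fixed = [(datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") - offsets[src],
              src, msg)
             for src, ts, msg in events]
    return sorted(fixed)

for when, src, msg in correlate(raw_events, offsets):
    print(when, src, msg)
```

Once the switch's slow clock is corrected, its CRC burst lands one second before the server's SCSI timeout, and the cause-and-effect ordering becomes visible; with NTP running everywhere, this step disappears entirely.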
After matching the log files, we were able to determine a pattern for a number of problems and error conditions, all of which pointed to a hardware problem in the switch that only happened when the user application was performing asynchronous I/O to the client file systems. In other words, you had to fully understand the application, shared file system, HBAs, etc. – the entire data path – to discover the source of this problem.
The promises from vendors that everything will work together all of the time and that it will be easy to put together have yet to be realized. If you buy everything from a single vendor, you can generally be assured (at least if it has been sold for a while) that it will work together. On the other hand, if you or your manager decides that you're going to be the integrator, you need to pay careful attention to some of the basic interoperability issues for the hardware and software components.
Driver and firmware compatibility issues continue to plague us. Most of the time everything works, but again, the key word is "most." The real problem areas that prove to be the most difficult are in developing High Availability (HA) systems with shared file systems and HBA, switch, RAID, and file system metadata failover. These systems almost always have the largest interoperability issues given the complexity and, from what I have seen, the lack of sufficient testing by the vendors. In the vendors' defense, it takes a huge amount of money to maintain an interoperability lab just from the hardware and software perspective, and even more money for smart people to run the lab.
Testing HA interoperability with shared file systems is very hard. Who is supposed to do the testing? The HBA vendor, switch vendor, file system vendor, tape vendor, who? You will likely get promises from the sales people from each of these companies. I guess Ronald Reagan said it best, “Trust, but Verify.”