Book Excerpt: SAN Backup and Recovery
By W. Curtis Preston
In this chapter:
SAN Backup and Recovery
LAN-Free, Client-Free, or Server-Free?
A SAN typically consists of multiple servers, online storage (disk), and offline storage (tape or optical), all of which are connected to a Fibre Channel switch or hub--usually a switch. You can see all these SAN elements in Figure 4-1. Once the three servers in Figure 4-1 are connected to the SAN, each server in the SAN can be granted full read/write access to any disk or tape drive within the SAN. This allows for LAN-free, client-free, and server-free backups, each represented by a different-numbered arrow in Figure 4-1.
TIP: Pretty much anything said in this chapter about Fibre Channel-based SANs will soon be true about iSCSI-based SANs. To visualize how an iSCSI SAN fits into the pictures and explanations in this chapter, simply replace the Fibre Channel HBAs with iSCSI-capable NICs, and the Fibre Channel switches and hubs with Ethernet switches and hubs, and you've got yourself an iSCSI-based SAN that works essentially like the Fibre Channel-based SANs discussed in this chapter. The difficulty will be in getting the storage array, library, and software vendors to support it.
- LAN-free backups
- LAN-free backups occur when several servers share a single tape library. Each server connected to the SAN can back up to tape drives it believes are locally attached. The data is transferred via the SAN using the SCSI-3 protocol, and thus doesn't use the LAN. All that is needed is software that will act as a "traffic cop." LAN-free backups are represented in Figure 4-1 by arrow number 1, which shows a data path starting at the backup client, traveling through the SAN switch and router, finally arriving at the shared tape library.
- Client-free backups
- Although an individual computer is often called a server, it's referred to by the backup system as a client. If a client has its disk storage on the SAN, and that storage can create a mirror that can be split off and made visible to the backup server, that client's data can be backed up via the backup server; the data never travels via the backup client. Thus, this is called client-free backup. Client-free backups are represented in Figure 4-1 by arrow number 2, which shows a data path starting at the disk array, traveling through the backup server, followed by the SAN switch and router, finally arriving at the shared tape library. The backup path is similar to LAN-free backups, except that the backup server isn't backing up its own data. It's backing up data from another client whose disk drives happen to reside on the SAN. Since the data path doesn't include the client that is using the data, this is referred to as client-free backups.
- Server-free backups
- If the SAN to which the disk storage is connected supports a SCSI feature called extended copy, the data can be sent directly from disk to tape, without going through a server. There are also other, more proprietary, methods for doing this that don't involve the extended copy command. This is the newest area of backup and recovery functionality being added to SANs. Server-free backups are represented in Figure 4-1 by arrow number 3, which shows a data path starting at the disk array, traveling through the SAN switch and router, and arriving at the shared tape library. You will notice that the data path doesn't include a server of any kind. This is why it's called server-free backups.
To my knowledge, I am the first to use the term client-free backups. Some backup software vendors refer to this as server-free backups, and others simply refer to it by their product name for it. I felt that a generic term was needed, and I believe that this helps distinguish this type of backups from server-free backups, which are defined next.
No backup is completely LAN-free, client-free, or server-free. The backup server is always communicating with the backup client in some way, even if it's just to get the metadata about the backup. What these terms are meant to illustrate is that the bulk of the data is transferred without using the LAN (LAN-free), the backup client (client-free), or the backup server (server-free).
Client-Free or Server-Free?
I am well aware that some vendors refer to client-free backups as server-free backups. I find their designation confusing, and I'm hoping that my term client-free backups will catch on. The reason that they refer to client-free backups as server-free is that the backup world turns things around. What you refer to as a server, the backup system refers to as a client. Although you consider it a server, it's a client of the backup and recovery system. When it comes to the backup and recovery system, the server is the system that is controlling all of the backups. I am trying to keep things consistent by referring to backups that don't go through a client (but do go through the backup server) as client-free backups. In order for me to call something server-free, the data path must not include any server of any kind. The only backup like that is an extended copy backup as described in the previous section. Therefore, I refer only to extended copy backups as server-free backups. That's my story and I'm sticking to it.
W. Curtis Preston has specialized in designing backup and recovery systems for over eight years, and has designed such systems for many environments, both large and small. The first environment that Curtis was responsible for went from 7 small servers to 250 large servers in just over two years, running Oracle, Informix, and Sybase databases and five versions of Unix. He started managing this environment with homegrown utilities and eventually installed the first of many commercial backup utilities. His passion for backup and recovery began with managing the data growth of this 24x7, mission-critical environment. Having designed backup systems for environments with small budgets, Curtis has developed a number of freely available tools, including ones that perform live backups of Oracle, Informix, and Sybase. He has ported these tools to a number of environments, including Linux, and they are running at companies around the world. Curtis is now the owner of Storage Designs, a consulting company dedicated entirely to selecting, designing, implementing, and auditing storage systems. He is also the webmaster of www.backupcentral.com.
1. This term may be changed in the near future, since iSCSI-based SANs will, of course, use the LAN. But if you create a separate LAN for iSCSI, as many experts are recommending, the backups will not use your production LAN. Therefore, the principle remains the same, and only the implementation changes.
8. Although it's possible that some software products have also implemented a third-party queuing system for the robotic arm as well, I am not aware of any that do this. As long as you have a third-party application controlling access to the tape library and placing tapes into drives that need them, there is no need to share the robot in a SCSI sense.
9. Network Appliance filers appear to act this way, but the WAFL filesystem is quite a bit different. They store a "before" image of every block that is changed every time they sync the data from NVRAM to disk. Each time they perform a sync operation, they leave a pointer to the previous state of the filesystem. A Network Appliance snapshot, then, is simply a reference to that pointer. Please consult your Network Appliance documentation for details.
12. There are vendors that are shipping gigabit network cards that offload the TCP/IP processing from the server. They make LAN-based backups easier, but LAN-free backups are still better because of the design of most backup software packages.