Book Excerpt: SAN Backup and Recovery Page 12
By W. Curtis Preston
Once the transaction log recovery is complete, the database is fully online, whether or not the backup mirror has been fully restored to the primary disk set (see Figure 4-14).
Isn't that recovery scenario a beautiful thing? Imagine recovering a 10-TB database in under a minute. If you need the ultimate in instant recovery, it's difficult to beat the concept of client-free backup and recovery.
Other Variations on the Theme
Some people love to have an entire copy of their database or file server sitting there at all times just waiting for an instant restore. Others feel that this is a waste of money. All that extra disk can cost quite a bit. They like the idea of client-free backups--backing up the data through a system other than the system that is using the data. However, they don't want to pay for another set of disks to store the backup mirror. What do they do?
The first thing some do is to share the backup mirror. Whether the backup mirror is in an enterprise storage array or a group of JBOD that is made available on the SAN for that purpose, it's possible to establish this backup mirror to more than one primary disk set. Obviously you can't back up more than one primary disk set at one time using this method, but suppose you have primary disk sets A and B. Once you finish establishing, splitting, and backing up primary disk set A, you can establish the backup mirror to primary disk set B, split it, then back it up again. Many people rotate their backup mirrors this way.
However, some storage vendors have a solution that's less expensive than having a separate backup mirror for each primary disk set and is more elegant than rotating a single backup mirror between multiple primary disk sets: snapshots.
Again, let's start with an enterprise storage array or a group of JBOD being managed by an enterprise volume manager. Recall that in the previous section "LAN-Free Backups," some volume managers and storage arrays can create a virtual copy, or a snapshot. To another host that can see it, it looks the same as a backup mirror. It's a static "picture" of what the volumes looked like at a particular time.
Creating the snapshot is equivalent to the step where you split off the backup mirror. There is no step where you must establish the snapshot. You simply place the application into a backup status, create the snapshot, and then take the application out of backup status. After that, the storage array or enterprise volume manager software makes the device visible to another host. You then perform the same steps as outlined in the previous section "Backing Up the Backup Mirror," for importing and mounting the volumes on the backup server. Once that is accomplished, you can back it up just like any other device.
Recovery of a snapshot
Similar to the backup mirror backup, the snapshot can typically be left around in case of recovery. However, most snapshot recoveries aren't going to be as fast as the backup mirror recovery discussed earlier. Although the data can still be copied from disk to disk, the recovery may not be as instantaneous. However, since this section of the industry is changing so fast, this may no longer be the case by the time you read this.
It should also be noted that, unlike the backup mirror, the snapshot still depends on the primary disk set's disks. Since it's merely a virtual copy of the disks, if the original disks are physically damaged, the snapshot becomes worthless. Therefore, snapshots provide quick recovery only in the case of logical corruption of the data, since the snapshot still contains a noncorrupted copy of the data.
A valid option
This variation on client-free backup represents a valid option for many. Depending on your configuration, this method can be significantly less expensive than a backup mirror, while providing you exactly the level of recoverability you need.
So far in this chapter, we have discussed LAN-free and client-free backups. LAN-free backups help speed backup and recovery by removing this traffic from the LAN. They allow you to purchase one or more really large tape libraries and share those tape libraries among multiple servers. This allows you to take advantage of the economies of scale that large libraries provide, as the cost per MB usually decreases as the size of a library increases. Backing up large amounts of data is much easier on the clients when that data is sent via Fibre Channel, instead of being sent across the LAN. Therefore, LAN-free backups reduce the degree to which backups affect any applications that are running on the client.
Client-free backups also offer multiple advantages. They remove almost all application impact from the backup client, since all significant I/O operations are performed on the backup server. If you use a backup mirror for each set of primary disk sets, they also offer a virtually instantaneous recovery method. However, this recovery speed does come with a significant cost, since purchasing a backup mirror for each set of primary disk sets requires purchasing more disk capacity then would otherwise be necessary. If your primary disk set is mirrored, you need to purchase a backup mirror, requiring the purchase of 50% more disk. If the primary disk set is a RAID 4 or RAID 5 volume, you will need to purchase almost 100% more disk.
What about snapshots? You may remember that we have discussed snapshots as an alternate way to provide instantaneous recoveries. However, snapshots (that have not been backed up to tape) only protect against logical corruption, and the loss of too many drives on the primary disk set results in a worthless snapshot. Therefore, client-free backups that use a backup mirror are the only backup and recovery design that offers instantaneous recovery after the loss of multiple drives on the primary disk set.
The remaining disadvantage to client-free backups is that you still need a server to act as a data path for the data to be backed up. Let's say that you've got several large backup clients running the same operating system, and all of their storage resides on an enterprise storage array that can perform client-free backups. Chances are that you are now talking about a significant amount of storage to back up each night. In fact, many client-free backup designs use full backups every night, because every file they back up changes every night. This is the case with almost any database that uses "cooked" files on the filesystem for its data files and is always true of a database that uses raw devices for the same purpose. (While there are a few block-level incremental packages available, they aren't yet widely used.) This means that if you back up five 2-TB clients, you are probably backing up 10 TB every night. Even though you have offloaded the data transfer from the data servers to the backup server, you will need a reasonably large server to back up 10 TB every night, and that server can have no purpose other than that.
With LAN-free backups, you back up the data without using the LAN. With client-free backups, you back up the data without using the client. What if you could transfer the data directly from disk to tape, without going through any server at all? If you could, your backups would be completely server-free. As with client-free backups, applications are almost completely unaffected, because few I/O operations are performed on the backup client.
One significant difference between client-free backups and server-free backups is that there are no homegrown server-free backup designs. The reasons for this will become clear as you see how deep into the filesystem and device driver levels you must go to accomplish the task.
Look, Ma, No Server
A truly server-free backup has a data path that doesn't include a server. It uses a server, of course, to control the process, but the data moves directly from disk to tape without going through any server's CPU--including the backup server. There are three essential requirements of server-free backup. You must be able to:
- Present the backup application with a static view of the disk set
- Map the blocks of data on the disk set to the files to which they belong
- Move the data directly from disk to tape
Getting a static view of the data
In order to use a backup mirror setup to present a static copy to the backup application, you must perform the following steps:
- Establish the mirror
- Quiesce the application writing to the disk
- Split the mirror
If you wish to use a snapshot to present a static copy to the backup application, you need to perform only two steps:
- Quiesce the application writing to the disk
- Create the snapshot
WARNING: Not all database applications permit online third-party backups.
The procedures performed during this step are essentially the same as those used for client-free backup. Therefore, if you need more detail about these steps, please refer to that section of this chapter.
Logically mapping the disk to the filesystem
Now that the backup application has a static view of the disks to back up, you must perform an additional step that is necessary because of the way you back up the data. LAN-free and client-free backups back up filesystem data in the traditional way--via the filesystem. However, the method server-free backup uses to transfer the data to tape has no knowledge of the filesystem. As you'll see in the next step, it will simply transfer blocks of data to tape.
Therefore, prior to transferring these blocks of data to tape, you must create a map of which blocks belong to which files. That way, when the backup application is asked to restore a certain file, it knows which blocks of data to restore. Figure 4-15 illustrates how this works. File A consists of data blocks A, B, and C. File B consists of data blocks D, E, and F. Once the snapshot or split mirror has been created, these mappings remain constant until the split mirror or snapshot is created again. While it's static, the backup application records the file associations for File A and File B. When the restore application asks for File A to be restored, the server-free backup software knows it needs to restore disk blocks A, B, and C.
Transferring the data directly from disk to tape
The reason for the step in the previous section is that the main difference between client-free and server-free backups is the path the data takes on its way to tape--as illustrated in Figure 4-16. You'll note that Figure 4-16 looks much like Figure 4-9 with a few minor changes. First, you no longer need the dedicated backup server that is seen to the right of the storage array in Figure 4-9. This means that Backup Server A takes the responsibility of quiescing the application that is writing to the disk. Second, the tape library and backup mirror (or snapshot) must be connected to a SAN with support for extended copy (xcopy). Finally, instead of sending the data to tape via the backup server (as shown in (4a) and (4b) in Figure 4-9), the backup software uses the SCSI extended copy command to send the data directly from disk to tape. This means, of course, that the data is being backed up on the raw-device level. Device-level backups are often referred to as image-level backups.
W. Curtis Preston has specialized in designing backup and recovery systems for over eight years, and has designed such systems for many environments, both large and small. The first environment that Curtis was responsible for went from 7 small servers to 250 large servers in just over two years, running Oracle, Informix, and Sybase databases and five versions of Unix. He started managing this environment with homegrown utilities and eventually installed the first of many commercial backup utilities. His passion for backup and recovery began with managing the data growth of this 24x7, mission-critical environment. Having designed backup systems for environments with small budgets, Curtis has developed a number of freely available tools, including ones that perform live backups of Oracle, Informix, and Sybase. He has ported these tools to a number of environments, including Linux, and they are running at companies around the world. Curtis is now the owner of Storage Designs, a consulting company dedicated entirely to selecting, designing, implementing, and auditing storage systems. He is also the webmaster of www.backupcentral.com.
1. This term may be changed in the near future, since iSCSI-based SANs will, of course, use the LAN. But if you create a separate LAN for iSCSI, as many experts are recommending, the backups will not use your production LAN. Therefore, the principle remains the same, and only the implementation changes.
8. Although it's possible that some software products have also implemented a third-party queuing system for the robotic arm as well, I am not aware of any that do this. As long as you have a third-party application controlling access to the tape library and placing tapes into drives that need them, there is no need to share the robot in a SCSI sense.
9. Network Appliance filers appear to act this way, but the WAFL filesystem is quite a bit different. They store a "before" image of every block that is changed every time they sync the data from NVRAM to disk. Each time they perform a sync operation, they leave a pointer to the previous state of the filesystem. A Network Appliance snapshot, then, is simply a reference to that pointer. Please consult your Network Appliance documentation for details.
12. There are vendors that are shipping gigabit network cards that offload the TCP/IP processing from the server. They make LAN-based backups easier, but LAN-free backups are still better because of the design of most backup software packages.