Book Excerpt: SAN Backup and Recovery Page 2 - EnterpriseStorageForum.com

Book Excerpt: SAN Backup and Recovery Page 2


By W. Curtis Preston

LAN-Free Backups

As discussed in Chapter 1, LAN-free backups allow you to use a SAN to share one of the most expensive components of your backup and recovery system--your tape or optical library and the drives within it. Figure 4-2 shows how this is simply the latest evolution of centralized backups. There was a time when most backups were done to locally attached tape drives. This method worked fine when data centers were small, and each server could fit on a single tape. Once the management of dozens (or even hundreds) of individual tapes became too much or when servers would no longer fit on a tape, data centers started using backup software that allowed them to use a central backup server and back up their servers across the LAN. (The servers are now referred to by the backup system as clients.)

Figure 4-2. The evolution of backup methodologies

 

This methodology works great as long as you have a LAN that can support the amount of network traffic such backups generate. Even if you have a state-of-the-art LAN, you may find individual backup clients that are too big to back up across the LAN. Also, increasingly large amounts of system resources are required on the backup server and clients to back up large amounts of data across the LAN. Luckily, backup software companies saw this coming and added support for remote devices. This meant that you could again decentralize your backups by placing tape drives on each backup client. Each client would then be told when and what to back up by the central backup server, but the data would be transferred to a locally attached tape drive. Most major software vendors also allowed this to be done within a tape library. As depicted in Figure 4-2, you can connect one or more tape drives from a tape library to each backup client that needs them. The physical movement of the media within the library is then managed centrally--usually by the backup server.

Although the backup data at the bottom of Figure 4-2 isn't going across the LAN, this isn't typically referred to as LAN-free backups. The configuration depicted at the bottom of Figure 4-2 is normally referred to as library sharing, since the library is being shared, but the drives aren't. When people talk about LAN-free backups, they are typically referring to drive sharing, where multiple hosts have shared access to an individual tape drive. The problem with library sharing is that each tape drive is dedicated to the backup client to which it's connected.[2] The result is that the tape drives in a shared library go unused most of the time.

As an example, assume we have three large servers, each with 1.5 TB of data. Five percent of this data changes daily, resulting in 75 GB of incremental backups per day per host.[3] All backups must be completed within an eight-hour window, and the entire host must be backed up within that window, requiring an aggregate transfer rate of 54 MB/s[4] for full backups. If you assume each tape drive is capable of 15 MB/s,[5] each host needs four tape drives to complete its full backup in one night. Therefore, we need to connect four tape drives to each server, resulting in a configuration that looks like the one at the bottom of Figure 4-2. While this configuration allows the servers to complete their full backups within the backup window, that many tape drives will allow them to complete the incremental backup (75 GB) in just 20 minutes!

An eight-hour backup window each night results in 240 possible hours during a month when backups can be performed. However, with a monthly full backup that takes eight hours, and a nightly incremental backup that takes 20 minutes, they are going unused for 228 out of their 240 available hours.

However, suppose we take these same three servers and connect them and the tape library to a SAN as illustrated in Figure 4-3. When we do so, the numbers change drastically. As you can see in Figure 4-3, we'll connect each host to a switched fabric SAN via a single 100-MB Fibre Channel connection. Next, we must connect the tape drives to the SAN. (For reasons explained later, however, we will only need five tape drives to meet the requirements discussed earlier.) If the tape drives supported Fibre Channel natively, we can just connect them to the SAN via the switch. If the tape drives can connect only via standard parallel SCSI cables, we can connect the five tape drives to the SAN via a Fibre Channel router, as can be seen in Figure 4-3. The routers that are shipping as of this writing can connect one to six SCSI buses to one or two Fibre Channel connections. For this example, we will use a four-to-one model allowing us to connect our five tape drives to the SAN via a single Fibre Channel connection. (As is common in such configurations, two of the drives will be sharing one of the SCSI buses.)

Figure 4-3. LAN-free backups

 

With this configuration, we can easily back up the three hosts discussed previously. The reason that we only need only five tape drives is that in the configuration shown in Figure 4-3, we can use dynamic drive sharing software provided by any of the major backup software companies. This software performs the job of dynamically assigning tape drives to the hosts that need them. When it's time for a given host's full backup, it is assigned four of the five drives. The hosts that aren't performing a full backup will share the fifth tape drive. By sharing the tape drives in this manner, we could actually backup 25 hosts this size. Let's take a look at some math to see how this is possible.

Assume we have 25 hosts, each with 1.5 TB of data. (Of course, all 25 hosts need to be connected to the SAN.) With four of the five tape drives, we can perform an eight-hour full backup of a different host each night, resulting in a full backup for each server once a month. With the fifth drive, we can perform a 20-minute incremental backup of the 24 other hosts within the same eight-hour window.[6] If this was done with directly connected SCSI, it would require connecting four drives to each host, for a total of 100 tape drives! This, of course, would require a really huge library. In fact, as of this writing, there are few libraries available that can hold that many tape drives. Table 4-1 illustrates the difference in the price of these two solutions.

Table 4-1: 25-host LAN-free backup solution

 

Parallel SCSI solution

SAN solution

 

Quantity

Total

Quantity

Total

Tape drives

100

$900K

5

$45K

HBAs

25

$25K

25

$50K

Switch

None

 

2

$70K

Router

None

 

1

$7K

Device server

software

Same

 

Same

 

Tape library

Same

 

Same

 

Total

 

$925K

 

$172K

TIP:   Whether or not a monthly full backup makes sense for your environment is up to you. Unless circumstances require otherwise, I personally prefer a monthly full backup, nightly incremental backups, followed by weekly cumulative incremental backups.

Click here to buy book

Building SANs with Brocade Fabric Switches

Authors
W. Curtis Preston has specialized in designing backup and recovery systems for over eight years, and has designed such systems for many environments, both large and small. The first environment that Curtis was responsible for went from 7 small servers to 250 large servers in just over two years, running Oracle, Informix, and Sybase databases and five versions of Unix. He started managing this environment with homegrown utilities and eventually installed the first of many commercial backup utilities. His passion for backup and recovery began with managing the data growth of this 24x7, mission-critical environment. Having designed backup systems for environments with small budgets, Curtis has developed a number of freely available tools, including ones that perform live backups of Oracle, Informix, and Sybase. He has ported these tools to a number of environments, including Linux, and they are running at companies around the world. Curtis is now the owner of Storage Designs, a consulting company dedicated entirely to selecting, designing, implementing, and auditing storage systems. He is also the webmaster of www.backupcentral.com.

Footnotes:

1. This term may be changed in the near future, since iSCSI-based SANs will, of course, use the LAN. But if you create a separate LAN for iSCSI, as many experts are recommending, the backups will not use your production LAN. Therefore, the principle remains the same, and only the implementation changes.

2. As mentioned later in this chapter, SCSI devices can be connected to more than one host, but it can be troublesome.

3. This is actually a high rate of change, but it helps prove the point. Even with a rate of change this high, the drives still go unused the majority of the time.

4. 1.575 TB ÷ 8 hours ÷ 60 minutes ÷ 60 seconds = 54.6 MB/s

5. There are several tape drives capable of these backup speeds, including AIT-3, LTO, Mammoth, Super DLT, 3590, 9840, and DTF.

6. 20 minutes × 24 hosts = 480 minutes, or 8 hours

7. These are Unix prices. Obviously, Windows-based cards cost much less.

8. Although it's possible that some software products have also implemented a third-party queuing system for the robotic arm as well, I am not aware of any that do this. As long as you have a third-party application controlling access to the tape library and placing tapes into drives that need them, there is no need to share the robot in a SCSI sense.

9. Network Appliance filers appear to act this way, but the WAFL filesystem is quite a bit different. They store a "before" image of every block that is changed every time they sync the data from NVRAM to disk. Each time they perform a sync operation, they leave a pointer to the previous state of the filesystem. A Network Appliance snapshot, then, is simply a reference to that pointer. Please consult your Network Appliance documentation for details.

10. It was the Microsoft's partnership with Veritas that finally made this a reality. The volume manager for Windows 2000 is a "lite" version of Veritas Volume Manager.

11. Prior to 9i, this was done with the suvrmgr command, but this command has been removed from 9i.

12. There are vendors that are shipping gigabit network cards that offload the TCP/IP processing from the server. They make LAN-based backups easier, but LAN-free backups are still better because of the design of most backup software packages.

13. It's not quite 100%, since the second stripe doesn't have to be a RAID 5 set. If it were simply a RAID 0 set, you'd need about 90% more disk than you already have.

14. If your backup software supports library sharing.


Page 2 of 13

Previous Page
1 2 3 4 5 6 7 8 9 10 11 12 13
Next Page

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 

Storage Daily
Don't miss an article. Subscribe to our newsletter below.

Thanks for your registration, follow us on our social networks to keep up-to-date