How to Protect SharePoint Data
Increasingly, organizations are turning to Microsoft SharePoint as their primary platform for collaboration, content management and records management.
Nilesh Mehta, president of NGenious Solutions and his firm's senior SharePoint architect, claims that's because SharePoint "is by far the most user-friendly collaboration portal/platform out there right now."
While competitors might take issue with that claim, there is no denying SharePoint's growing popularity. And as the use of SharePoint has taken off, so has the amount of data being created, making the storage and recovery of SharePoint data mission critical for many, if not most, companies. And while Microsoft (NASDAQ: MSFT) has bundled or included several handy tools for backing up and retrieving SharePoint data, such as versioning and a two-stage recycling bin, storage experts say that is not enough if you truly want to protect, preserve and quickly recover data.
Database Structure Creates Storage Issues
Unlike users of, say, Microsoft Office, SharePoint users create "sites" that support specific business needs, such as creating documents, slides and forms, so data is effectively organized and categorized for easy searching and retrieval. That's because with SharePoint, data is centralized in a Microsoft SQL Server database, an approach that has its benefits but also some disadvantages, particularly when it comes to storage and retrieval.
"In the past, if you had a lot of Office documents, you were probably storing them on your PC or you were storing them in a network share somewhere," said Lauren Whitehouse, an analyst with Enterprise Strategy Group. "If they were stored in a network share, they were probably being backed up and archived."
If you were storing them on a PC, she said, it was less likely they were getting backed up, or backed up regularly.
"Now all of a sudden you're centralizing everything into one repository. And that repository isn't just a folder structure, it's a SQL database," she explained. "So now I'm taking all of the content that I probably have on my PC, but probably wasn't storing on the network, and I've moved it into a database on the network and the same thing with anything I had in a network share. So that potentially leads to a pretty significant increase in primary storage. And then, of course, any time you update one of those documents, you've got a versioning system that keeps multiple versions of your document. So now potentially not only is that primary store getting very large, but your secondary storage capacity is going to get very large too."
On top of that, instead of storing flat files in a folder structure, now you're storing flat files in a database, which makes the task of storing and retrieving that data a bit more complex.
As Whitehouse put it, next thing you know, you've gone from "a file server and a couple of PCs to a couple of clustered database systems and multiple file servers, in potentially multiple locations, to store [your] documents."
And backup and recovery likewise become more complex.
While Microsoft provides a safety net (i.e., versioning and that two-stage recycle bin) for folks who are prone to accidentally deleting files, for more serious backup and recovery you have pretty much two choices within SharePoint. You can use SharePoint's Stsadm command-line administration tool, which, said Whitehouse, "is a little bit more complex for a non-Windows administrator to figure out" and which neither gives you many customization options nor necessarily guarantees a complete backup. Or you can go to the Central Administration page to configure settings and backups.
Using SharePoint Central Administration, you can "point it to a backup file share and it will back up all your databases (it does some kind of compression on them) and your whole farm, including the configuration database, the content database, and everything in there, and takes basically a snapshot of your whole SharePoint environment," explained Mehta.
"So if you ever have a situation where you realize your SharePoint farm basically died on [you]," he said, you can use SharePoint Central Administration to restore that whole farm to where it was.
The only drawback is that the Central Administration graphical user interface (GUI) doesn't allow you to schedule backups automatically, which means you would have to run them manually. With the scripting option (using the Stsadm tool), by contrast, "you can tie it into your Windows [Task] Scheduler, and every day at that time Windows [Task] Scheduler would kick off that command line for you," said Whitehouse.
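As a rough illustration of that scripted approach, the sketch below assembles a farm-level Stsadm backup command and a Windows Task Scheduler (schtasks) command that would run it daily. It only builds and prints the command lines; it does not execute them. The backup share, task name and start time are hypothetical, and the sketch assumes stsadm is on the system PATH, which is not the case on a default install.

```python
from typing import List


def stsadm_backup_args(backup_dir: str, method: str = "full") -> List[str]:
    """Build an stsadm farm backup command as an argument list.

    backup_dir is a hypothetical UNC share; 'stsadm' is assumed to be on PATH.
    """
    if method not in ("full", "differential"):
        raise ValueError("method must be 'full' or 'differential'")
    return ["stsadm", "-o", "backup",
            "-directory", backup_dir,
            "-backupmethod", method]


def quote_if_needed(arg: str) -> str:
    """Wrap an argument in double quotes when it contains a space."""
    return f'"{arg}"' if " " in arg else arg


def schtasks_daily_args(task_name: str, start_time: str,
                        command_args: List[str]) -> List[str]:
    """Build a schtasks command that runs command_args every day.

    start_time uses the 24-hour HH:MM form that schtasks /ST expects.
    """
    task_command = " ".join(quote_if_needed(a) for a in command_args)
    return ["schtasks", "/Create", "/SC", "DAILY",
            "/TN", task_name,
            "/TR", task_command,
            "/ST", start_time]


if __name__ == "__main__":
    backup = stsadm_backup_args(r"\\backupserver\spbackups")
    schedule = schtasks_daily_args("SharePointFarmBackup", "01:00", backup)
    # Print the commands for review rather than running them.
    print(" ".join(backup))
    print(" ".join(quote_if_needed(a) for a in schedule))
```

Printing the commands first makes it easy to review them before registering the task on a real server with an account that has the necessary farm permissions.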
Third-Party Backup and Recovery Options
So if you stay within the Microsoft Office SharePoint Server environment, your choices for backing up data are either the command-line tool, which can be hard to use and may not give you a complete backup of the configuration database, or the built-in GUI, which is much easier to use but doesn't allow you to schedule your backups. Another problem with both solutions is that recovery time, especially from tape, can be painfully slow (taking up to 10 hours, depending on what you are trying to recover and from where, said Mehta).
That's why both Mehta and Whitehouse recommend that enterprises invest in a third-party SharePoint backup and recovery utility, such as AvePoint's DocAve.
"AvePoint has probably the best SharePoint backup, recovery and archiving solution," said Whitehouse. That is probably because the company was born out of SharePoint consulting and SharePoint is pretty much all it focuses on. AvePoint also OEMs DocAve to IBM (NYSE: IBM), for its Tivoli Storage Manager, and to NetApp (NASDAQ: NTAP), for its SnapManager.
Whitehouse and Mehta also like CommVault (NASDAQ: CVLT) Galaxy Backup and Recovery and Symantec (NASDAQ: SYMC) Backup Exec for SharePoint data storage and recovery.
Not only are AvePoint's and CommVault's solutions faster, "they restore data with all the metadata there," said Mehta. "The exact time stamp, the owner information, modified by, etc."
Maximizing SharePoint and Minimizing Data
In addition to properly backing up your SharePoint data, Whitehouse recommended that enterprises take the extra step of properly archiving it. By archiving, as opposed to just storing data and shipping it offsite to be recovered later if necessary, she means "removing persistent data from the environment, which is going to help you with your overall storage capacity in the long term. And it's also going to preserve that piece of information for long-term retention."
As Whitehouse explained, SharePoint is a collaboration tool, allowing people to work on projects together. However, when a particular project is done, typically the data doesn't change. "It's unchanging, and it's just sitting there taking up space [in the SQL database]," she said. And when that SQL database starts getting really large, "performance goes down. So you have to stay on top of it and continually prune your database to make sure you achieve the right levels of performance."
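To make the pruning idea concrete, here is a minimal sketch of how an administrator might shortlist sites for archiving based on how long they have gone unchanged. The SiteInfo record, the sample sites and the 180-day idle threshold are all hypothetical; in practice the inventory would come from SharePoint itself or from a third-party tool, and the threshold would follow the organization's retention policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class SiteInfo:
    """Hypothetical inventory record for one SharePoint site."""
    url: str
    last_modified: date
    size_gb: float


def archive_candidates(sites: List[SiteInfo], today: date,
                       idle_days: int = 180) -> List[SiteInfo]:
    """Return sites whose content has not changed in more than idle_days."""
    cutoff = today - timedelta(days=idle_days)
    return [s for s in sites if s.last_modified < cutoff]


def reclaimable_gb(candidates: List[SiteInfo]) -> float:
    """Total size that would leave the SQL database if all were archived."""
    return sum(s.size_gb for s in candidates)


if __name__ == "__main__":
    inventory = [
        SiteInfo("http://portal/sites/launch-2008", date(2008, 3, 1), 12.0),
        SiteInfo("http://portal/sites/hr", date(2009, 5, 20), 4.5),
        SiteInfo("http://portal/sites/audit-2007", date(2007, 11, 15), 30.0),
    ]
    stale = archive_candidates(inventory, today=date(2009, 6, 1))
    for site in stale:
        print(site.url, site.size_gb)
    print("Reclaimable GB:", reclaimable_gb(stale))
```

A list like this is only a starting point; whether a finished project can actually be moved out of the database is a business decision as much as a storage one.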
Hence the need for archiving. And again Whitehouse recommended that enterprises look to third parties like AvePoint and CommVault to assist them with the task.
Another advantage of archiving your SharePoint data? It satisfies a lot of the compliance and e-discovery requirements imposed on companies.
In addition to archiving, Mehta advised enterprises to group sites by priority, "because that will help you in terms of how you back them up and restore them." He also recommended that end users limit sites to 5GB to 25GB if there are going to be multiple sites in the same content database. "It is easier to restore a site that is five gigabytes versus a site that is 200 gigabytes," he said.
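Mehta's two suggestions, grouping sites by priority and keeping them small, can be sketched as a simple planning check. The Site record, the numeric priority scale (1 is most critical) and the sample sites below are hypothetical; the 25GB ceiling is the upper end of the range he mentions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    """Hypothetical record: priority 1 is most critical."""
    url: str
    priority: int
    size_gb: float


def backup_order(sites: List[Site]) -> List[Site]:
    """Order sites so the most critical (then the smallest) come first."""
    return sorted(sites, key=lambda s: (s.priority, s.size_gb))


def oversized(sites: List[Site], limit_gb: float = 25.0) -> List[Site]:
    """Return sites larger than the recommended per-site ceiling."""
    return [s for s in sites if s.size_gb > limit_gb]


if __name__ == "__main__":
    sites = [
        Site("http://portal/sites/finance", 1, 18.0),
        Site("http://portal/sites/marketing", 2, 200.0),
        Site("http://portal/sites/legal", 1, 6.0),
        Site("http://portal/sites/social", 3, 3.0),
    ]
    for site in backup_order(sites):
        print(site.priority, site.url, site.size_gb)
    for site in oversized(sites):
        print("Over limit:", site.url, site.size_gb)
```

Sites flagged as over the limit are candidates for splitting or for a content database of their own, so that a single restore does not have to move hundreds of gigabytes.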
And no matter which solution you use, make sure you have enough space on your SQL servers and back up your SharePoint data regularly, said both Whitehouse and Mehta. "That will save your life," said Mehta.