Gregg Paulk, director of Information Technologies at the Anderson Center for Autism, a nonprofit providing year-round day and residential programs to children and adults with autism, was frustrated with the center's tape-based backup drives.
With terabytes of data to store (including sensitive medical and student records, digital pictures and video, and e-mail correspondence), nearly a million papers to be scanned and archived, and the center going through an expansion, he couldn't afford to wait around for information to be backed up or worry about information being lost or misplaced in the event of a problem.
Resiliency, Reliability and ROI
Paulk said the center's tape-based backup drives "were killing me on backup times and speed. It was taking us 60 hours to do backups. I had to get something that was significantly faster." In addition, Anderson's CEO was big on the three Rs: resiliency, reliability and ROI.
With a mental checklist (he wanted something disk-based that was fast, resilient and reliable, and that wouldn't break his budget) but no specific system in mind, Paulk attended Storage Networking World in the fall of 2006 to do a little window shopping.
Cruising the aisles, Paulk checked out tape-based libraries, SANs, NAS and other storage systems, and archiving and de-duplication solutions. But the system that really caught his eye was NEC's HYDRAstor HS8 grid storage system.
Not only was the HYDRAstor disk-based and fast (or so the company claimed), but because of its grid-based architecture, users could start small and increase performance and storage capacity as needed by adding nodes, which meant less outlay up front.
As Karen Dutch, the vice president of NEC's Advanced Storage Product Group, explained, "You can start relatively small, with four storage nodes, and [get] 300 percent more resiliency with the default setting than with, say, a RAID 5 environment, with the same storage overhead."
"As your data needs expand, you can simply add more storage nodes to that particular system," she said. "So you can start really small, with [a couple of] terabytes, and with one system scale today to over 10 PB."
In addition to the ability to scale as needed, Paulk liked that HYDRAstor could de-duplicate backup and archive data on a single platform. And, perhaps best of all, NEC was looking for beta testers.
"It was a win-win situation," said Paulk, who, after briefly checking out some of the competition, signed up with NEC.
Saving Time and Space
Paulk and his team started beta-testing the HYDRAstor HS8 in February 2007 and concluded testing last fall. They reported that the system passed, getting high marks in all areas (including cost, which is under $0.70 per GB).
"I've been quite astonished," said Paulk. "You know how technology is: You have companies that say they can do all these things better than sliced bread, and you always find some shortcomings. This is the first product I've ever worked with that actually does everything it says it can do." And more, he added.
During the testing phase, the center's lead tech pulled network cables from the back of the system during a live backup, creating a 25 percent resource loss. "Me and my lead tech were looking at each other and saying 'there's no way this is going to work.' And he said, 'It's all there, Gregg.'"
Not only was no data lost, there was no interruption. The system just kept backing up the center's data. And it has continued to successfully back up data (some 2 TB of it) without a single failure ever since.
"We do a full backup of our multiple servers weekly," said Paulk, "which in total combine to about 2 TB. However, we're currently in a testing phase of some archiving programs [part of the HYDRAstor solution]. We have massive amounts of paper documents in basements and trailers and scattered all about campus that we're going to be scanning and storing, so that 2 TB is going to grow steadily over the next year or so."
But Paulk isn't worried about all that extra data slowing down the system or operations in general. Indeed, since the system went live, instead of taking 60 hours a week to back up data, it now takes six, even though Anderson has almost twice as much data and twice as many servers as it did before HYDRAstor.
That's in large part due to HYDRAstor's data de-duplication technology, a patent-pending system that NEC claims can reduce storage needs by up to 95 percent by eliminating redundant data both across and within incoming data streams, as well as across both backup and archive data. However it works, Paulk said that since installing HYDRAstor, the center has been getting a 39:1 de-duplication ratio (though he and his team would have been happy with 10:1), freeing up lots of much-needed disk real estate.
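For readers who want to compare the ratios and the percentage quoted above, here is a minimal sketch of the arithmetic. It assumes that an N:1 ratio means N units of data before de-duplication for every unit actually stored, which is the common convention but is not spelled out by NEC or Paulk here. The function names are illustrative only.

```python
def space_saved(ratio: float) -> float:
    """Fraction of capacity saved at an N:1 de-duplication ratio.

    Assumes ratio = data before de-duplication / data actually stored.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1 (1:1 means no savings)")
    return 1.0 - 1.0 / ratio


def ratio_for_savings(saved: float) -> float:
    """Inverse of space_saved: the N in N:1 that yields a given saved fraction."""
    if not 0 <= saved < 1:
        raise ValueError("saved must be in the range [0, 1)")
    return 1.0 / (1.0 - saved)


if __name__ == "__main__":
    # The two ratios mentioned by Paulk.
    for ratio in (10, 39):
        print(f"{ratio}:1 saves {space_saved(ratio):.1%} of capacity")
    # The ratio implied by a 95 percent reduction.
    print(f"95% saved corresponds to about {ratio_for_savings(0.95):.0f}:1")
```

Under that assumption, 10:1 works out to a 90 percent reduction, a 95 percent reduction corresponds to about 20:1, and the 39:1 Paulk reports is a little over 97 percent.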
As for managing the system, Paulk said it's easy. Using HYDRAstor's browser-based GUI, he and his team can choose whether to do one large backup or multiple smaller (or granular) ones.
That's because HYDRAstor uses a new technology called Distributed Resilient Data, or DRD. "We don't use RAID or mirroring technology in the system," explained Dutch. Rather, using DRD you, the user, "pick how many failures you want the system to survive in the way of HDD failures or node failures. You simply set that number in the GUI and HYDRAstor optimizes how you store data and creates parity based on that number."
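The article does not describe how DRD works internally. As a rough illustration only, the sketch below models a generic parity-fragment (erasure-coding) layout, in which data is split into fragments, extra parity fragments are added, and the data survives the loss of as many fragments as there are parity fragments. The `Layout` class and the fragment counts are hypothetical and are not NEC's documented settings. They simply show how a scheme can tolerate more failures than a four-disk RAID 5 set at the same storage overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical parity-fragment layout (not NEC's actual DRD design)."""

    data_fragments: int
    parity_fragments: int

    def __post_init__(self) -> None:
        if self.data_fragments < 1 or self.parity_fragments < 0:
            raise ValueError("need at least 1 data fragment and 0 or more parity fragments")

    @property
    def survivable_failures(self) -> int:
        # With an ideal erasure code, any `data_fragments` of the total
        # fragments suffice to rebuild the data, so up to
        # `parity_fragments` fragments can be lost.
        return self.parity_fragments

    @property
    def overhead(self) -> float:
        # Share of raw capacity spent on parity rather than data.
        return self.parity_fragments / (self.data_fragments + self.parity_fragments)


if __name__ == "__main__":
    raid5_like = Layout(data_fragments=3, parity_fragments=1)  # four-disk RAID 5
    wider = Layout(data_fragments=9, parity_fragments=3)       # illustrative numbers
    for name, layout in (("RAID 5-like", raid5_like), ("wider layout", wider)):
        print(f"{name}: survives {layout.survivable_failures} failure(s), "
              f"overhead {layout.overhead:.0%}")
```

In this toy model both layouts spend a quarter of raw capacity on parity, but the wider one survives three lost fragments instead of one. Choosing "how many failures you want the system to survive" would then amount to choosing the parity count.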
Up to the Task
If by some chance something does go wrong with the system, or Paulk or a member of his team has a question, support is available 24/7. In fact, according to Dutch, NEC remotely monitors all systems, so it often knows before the customer if there is an issue and can fix it remotely.
To date, Paulk hasn't had any problems with the system. His only problem is managing the rapid growth the New York-based center has been experiencing. "We're growing extremely fast," he explained, with the center expanding services and adding facilities and housing to meet the needs of its growing client base, children and adults diagnosed with autism, which affects 1 in 150 children, according to recent estimates.
But by having a fast, resilient and reliable data backup, archive and de-duplication system in place, Paulk said, he's been able to spend more time focusing on and adequately provisioning for the center's upcoming IT needs and less time worrying about the potential loss of critical data.