The National Library of Scotland (NLS), the largest library in Scotland and a leading research institution, recently embarked on a mission to create a Trusted Digital Repository to preserve materials digitally and allow users to access that content online.
As part of the £1.8 million project, the library allocated up to £900,000 to cover data storage and retained the services of GlassHouse Technologies to help it design a proper storage specification and select vendors. The goal of the partnership was “to define and procure, through a thorough tender document process [RFP], the most effective solution for the greatest possible storage capacity delivering the most cost-effective value,” according to the RFP.
Prior to choosing a new storage system, the NLS, which has three central buildings in Edinburgh and a data replication/disaster recovery site in Glasgow, had approximately 10 Windows NAS devices and a number of application servers, together holding approximately 20 TB of data. The goal for the new storage system is to hold at least 200 usable terabytes of rapid random-access data storage, 100 TB of which will be stored at the NLS headquarters in Edinburgh. Twenty percent of that will be allocated for exclusive use by individual application servers (up to 50 of them, primarily running databases), with the remaining capacity shared for file storage over a TCP/IP Gigabit Ethernet network.
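To make the capacity targets concrete, the arithmetic can be sketched as below. This is only an illustration: the function name and parameters are hypothetical, and it assumes the 20 percent applies to the full 200 TB, which the RFP summary does not spell out.

```python
def capacity_plan(total_tb, headquarters_tb, dedicated_percent, max_app_servers, current_tb):
    """Break a usable-capacity target into the shares described in the article.

    Assumption (not stated in the source): the dedicated percentage is taken
    from the total capacity rather than from the headquarters share alone.
    """
    dedicated_tb = total_tb * dedicated_percent / 100  # exclusive to application servers
    shared_tb = total_tb - dedicated_tb                # shared file storage over Ethernet
    return {
        "headquarters_tb": headquarters_tb,
        "other_sites_tb": total_tb - headquarters_tb,
        "dedicated_tb": dedicated_tb,
        "shared_tb": shared_tb,
        # Average only; real allocations per server would vary.
        "avg_dedicated_per_server_tb": dedicated_tb / max_app_servers,
        "growth_factor": total_tb / current_tb,
    }


plan = capacity_plan(total_tb=200, headquarters_tb=100, dedicated_percent=20,
                     max_app_servers=50, current_tb=20)
for name, value in plan.items():
    print(f"{name}: {value}")
```

Under that assumption, the target works out to 40 TB of dedicated capacity, 160 TB of shared file storage, and a tenfold increase over the roughly 20 TB held before the project.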
To Protect and to Scale
Working with GlassHouse, the library’s Information and Communications Technology Division (ICT) released a 17-page RFP in October, with proposals due by December 11, 2006. Among the Mandatory Requirements listed in the Solution Specification:
- Data shall be protected by fully redundant data storage hardware configurations without any single point of failure between data and host systems.
- RAID Level 6 using two parity drives per array is considered essential to protect ATA technology drives. RAID Level 5 using a single parity drive per array may be acceptable for SCSI technology drives with sector sparing and recovery. At least one hot spare disk must be configured available within each disk enclosure on which a failed array will be rebuilt.
- Block-accessible storage shall support all Microsoft-supported server operating systems and all Linux distributions and kernel versions supported by Red Hat and Novell. Storage support for application services shall include all Microsoft-supported versions of Exchange and SQL and all significant SQL-compliant open source database application versions in general use. Support is understood to mean full and unrestricted software interoperability extending to all Intel-compatible and third-party hardware qualified with these operating systems, and the timely remediation of support issues.
- The ability to create scheduled copy-on-write snapshots of either the file-system or block-level storage.
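The trade-off behind the RAID requirement above can be illustrated with a small calculation. This is a rough sketch, not a sizing tool: the array width and drive size are hypothetical values chosen for the example, and hot spares are left out of the count.

```python
# Capacity given up to parity per array, expressed in whole drives.
PARITY_DRIVES = {5: 1, 6: 2}


def raid_usable(drives, drive_tb, level):
    """Return usable capacity and fault tolerance for one RAID 5 or RAID 6 array."""
    parity = PARITY_DRIVES[level]
    if drives <= parity:
        raise ValueError("array needs more drives than its parity count")
    data_drives = drives - parity
    return {
        "usable_tb": data_drives * drive_tb,
        "tolerated_failures": parity,      # drives that can fail without data loss
        "efficiency": data_drives / drives,
    }


# Hypothetical array: 8 drives of 0.5 TB each.
for level in (5, 6):
    print(level, raid_usable(drives=8, drive_tb=0.5, level=level))
```

For this hypothetical eight-drive array, RAID 6 gives up one more drive of capacity than RAID 5 (3.0 TB usable versus 3.5 TB) in exchange for surviving a second drive failure, for instance one that occurs while the array is rebuilding onto a hot spare.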
“We wanted a system that would be simple to manage and scalable,” said David Dinham, ICT manager at the NLS. “The other issues that were important were open standards and that we had the right level of data protection so that we did not have a single point of failure within the architecture, and the ability to retrieve data relatively quickly.”
Hitachi, Brocade and ONStor Hit the Mark
After meeting with several vendors and weighing the pros and cons of each on a scorecard GlassHouse helped the ICT team devise, the NLS selected two Hitachi TagmaStore Adaptable Modular Storage Model AMS1000 systems with two Brocade M4400 Fabric Switches, two Brocade M1620 SAN Routers and two clustered ONStor Bobcat 2240 NAS Gateways.
“One of the features of the Hitachi system is it offers RAID 6, which not many other systems do,” said Colin Morrison, ICT systems and storage officer. “That and the ability to expand it easily were two of the key reasons we went with it.”
“It’s an open, modular, disk-based storage system that will give them the right levels of availability and performance and data integrity,” said Jim Spooner, strategy services team leader at GlassHouse Technologies. “It will also enable them to be protected appropriately and was the most cost effective, in terms of purchase costs, installation costs and cost of ownership going forward. It will allow them to upgrade in a modular way without ever having to effect step changes to meet new requirements.”
The NLS took delivery of the storage system in March. Next month, the ICT team hopes to put the system into full production at the library’s headquarters in Edinburgh and at its disaster recovery/replication site 100 miles away in Glasgow. In limited production so far, the new system appears to be performing as advertised, though the real test will come over the coming months and years, as the NLS aggressively digitizes and stores its vast collections of print and rich media and makes them available online.
Meeting Big Storage Needs
As for advice to other libraries or organizations looking to digitize, store and give users access to terabytes of digital data, Dinham recommends that they first “be very clear on what the business purpose of the system is. And then make sure that you get qualified people, like GlassHouse, to advise you on the technical infrastructure, especially if you don’t have a huge amount of experience with high-capacity storage.”
“You also need to very much understand your information lifecycle,” said Dinham. “The reason for making our system mainly disk-based is people need to pull data pretty quickly off the storage network, but also at our Glasgow site we’re going to be doing consistency checks all the time, to make sure that files are not corrupted, for example, and we need to do that relatively quickly.”
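The article does not say how the NLS implements its consistency checks. One common approach, sketched here purely as an assumption, is to record a checksum for every file and periodically recompute it on the replica. The function names and the choice of SHA-256 are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify(root, manifest):
    """Return files that are missing or whose contents no longer match the manifest."""
    root = Path(root)
    problems = {}
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            problems[relative_path] = "missing"
        elif sha256_of(target) != expected:
            problems[relative_path] = "corrupted"
    return problems
```

A check like this has to read every byte of every file, which is one reason fast random-access disk, rather than slower media, suits a site that runs such checks continuously.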