Hitachi Data Systems
Fred Oh, Sr. Product Marketing Manager, NAS Product Line for Hitachi Data Systems (HDS), says that while the term Big Data may be new, it is an outgrowth of developments dating back a decade.
“Pharmaceutical companies, eDiscovery events, Web scale architectures and other industries/activities have long deployed HPC-like infrastructures affording users the ability to answer big questions from Big Data,” says Oh. “What is different now? Simply, what is possible now comes from infrastructure convergence (e.g., petabyte-scale storage, 4+ node clustered systems, and analytics) that is more efficient and cost effective, meaning more companies can take advantage of Big Data architectures and tools.”
In evaluating a Big Data NAS, Oh recommends looking at certain criteria, including:
- Enterprise-class performance and scalability
- Clustering beyond 2 nodes with a Single Namespace
- Hardware acceleration for network and file sharing protocols
- Large volumes and file system support
- Dynamic provisioning (aka Thin provisioning)
- Object-based file system supporting fast metadata searches
- Policy-based intelligent file tiering and automated migration (internal and external)
- Capacity efficient snapshots and file clones
- Virtual servers to simplify transition and migration from other filers
- Content-aware and integrated with backup and active archiving solutions
- Symmetric active-active storage controllers
- Unified virtualized storage pool supporting block, file and object data types
- Page-based dynamic tiering
But getting the right hardware in place is just one small piece of achieving the promise of Big Data.
“HDS believes Big Data to not be about technologists salivating at a new gold rush,” says Oh, “but about the promise of everyday people interacting with confidence in technologies to answer questions that may require analyzing enormous quantities of data to make their work and, ultimately, society a better place.”
Panasas
Panasas, Inc. of Sunnyvale, CA, makes two 4U rackmount appliances, the ActiveStor 11 and the ActiveStor 12, each containing twenty 2TB or 3TB SATA drives. A single file system can scale up to 6.6 PB and 4 billion objects.
“Big Data requires a new approach to software and storage in order to derive value from highly unstructured data sets,” says Panasas Chief Marketing Officer, Barbara Murphy. “Buyers who consider integrated platforms with software, storage and networking provided within a single, appliance-type system will find them exponentially easier to deploy and manage than DAS or do-it-yourself clusters, with considerable benefits in terms of reliability at big-data scale.”
The ActiveStor appliances use a parallel storage architecture and the Panasas PanFS parallel file system. A single ActiveStor 12 can deliver 1.5 GB/s, and a file system can reach up to 150 GB/s. Rather than using traditional hardware RAID controllers, PanFS performs object-level RAID on the individual files.
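To make the idea of per-file RAID concrete, here is a minimal Python sketch of striping a single file into data units plus one XOR parity unit, then rebuilding a lost unit from the survivors. It is an illustration of the general technique only, not PanFS's actual layout or algorithm; the function names, the 4-byte unit size and the single-parity scheme are all assumptions chosen for brevity.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_file(data: bytes, n_data: int, unit: int = 4):
    """Split one file into stripes of n_data units plus one XOR parity unit.

    Hypothetical helper: the last stripe is zero-padded to full width.
    """
    stripe_bytes = n_data * unit
    stripes = []
    for offset in range(0, len(data), stripe_bytes):
        chunk = data[offset:offset + stripe_bytes].ljust(stripe_bytes, b"\0")
        units = [chunk[i:i + unit] for i in range(0, stripe_bytes, unit)]
        stripes.append(units + [xor_blocks(units)])  # parity is the last unit
    return stripes


def rebuild_unit(stripe, lost_index: int) -> bytes:
    """Recover one missing unit (data or parity) by XOR-ing the survivors."""
    survivors = [u for i, u in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


file_data = b"hello parallel world"
stripes = stripe_file(file_data, n_data=3)
recovered = rebuild_unit(stripes[0], lost_index=1)
print(recovered == stripes[0][1])
```

Because the parity is computed per file rather than per disk, a rebuild in a scheme like this only has to touch the files that actually lost a unit, which is one commonly cited motivation for doing RAID at the object level.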
“Unlike other storage systems that loosely couple parallel file system software like Lustre with legacy block storage arrays, PanFS combines the functions of a parallel file system, volume manager and RAID engine into one holistic platform, capable of serving petabytes of data at incredible speeds from a single file system,” says Murphy. “This is accomplished while avoiding the management burden and reliability problems inherent in most competitive network attached storage systems, making Panasas storage ideal for private cloud computing and dedicated storage environments alike.”
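As a rough sanity check on the Panasas figures quoted above, the short Python sketch below works out how many appliances the stated maximums would imply. It assumes decimal units, raw (pre-RAID) capacity with 3TB drives, and perfectly linear scaling, none of which the vendor states; the function name is hypothetical.

```python
import math

# Figures taken from the article.
DRIVES_PER_APPLIANCE = 20
DRIVE_TB = 3                 # larger of the two drive options
MAX_FS_TB = 6600             # 6.6 PB expressed in decimal TB
APPLIANCE_GB_PER_S = 1.5     # single ActiveStor 12
MAX_FS_GB_PER_S = 150


def implied_appliance_counts():
    """Return (appliances needed for max capacity, for max throughput).

    Assumes raw capacity and linear scaling, which are simplifications.
    """
    raw_tb_per_appliance = DRIVES_PER_APPLIANCE * DRIVE_TB
    for_capacity = math.ceil(MAX_FS_TB / raw_tb_per_appliance)
    for_throughput = math.ceil(MAX_FS_GB_PER_S / APPLIANCE_GB_PER_S)
    return for_capacity, for_throughput


capacity_count, throughput_count = implied_appliance_counts()
print(capacity_count, throughput_count)
```

Under these assumptions both maximums point to a system on the order of a hundred appliances. The two counts need not match exactly, since the published limits may be derived from different configurations.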
SGI
The SGI NAS is a unified storage solution composed of 2U and 4U building blocks containing SSD, SAS and/or SATA drives in sizes and speeds to meet customers’ needs. It is built on ZFS, a 128-bit file system that can hold up to 2^48 entries in a single directory. The file system has no practical limit on storage capacity, but the largest single-namespace file system shipped to date is 85PB.
“It can address 1.84 x 10^19 times more data than 64-bit systems such as NTFS,” says Floyd Christofferson, Director of Storage Product Marketing for SGI in Fremont, California. “The limitations of ZFS are designed to be so large that they would never be encountered.”
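The 1.84 x 10^19 figure in the quote is consistent with comparing a 128-bit address space to a 64-bit one, since 2^128 / 2^64 = 2^64. The Python lines below check that arithmetic; reading the quote as a ratio of address-space sizes is an assumption, not something the vendor spells out.

```python
# Ratio of a 128-bit address space to a 64-bit one (exact integer math).
ratio = 2**128 // 2**64

# The ratio is itself 2^64.
is_two_to_the_64 = ratio == 2**64

# Two-decimal scientific notation, for comparison with the quoted figure.
print(is_two_to_the_64, f"{ratio:.2e}")  # True 1.84e+19
```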
SGI NAS includes full VM integration, even in mixed vendor environments. With support for multiple NAS and SAN protocols, standard features include inline de-duplication and native compression, unlimited snapshots and cloning, unlimited file size, and high-availability support. The NAS can be administered through a browser-based GUI from any desktop or tablet.
“You want to select a NAS solution that enables straight-forward integration into legacy storage environments and ensures that data is not trapped within expensive siloed arrays,” says Christofferson. “The solution must be flexible enough to not only manage continued data expansion but also not lock their organization into one type of storage architecture. Data will only continue to grow, so it’s important to select products that have a strong potential for boosting ROI.”