IBM Builds on Hadoop with New Storage Architecture


At the Supercomputing 2010 conference, IBM pulled back the curtain on a new storage architecture that, according to Big Blue, can double analytics processing speed for big data and the cloud.

Created by scientists at IBM Research – Almaden, the new General Parallel File System-Shared Nothing Cluster (GPFS-SNC) architecture was built on the IBM (NYSE: IBM) GPFS and incorporates the Hadoop Distributed File System (HDFS) to provide high availability through advanced clustering technologies, dynamic file system management and advanced data replication techniques.

The cluster “shares nothing”: in this distributed computing architecture, each node is self-sufficient. GPFS-SNC divides tasks among independent nodes so that no node waits on another.
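
To make the shared-nothing idea concrete, here is a minimal sketch (in Python, purely for illustration; the node count, key names, and hash-partitioning scheme are assumptions, not GPFS-SNC internals). Each node owns a disjoint partition of the data, so a task routed to a node can run to completion without coordinating with its peers.

```python
# Illustrative sketch of a shared-nothing design (not GPFS-SNC code):
# each node owns a disjoint partition of the data, so work assigned to
# one node never blocks on another.
import hashlib

NUM_NODES = 4  # assumed cluster size for illustration

def owner_node(key: str) -> int:
    """Hash-partition a key so exactly one node owns it."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_NODES

# Each node processes only the records it owns; no node waits on another.
records = ["orders/2010/11/a", "orders/2010/11/b", "logs/node7/syslog"]
partitions = {n: [] for n in range(NUM_NODES)}
for r in records:
    partitions[owner_node(r)].append(r)

for node, work in partitions.items():
    print(f"node {node} independently processes {work}")
```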

According to IBM, the GPFS-SNC can “convert terabytes of pure information into actionable insights twice as fast as previously possible.”

In addition, the GPFS-SNC design won the Supercomputing 2010 Storage Challenge, which judges entries on performance, scalability and storage subsystem utilization to determine the most innovative and effective design in high-performance computing.

Prasenjit Sarkar, master inventor, Storage Analytics and Resiliency, IBM Research – Almaden, described the GPFS-SNC as a general purpose file system that allows IBM to compete in “all worlds,” whether it be against Google’s (NASDAQ: GOOG) MapReduce framework, in traditional data warehouse environments against Oracle’s (NASDAQ: ORCL) Exadata Database Machine and EMC’s (NYSE: EMC) Greenplum Data Computing Appliance, or in the cloud.

Sarkar said the GPFS-SNC boasts twice the performance of competing architectures, supports POSIX for backward compatibility, and includes advanced storage features such as caching, replication, backup and recovery, and wide area replication for disaster recovery.
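Because GPFS presents a POSIX-compliant interface, existing applications can use ordinary file I/O against a GPFS mount without modification, which is what the backward-compatibility claim amounts to in practice. The sketch below illustrates the idea; the mount point is a hypothetical example, not a path defined by IBM, and the code falls back to a temporary directory so it runs anywhere.

```python
# GPFS exposes a POSIX interface, so standard file I/O works unchanged.
# On a real cluster you would point at the GPFS mount (e.g. /gpfs/fs1,
# a hypothetical mount point); here we fall back to a temp directory so
# the sketch runs on any machine.
import os
import tempfile

mount = os.environ.get("GPFS_MOUNT", tempfile.mkdtemp())

path = os.path.join(mount, "results.txt")
with open(path, "w") as f:   # ordinary POSIX open/write/close
    f.write("analytics output\n")

with open(path) as f:
    print(f.read(), end="")
```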

“The world is overflowing with petabytes to exabytes of data and the challenge is to store this data efficiently so that it can be accessed quickly at any point in time. This new way of storage partitioning is another step forward on this path as it gives businesses faster time-to-insight without concern for traditional storage limitations,” Sarkar said.

IBM’s GPFS currently serves as the basis for the IBM Scale Out Network Attached Storage (SONAS) platform, which is capable of scaling both capacity and performance while providing parallel access to data and a global name space that can manage billions of files and up to 14.4PB of capacity. It is also used in IBM’s Information Archive and the IBM Smart Business Compute Cloud.

Sarkar did not comment on when or how the GPFS-SNC storage technology will find its way into IBM’s commercially available product portfolio. However, it stands to reason that GPFS-SNC will be used in IBM’s recently announced VISION Cloud initiative, which was formed to develop a new approach to cloud storage in which data is represented by smart objects. Each object includes information describing the content of the data and how the object should be handled, replicated, or preserved.

IBM announced the VISION project last week as a joint research initiative of 15 European partners to develop a so-called “smart cloud storage architecture.” The effort centers on delivering storage services within and across cloud boundaries through a better understanding of what’s inside the data.

The VISION Cloud architecture combines (a) a rich object data model, (b) execution of computations close to the stored content, (c) content-centric access, and (d) full data interoperability.
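
A minimal sketch of the “smart object” concept described above: a data object that carries its content together with metadata describing that content and a policy for how the object should be handled, replicated, or preserved. The field names and policy keys are assumptions chosen for illustration, not the VISION Cloud data model.

```python
# Sketch of a "smart object": content plus self-describing metadata and
# a handling policy. Field names are illustrative assumptions, not the
# actual VISION Cloud object model.
from dataclasses import dataclass, field

@dataclass
class SmartObject:
    name: str
    content: bytes
    metadata: dict = field(default_factory=dict)  # describes the content
    policy: dict = field(default_factory=dict)    # replication/retention rules

obj = SmartObject(
    name="broadcast/segment-001",
    content=b"...",
    metadata={"type": "video", "language": "de"},
    policy={"replicas": 3, "preserve_years": 10},
)
print(obj.metadata, obj.policy)
```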

The VISION Cloud initiative will be spearheaded by scientists at IBM Research in Haifa, Israel, and supported by partners including SAP AG, Siemens Corporate Technology, Engineering, ITRicity, Telefónica Investigación y Desarrollo, Orange Labs, Telenor, RAI, Deutsche Welle, and the SNIA Europe standards organization. The National Technical University of Athens, Umeå University, the Swedish Institute of Computer Science, and the University of Messina will also contribute to the effort.


Kevin Komiega is an Enterprise Storage Forum contributor.
