Case Study: Linux Can, Linux SAN

With such attractions as lower costs and flexibility, it was only a matter of time before the success of Linux in the server sector translated into broader application within the storage market. This doesn’t mean there aren’t still doubts and questions about its viability as a storage platform. On the other hand, when a company that boasts the fourth largest commercial supercomputer system in the world successfully deploys a Storage Area Network (SAN) using the Linux operating system, it’s time to take a closer look.

The company in question is NuTec Energy, serving the oil and gas industry with seismic imaging services. Based in Houston, Texas, NuTec employs 35 staff. In early 2000, the company struck a deal with IBM to develop a massively parallel supercomputing system capable of dealing with the ever-increasing demands of seismic signal processing for oil and gas industry-based applications.

The storage system initially consisted of 3,000 Power 3/3+ CPUs with AIX on each server, and with each CPU running its own analysis. The Network File System (NFS) file server used two IBM ‘Shark’ units connected to three B80 servers and gave all CPUs shared file access. By 2003, however, the system was not keeping up with the demands placed upon it, so a project was established to specify a replacement.

Project Goals

According to Sampath Gajawada, manager of software development at NuTec Energy, “The target was a super-scalable SAN — a high-performance, single image storage environment using Intel, Linux, Fibre Channel, and Ethernet.” He defined several key objectives for the SAN:

  • Software tuned to be latency tolerant and massively parallel, offering buffered asynchronous communication and I/O
  • High I/O bandwidth (>500 processing nodes)
  • High computing power (processing power >2 Teraflops)
  • Large, flat file system (10-100TB), with easy storage management
  • Cost effective, solid price/performance balance, and scalable at low incremental costs

The main issues with the incumbent UNIX system were the high cost of the proprietary software and its associated support and management, computing power and bandwidth that were barely adequate for some of the processing requirements, and an NFS bottleneck on storage access.

“The existing system just couldn’t cope with the demands of our Depth-Domain Analysis and Time-Domain Analysis,” said Gajawada. “We had reached the stage where business requirements were forcing us to reconsider our entire system. We looked at all the alternatives and settled on a combination of Intel and Linux.”

The Switch

The decision to favor Intel/Linux enabled NuTec to create a lower cost structure with industry-standard hardware and to take advantage of the improved floating-point (FP) performance of Intel Pentium 4 processors, which would prove especially beneficial for NuTec’s intensive graphical image processing requirements. The Linux route would also eliminate the NFS bottleneck and provide data sharing at SAN performance via a cluster file system (CFS) on the SAN that can scale to hundreds of nodes with minimal management.

NuTec adopted Minneapolis-based Sistina Software’s GFS (Global File System) Linux cluster file system. Its cluster nodes physically share storage over Fibre Channel or shared SCSI, and while each node thinks the file system is local, file access is synchronized across the entire cluster.
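The idea of nodes that each treat shared storage as local while access is synchronized cluster-wide can be illustrated with a toy simulation. This is only a sketch under loud assumptions: threads stand in for cluster nodes, an in-memory dictionary stands in for the shared Fibre Channel storage, and a single `threading.Lock` stands in for whatever lock manager the file system actually uses. The names `SharedVolume`, `node_worker`, `run_cluster`, and the file path are hypothetical and are not part of GFS.

```python
import threading


class SharedVolume:
    """Toy stand-in for storage that every node can reach directly."""

    def __init__(self):
        self._files = {}
        # Hypothetical cluster-wide lock; a real cluster file system
        # uses a distributed lock manager, not an in-process lock.
        self._lock = threading.Lock()

    def append(self, path, record):
        # Each node calls this as if the file were local. The lock
        # serializes the read-modify-write so no update is lost.
        with self._lock:
            current = self._files.get(path, [])
            self._files[path] = current + [record]

    def read(self, path):
        with self._lock:
            return list(self._files.get(path, []))


def node_worker(volume, node_id, writes):
    # Simulated node: appends its own records to a file shared by all nodes.
    for i in range(writes):
        volume.append("/seismic/trace.log", (node_id, i))


def run_cluster(num_nodes=8, writes_per_node=100):
    volume = SharedVolume()
    threads = [
        threading.Thread(target=node_worker, args=(volume, n, writes_per_node))
        for n in range(num_nodes)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return volume.read("/seismic/trace.log")


if __name__ == "__main__":
    records = run_cluster()
    print(len(records), "records written by", len({n for n, _ in records}), "nodes")
```

Every simulated node writes through the same interface, and the shared file ends up with all records from all nodes. Removing the lock from `append` would allow concurrent updates to overwrite one another, which is the problem synchronized file access is meant to prevent.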

In effect, GFS can pool storage onto cheap, efficient machines. NuTec’s system resides on a Fibre Channel SAN infrastructure from LSI Logic for high I/O performance. Processing consists of 350 dual processor P4 based nodes, providing 750 Linux-running CPUs, with each box running four times faster than the previous AIX processors.

The following table, prepared by NuTec, compares the two systems:

Metric: Performance

UNIX/IBM:

  • Bottlenecked by NFS
  • OK for single node
  • Performance cut by more than half at large scale

Linux/Intel:

  • No bottlenecks
  • I/O at full SAN speeds
  • Performance scales linearly to hundreds of nodes

Metric: Cost

UNIX/IBM:

  • Costly proprietary hardware
  • Large footprint

Linux/Intel:

  • Low-cost, industry-standard servers
  • 1/10th the footprint

Metric: Management

UNIX/IBM:

  • Large administrative effort: many nodes to maintain

Linux/Intel:

  • Fewer nodes to maintain
  • Number of admins cut from four to two

One of the main challenges NuTec faced in the changeover was porting its imaging software from UNIX to Linux. Though there were risks involved, the company saw the move as an opportunity to reduce costs and management overhead, and it completed the transition in just four weeks.

The result is measurable cost savings. The headline figures are 50 percent fewer administrators and a 90 percent reduction in the requisite data center space, down from 10,000 to just 1,000 sq ft.
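The headline percentages follow directly from the before-and-after figures quoted in the article (four administrators cut to two, 10,000 sq ft cut to 1,000 sq ft). The short sketch below reproduces that arithmetic; the helper name `percent_reduction` is illustrative only and does not come from NuTec.

```python
def percent_reduction(before, after):
    """Return the reduction from `before` to `after` as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) * 100 / before


# Figures quoted in the article.
admin_cut = percent_reduction(4, 2)            # administrators: four to two
space_cut = percent_reduction(10_000, 1_000)   # data center space in sq ft

print(f"Administrators: {admin_cut:.0f} percent fewer")
print(f"Floor space: {space_cut:.0f} percent smaller")
```

The 84 percent overall savings quoted by Gajawada cannot be rechecked the same way, because the article does not give the underlying hardware and software costs.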

“The bottom line is overall cost savings of 84 percent, including hardware and software,” says Gajawada. “And, as a bonus, a higher adoption of Linux elsewhere in the company as a direct result of this implementation.”

Feature courtesy of Enterprise IT Planet.

Drew Robb

Drew Robb is a contributing writer for Datamation, Enterprise Storage Forum, eSecurity Planet, Channel Insider, and eWeek. He has been reporting on all areas of IT for more than 25 years. He has a degree from the University of Strathclyde UK (USUK), and lives in the Tampa Bay area of Florida.
