Saving Big Money With Open Source Storage
In these times of economic woe, many companies are looking for ways to trim their IT budgets while still getting the job done. That seems to be playing into the hands of a couple of storage trends: the choice of iSCSI rather than Fibre Channel (FC), and the growing interest in open source storage software.
One firm embracing both is Digitar Inc. of Boise, Idaho. In essence, Digitar is an e-mail processing company. Its systems take care of spam, virus and other malware issues for customers, who are then sent cleansed e-mail. As such, its systems hold upwards of 50TB of data housed primarily on Sun Microsystems (NASDAQ: JAVA) hardware using free OpenSolaris software and Linux.
"Our entire operation is based on open source, and the financial perspective is the biggest reason," said Jason Williams, COO and CTO of Digitar. "But there are performance benefits too."
However, not everything in the operation is open source. In particular, the company's own e-mail processing software is proprietary. As it encapsulates the company's primary value, that is understandable.
"It is up to each area to decide if its software should be free or not," said Williams. "But as we are gaining the value of open source in many ways, we have a responsibility to contribute to the open source community as a whole."
Digitar has been using Novell's (NASDAQ: NOVL) SUSE Linux software on its HP (NYSE: HPQ) servers since it opened its doors. However, the company tried the Linux storage subsystem a few years ago with unsatisfactory results.
"At that time, we found the Linux storage subsystem to lack reliability and the Linux Volume Manager (LVM) to be slow," said Williams. "Back then, I wasn't familiar with OpenSolaris and I must confess that I was anti-Solaris, as I had found it difficult to use while at college. I preferred the Solaris kernel but believed Linux to be more user-friendly."
He had what he describes as a "flaky" array at that time, which was spitting SCSI I/O errors. As Linux ignored them, this led to lots of database corruption. The only reason Digitar even considered OpenSolaris as an option was because Sun offered it for free.
"If it hadn't been free, we would never have looked at, yet it made our I/O and corruption problems go away," said Williams. "This gave us the time we needed to get a new LSI array in."
Performance, Cost Benefits
Tests showed a performance drop of 40 percent when Digitar used LVM compared to a drop of 15 percent using OpenSolaris. On the cost side, Williams sees the irony in his unwillingness to pay around $1,000 per server for Solaris before it was open source. Yet the cost of a Linux server license from one of the Linux vendors would have been more than that, sometimes more than twice the price. With OpenSolaris being free, the savings were significant when multiplied by over one hundred servers, he added.
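To make the licensing arithmetic concrete, here is a rough sketch in Python. The per-server prices are the approximate figures quoted above (around $1,000, and about twice that at the high end), and the server count of 100 is an assumed floor, since the article only says "over one hundred." The names are illustrative and the result is a ballpark range, not Digitar's actual budget.

```python
# Rough licensing estimate using the approximate figures quoted in the article.
# SERVERS is an assumed floor: the article says "over one hundred" servers.
SERVERS = 100
LICENSE_LOW = 1_000   # roughly the quoted per-server price, in dollars
LICENSE_HIGH = 2_000  # "sometimes more than twice the price"


def fleet_license_cost(servers: int, per_server: int) -> int:
    """Total license cost if every server needed a paid OS license."""
    return servers * per_server


low_estimate = fleet_license_cost(SERVERS, LICENSE_LOW)
high_estimate = fleet_license_cost(SERVERS, LICENSE_HIGH)

print(f"Avoided license cost: ${low_estimate:,} to ${high_estimate:,}")
```

Under those assumptions, a free operating system avoids a six-figure licensing bill across the fleet.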
"An OS has largely become a commodity item," said Williams. "I'd rather put those funds into software development or hardware."
OpenSolaris is now used in about 60 percent of the organization's servers, with Linux running on 40 percent. On the hardware side, Digitar is using Sun X4500 boxes for about half its storage, with Sun X4240 servers and Sun 7000 Unified Storage holding another 20 percent. The rest sits on traditional storage arrays. Over time, Williams said the company will phase in more X4240s.
Commodity Hardware, SATA Drives
When it comes to cost cutting, he's also a fan of iSCSI over FC, and x86 gear running SATA drives compared to high-end disk arrays. To his mind, it is better to use SATA drives, as you get more IOPS per dollar.
He gave the example of 15K 146GB SAS drives costing $180 and 7.2K 250GB SATA disks from the same vendor costing $55: you can buy roughly 3.2 SATA disks for every SAS disk. Three of those SATA disks provide a combined 240 IOPS, compared to 175 for the single SAS drive. Thus, higher RPM doesn't necessarily translate into higher performance for the money.
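The comparison can be worked through in a few lines of Python. The prices and IOPS totals are the ones Williams quoted; the per-disk SATA figure of 80 IOPS is inferred by dividing his combined 240 by three, and the variable names are illustrative.

```python
# Prices and IOPS as quoted in the article. The per-disk SATA figure is
# inferred (240 combined IOPS / 3 disks), not stated directly.
SAS_PRICE, SAS_IOPS = 180.0, 175.0
SATA_PRICE, SATA_IOPS = 55.0, 240.0 / 3


def iops_per_dollar(iops: float, price: float) -> float:
    """Random I/O operations per second bought by each dollar spent."""
    return iops / price


sas_value = iops_per_dollar(SAS_IOPS, SAS_PRICE)
sata_value = iops_per_dollar(SATA_IOPS, SATA_PRICE)

# Whole SATA disks that fit in the budget of one SAS disk.
whole_sata_disks = int(SAS_PRICE // SATA_PRICE)
combined_sata_iops = whole_sata_disks * SATA_IOPS

print(f"SAS:  {sas_value:.2f} IOPS per dollar")
print(f"SATA: {sata_value:.2f} IOPS per dollar")
print(f"{whole_sata_disks} SATA disks: {combined_sata_iops:.0f} IOPS vs {SAS_IOPS:.0f} for one SAS disk")
```

On these numbers SATA delivers about 1.45 IOPS per dollar against roughly 0.97 for SAS, though the sketch ignores drive bays, power, and latency, all of which a real sizing exercise would have to weigh.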
Williams also questioned using proprietary hardware from the big storage vendors compared to buying x86 boxes from the likes of Sun. One proprietary array, for instance, was priced at $150,000 compared to $35,000 for the X4500. The latter offered more processing power and 24TB of disk storage, though only about half the memory, while the former didn't cover any disk trays at all.
A Vote for Sun ZFS
He said OpenSolaris' ZFS file system is the best and cheapest way to mirror data across disks, enabling good redundancy and reliability on JBODs.
"Since ZFS came out, it has saved our behind more than once," said Williams. "The combo of OpenSolaris and ZFS is such that I would now be quite willing to pay for what it offers."
ZFS provides copy-on-write so there is no need to buy additional snapshot functionality, block checksums to detect corruption, an integrated file system/volume manager, write bundling and dynamic striping. Williams gave the example of a database corruption that had resulted in vendor finger pointing. The checksum capabilities highlighted the fact that the file system was clean. This, he said, was the only way to be sure that the actual error lay in the DB itself.
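The idea behind block checksums can be illustrated with a toy example. This Python sketch is not ZFS code and does not reflect its on-disk format; the class name, the use of SHA-256, and the simulated corruption are all assumptions made for illustration. It only shows the principle: a checksum recorded at write time and kept apart from the data lets the storage layer notice when a block has changed behind its back.

```python
import hashlib


class ChecksummedStore:
    """Toy block store that verifies a checksum on every read (not ZFS itself)."""

    def __init__(self):
        self._blocks = {}     # block id -> raw bytes, standing in for the disk
        self._checksums = {}  # block id -> checksum kept apart from the data

    def write(self, block_id: int, data: bytes) -> None:
        self._blocks[block_id] = data
        self._checksums[block_id] = hashlib.sha256(data).hexdigest()

    def read(self, block_id: int) -> bytes:
        data = self._blocks[block_id]
        if hashlib.sha256(data).hexdigest() != self._checksums[block_id]:
            raise IOError(f"checksum mismatch on block {block_id}")
        return data

    def corrupt(self, block_id: int) -> None:
        """Simulate silent disk corruption by flipping one bit of the stored data."""
        damaged = bytearray(self._blocks[block_id])
        damaged[0] ^= 0x01
        self._blocks[block_id] = bytes(damaged)


store = ChecksummedStore()
store.write(0, b"database page")
store.read(0)          # passes verification, so the storage layer is clean
store.corrupt(0)
try:
    store.read(0)
except IOError as err:
    print(err)         # the damaged block is reported instead of returned
```

When every read passes verification, as in Williams' case, the storage layer can be ruled out and attention shifts to the application that wrote the data.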
Saving with Solid State
Solid state drives (SSDs) are not something you normally mention in the same breath as cost savings. They cost orders of magnitude more than SATA drives and are still in the very early adopter stage. Yet Williams includes them in his rundown of how to get far more from a storage environment.
Digitar uses Zeus-IOPS SSDs by STEC inside three of its four Sun X4500 servers in a hybrid arrangement. Williams swaps out one SATA disk in each server for a STEC drive as a way of getting far more bang for his memory buck: the SSDs are used in place of more RAM, as opposed to being a storage space for data.
While the price may be high, he said he can have up to 640GB of SSD operating as a memory cache, compared to a maximum of 128GB of RAM. According to his numbers, the price works out to $50 per GB for RAM and 25 cents per GB for SSD.
"There is no other way on the market to get 10X performance for $2,000," said Williams. "We continue to use cheap SATA disks in the remainder of the server for volume storage."
While those servers use write-optimized SSD, Williams also uses read-optimized flash in his Sun 7000 Series arrays for data analysis tasks.
"We are currently using SSD to accelerate disk, not to replace it," said Williams. "As the price drops, though, I'm sure they will be used more frequently in place of high-end disks."