Open source storage has come a long way in the last few years. There are good open source offerings on the backup, mirroring, file system, NAS and storage virtualization side. It is possible to cobble together an awful lot of disks and run them at high performance without the need for state-of-the-art hardware. Even companies known for proprietary offerings like EMC (NYSE: EMC) are on board.
"EMC most often encounters open source in the form of a Linux-based host connected to our storage products," said Jay Krone, senior director of storage platforms at EMC. "Customers are purchasing Intel- or AMD-based servers and putting Linux on them to take best advantage of volume pricing on the hardware and minimal-to-no licensing costs on the software."
Krone said customers tend to add open source applications, like the Apache Web server, or proprietary products, like Oracle (NASDAQ: ORCL) databases, to those Linux-based servers to address a wide spectrum of business problems. To meet this trend, most EMC storage hardware and software products have been adapted to run in a Linux environment. For example, EMC's PowerPath family is available for Linux.
Despite the recognition by EMC and other data storage vendors, opinions differ on how far open source storage has come.
"I still wouldn't say that there were a lot of open source storage apps," said Jason Williams, CTO at Digitar of Boise, Idaho, a company that makes heavy use of Linux and Sun (NASDAQ: JAVA) open source software.
Greg Schulz, senior analyst and founder of StorageIO Group, is more upbeat about the state of open source storage offerings.
"There is a wide variety of open source storage solutions and applications from different sources, ranging from volume managers, iSCSI and NAS stacks, file systems, clustered file systems, object-based storage solutions, dedupe and compression, among others, not to mention all of the proprietary or commercial solutions that may leverage open source technology embedded into turnkey solutions and products," said Schulz. "Of traditional server and storage vendors, Sun is probably the most notable and vocal around open source storage, along with many smaller startup vendors."
Sun's "Amber Road" project, now known as Unified Storage Systems (USS) or the Sun Storage 7000 series, is built around preinstalled OpenSolaris and ZFS on x86 hardware. These units support both file and block data protocols, thin provisioning, replication, mirroring, snapshots, antivirus and analytics. An HPC version adds Linux to the mix too.
"Amber Road is essentially a NAS system that integrates inexpensive servers with open source software in an easy-to-use appliance," said David Trachy, a principal engineer at Sun. "The whole point is to get around the premium you have to pay for proprietary disk systems."
ZFS, in particular, is garnering good reviews. Offered free with OpenSolaris, it provides a high level of data integrity, as well as mirroring between sites. According to Trachy, it can be used as the basis for huge data repositories. It is already being picked up by partners like greenBytes and Nexenta Systems to build storage systems.
"Startups are using ZFS and combining it with JBODs to create different products and appliances," said Trachy. "Missing in Sun's open source lineup is FC [Fibre Channel] block-level storage and pNFS, but these will be added over time."
In addition, Trachy notes that ZFS integrates well with solid state drives (SSDs), which are beginning to gain traction in the storage world. Williams, for example, swapped SATA drives inside Sun X4500 servers for ZeusIOPS SSDs from STEC (NASDAQ: STEC) to function as a high capacity (up to 640 GB) memory cache. SATA remains his platform of choice for volume data storage.
Competition for ZFS comes from the likes of Red Hat's (NYSE: RHT) Global File System (GFS), the Linux Logical Volume Manager (LVM) and file systems like ext4 and Btrfs. GFS was first developed at the University of Minnesota as a means of offering high performance and data sharing capabilities for the Linux platform, as well as storage virtualization. While GFS is controlled by Red Hat, LVM comes in a wide range of versions in the open source community.