Terabytes of Storage for Pennies a Gigabyte


One online backup provider took a hard look at data storage costs and decided to build its own storage servers for just pennies a gigabyte. The good news is the company has decided to share what it learned.

In a recent blog posting, Tim Nufire, vice president of engineering at Backblaze, explained how the company is able to give consumers unlimited storage for just $5 a month.

After pricing options from Dell (NASDAQ: DELL), Sun (NASDAQ: JAVA), NetApp (NASDAQ: NTAP), Amazon (NASDAQ: AMZN) S3 and EMC (NYSE: EMC), the company found it could build its own storage servers for a fraction of the cost.

“As we investigated these traditional off-the-shelf solutions, we became increasingly disillusioned by the expense,” wrote Nufire. “When you strip away the marketing terms and fancy logos from any storage solution, data ends up on a hard drive. But when we priced various off-the-shelf solutions, the cost was 10 times as much (or more) than the raw hard drives.”

Backblaze’s solution — a 4U rack-mounted Linux-based server that contains 67 terabytes at a cost of $7,867, or a little more than 11 cents a gigabyte — adds just 44 percent to the cost of the hard drives.
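The arithmetic behind those figures can be checked with a short sketch. The dollar amount, capacity, and 44 percent overhead come from the article. The decimal conversion (1 TB = 1,000 GB) and the function names are assumptions made for illustration, and the drive spend is backed out from the stated overhead rather than quoted from Backblaze.

```python
# Figures quoted in the article.
POD_COST_USD = 7867       # total cost of one storage pod
POD_CAPACITY_TB = 67      # raw capacity of one storage pod
DRIVE_OVERHEAD = 0.44     # pod cost is 44 percent above the cost of the drives

# Assumption: decimal units (1 TB = 1,000 GB); the article does not say.
GB_PER_TB = 1000


def cost_per_gb(total_cost_usd, capacity_tb):
    """Dollars per gigabyte of raw capacity."""
    return total_cost_usd / (capacity_tb * GB_PER_TB)


def implied_drive_cost(total_cost_usd, overhead):
    """Back out the drive spend from: total = drives * (1 + overhead)."""
    return total_cost_usd / (1 + overhead)


pod_cost_per_gb = cost_per_gb(POD_COST_USD, POD_CAPACITY_TB)
drives_usd = implied_drive_cost(POD_COST_USD, DRIVE_OVERHEAD)

print(f"{pod_cost_per_gb * 100:.1f} cents per GB")            # 11.7 cents per GB
print(f"about ${drives_usd:,.0f} of the total is hard drives")  # about $5,463
```

The result, roughly 11.7 cents, matches the article's "a little more than 11 cents a gigabyte."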

A Backblaze Storage Pod is made up of a custom metal case containing one Intel (NASDAQ: INTC) motherboard with four SATA cards plugged into it. Nine SATA cables run from the cards to nine port multiplier backplanes, each of which has five hard drives plugged directly into it, for 45 hard drives in all. The port multiplier backplanes use Silicon Image (NASDAQ: SIMG) chips.
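As a rough illustration of that layout, the sketch below enumerates drive positions by backplane and slot. The backplane and drive counts come from the article. The zero-based numbering and the function names are hypothetical, and the sketch does not model how the nine cables are split across the four SATA cards, since the article does not say.

```python
from itertools import product

# Counts taken from the article.
BACKPLANES = 9
DRIVES_PER_BACKPLANE = 5


def drive_slots(backplanes=BACKPLANES, per_backplane=DRIVES_PER_BACKPLANE):
    """Return every (backplane, slot) pair; zero-based labels are an assumption."""
    return list(product(range(backplanes), range(per_backplane)))


def locate(drive_index, per_backplane=DRIVES_PER_BACKPLANE):
    """Map a flat drive index to its (backplane, slot) position."""
    return divmod(drive_index, per_backplane)


slots = drive_slots()
print(len(slots))    # 45
print(locate(44))    # (8, 4): the last slot on the last backplane
```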

The pods run 64-bit Debian 4 Linux. Backblaze uses the fdisk tool to create one partition per drive, then groups 15 hard drives into a single RAID-6 volume, with two drives' worth of capacity in each volume devoted to parity. The RAID-6 volume is created with the mdadm utility. On top of that is the JFS file system, “and the only access we then allow to this totally self-contained storage building block is through HTTPS running custom Backblaze application layer logic in Apache Tomcat 5.5,” Nufire wrote.
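A minimal sketch of what that layout implies for capacity, and of what an mdadm invocation for one such volume might look like. The drive counts and raw capacity come from the article. The three-volume figure is derived (45 drives divided by 15 per volume), not quoted. The device paths are placeholders, and the exact options Backblaze passed to mdadm are not given in the article, so the command is built but never run.

```python
TOTAL_DRIVES = 45         # from the article
DRIVES_PER_VOLUME = 15    # from the article
PARITY_PER_VOLUME = 2     # RAID-6 reserves two drives' worth of parity
RAW_CAPACITY_TB = 67      # from the article


def raid6_layout(total_drives, drives_per_volume, parity=PARITY_PER_VOLUME):
    """Return (number of volumes, total data-drive equivalents)."""
    if total_drives % drives_per_volume:
        raise ValueError("drives do not divide evenly into volumes")
    volumes = total_drives // drives_per_volume
    return volumes, volumes * (drives_per_volume - parity)


def mdadm_create_argv(md_device, partitions):
    """Build (but do not execute) an mdadm command for one RAID-6 volume."""
    return ["mdadm", "--create", md_device, "--level=6",
            f"--raid-devices={len(partitions)}", *partitions]


volumes, data_drives = raid6_layout(TOTAL_DRIVES, DRIVES_PER_VOLUME)
# Capacity left after parity, before any file system overhead.
usable_tb = RAW_CAPACITY_TB * data_drives / TOTAL_DRIVES

# Placeholder partition paths; real device names will differ per machine.
example_partitions = [f"/dev/disk/by-id/EXAMPLE-{i:02d}-part1"
                      for i in range(DRIVES_PER_VOLUME)]
argv = mdadm_create_argv("/dev/md0", example_partitions)

print(volumes, data_drives)     # 3 39
print(f"{usable_tb:.1f} TB")    # 58.1 TB
print(" ".join(argv[:5]))       # mdadm --create /dev/md0 --level=6 --raid-devices=15
```

Under these assumptions, parity consumes 6 of the 45 drives, leaving about 58 of the 67 raw terabytes for data.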

The appliances are accessed through HTTPS — no iSCSI, Fibre Channel or NFS — because “None of those technologies scales as cheaply, reliably, goes as big, nor can be managed as easily as standalone pods with their own IP address waiting for requests on HTTPS,” wrote Nufire.

Nufire didn’t reveal anything about Backblaze’s proprietary software “that de-duplicates and chops data into blocks; encrypts and transfers it for backup; reassembles, decrypts, re-duplicates, and packages the data for recovery; and monitors and manages the entire cloud storage system. This process is proprietary technology that we have developed over the years.”

“Backblaze Storage Pods are building blocks upon which a larger system can be organized that doesn’t allow for a single point of failure,” said Nufire. “Each pod in itself is just a big chunk of raw storage for an inexpensive price; it is not a ‘solution’ in itself.”

And that’s where storage vendors like EMC and NetApp begin to earn their money.

Backblaze said it has no plans to sell its hardware or software, and it asked that anyone using the design credit the company and offer feedback. Nufire also thanked a long list of people who helped develop the storage system.

Interest in Backblaze Storage Grows

Despite having no plans to enter the storage hardware business, Backblaze co-founder and CEO Gleb Budman said the company has had a “flood” of interest since the blog post was published on Sept. 1.

“We’re trying to figure out how to help them right now without completely getting diverted from our core business of online backup,” Budman told Enterprise Storage Forum.

“We also believe that by providing the core building block of inexpensive storage, there are a ton of people out there that can innovate on top of them,” he said. “At the moment, I feel like we have a tiger by the tail, as hundreds of people have said they’re going to build their own, people are talking about how to tweak them for their own purposes, and we’ve had a ton of tech folks want to come down and chat about the design. I guess we’ll see where this goes.”


Paul Shread
