eHarmony Finds IP Storage Love
Conventional wisdom says that when you need lots of storage capacity, NAS just doesn't cut it. You have to have a Fibre Channel SAN, they say, for larger capacities. But implementing an FC SAN typically requires plenty of know-how, several storage administrators and a lot of expensive gear.
That's why eHarmony.com of Pasadena, Calif., is moving away from Fibre Channel. Even in the face of monumental storage growth — from 1 TB two years ago to more than 100 TB today — it has largely avoided FC SANs. And just like the couples featured in its TV ads, it found love in the form of a NAS and iSCSI architecture.
"When we went this route a couple of years ago, we had a few people working in storage," said Mark Douglas, vice president of technology at eHarmony. "Now we have zero dedicated people managing storage and a whole lot more capacity. Yet we have had no storage downtime in two years."
Love is big business these days. From its beginnings in 2000, eHarmony has blossomed like a proverbial spring romance. It is the number one online relationship service, with more than 14 million registered users and 10,000 to 15,000 new members every day. This amounts to 255 million page views and 45 million e-mails a month, a transaction volume of 400 to 1,000 per second. It spans more than 200 countries and has branched out from its roots as a means of bringing singles together. Now it has added a bookstore as well as services built around preparing for and repairing a marriage.
eHarmony's rise to fame, however, wasn't without a few technological "lovers' tiffs" along the way. Despite rapid expansion, in April 2005 its storage environment was as basic as it could possibly be: just 1 TB of data spread across three direct-attached storage (DAS) arrays.
"We outgrew these arrays when we could no longer fit drives into the boxes," said Douglas. "I don't have anything against SAN, but I didn't want to have to gain the expertise. So I chose iSCSI and NAS instead."
eHarmony's environment consists of multiple elements, including direct attached storage arrays, NAS gateways, de-duplication gateways and high-throughput network switches.
Four storage arrays from 3PARdata Inc. manage more than 100 TB. These arrays have automated tuning, which dispenses with the need for volume layout planning. Data is spread across every drive in the array in order to heighten performance. Instead of a bottleneck at a few drives that are assigned as the destination for specific information, the load is balanced across all drives. Thus reads and writes are more rapid.
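The load-balancing idea described above can be illustrated with a small sketch. It is not 3PAR's actual placement algorithm, which the article does not describe. It assumes simple round-robin placement, and the chunk and drive counts are hypothetical values chosen only for illustration.

```python
from collections import Counter

def place_chunks(num_chunks: int, num_drives: int) -> Counter:
    """Round-robin placement (an assumption): chunk i lands on drive i % num_drives."""
    return Counter(chunk % num_drives for chunk in range(num_chunks))

def busiest_drive_share(placement: Counter, num_chunks: int) -> float:
    """Fraction of all chunks held by the most heavily loaded drive."""
    return max(placement.values()) / num_chunks

# Hypothetical numbers: the same 1,000 chunks placed on 4 drives vs. 40 drives.
NUM_CHUNKS = 1000
narrow = place_chunks(NUM_CHUNKS, 4)   # volume pinned to a few assigned drives
wide = place_chunks(NUM_CHUNKS, 40)    # volume spread across every drive

print(f"busiest drive, 4 drives:  {busiest_drive_share(narrow, NUM_CHUNKS):.1%}")
print(f"busiest drive, 40 drives: {busiest_drive_share(wide, NUM_CHUNKS):.1%}")
```

With the data spread over ten times as many drives, each drive services roughly a tenth of the requests, which is the effect the article attributes to spreading data across the whole array.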
Each 3PAR array has four storage controllers, with 96 Fibre ports inside each one. At 25,000 IOPS per controller, this adds up to 100,000 IOPS per array, more than enough to meet the performance needs of the corporate database. Douglas notes that the company to date has not exceeded 60,000 IOPS at its highest peak hours.
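The arithmetic behind those figures can be checked in a few lines. The sketch assumes the 60,000 IOPS peak is measured against a single array's rating, which the article does not state explicitly. The variable names are illustrative only.

```python
# Figures quoted in the article.
CONTROLLERS_PER_ARRAY = 4
IOPS_PER_CONTROLLER = 25_000
PEAK_OBSERVED_IOPS = 60_000  # assumption: compared against one array's rating

array_iops = CONTROLLERS_PER_ARRAY * IOPS_PER_CONTROLLER  # rated IOPS per array
headroom = array_iops - PEAK_OBSERVED_IOPS                 # unused IOPS at peak
utilization = PEAK_OBSERVED_IOPS / array_iops              # share of rating used

print(f"rated IOPS per array: {array_iops:,}")
print(f"headroom at peak:     {headroom:,}")
print(f"peak utilization:     {utilization:.0%}")
```

Under that assumption, the busiest hours use about three-fifths of one array's rated performance.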
"We have implemented a direct connection to these storage arrays from our databases," he said. "The only work involved now is an hour or two per month to create the volumes required."
NAS Devotion, Loveless Tape
Redundant NAS gateways are used to provide shared storage access to each computer center in the data center. These offer a throughput of 300 MB/second per gateway. They utilize clustering to provide failover capabilities. The NAS gateway technology is supplied by OnStor Inc.
"This equipment provides access to most users," said Douglas. "Most don't need the speeds that block-level storage provides."
The online service provider, then, has evolved a series of storage classes depending on need. Its databases are in the highest classification and merit the fastest Fibre connections on its disk arrays. Further data is left to reside on cheaper disk with slower connections.
"Currently, our arrays support Fibre Channel drives and Fibre ATA drives," said Douglas. "Eventually, I want to move away from Fibre and have a full iSCSI environment."
While NAS and iSCSI get plenty of devotion at eHarmony, tape remains loveless and scorned. The company uses de-duplication technology as a means of removing tape from its storage environment. A series of de-duplication gateways placed in front of the NAS gateways provides data reduction ratios of 20 to 1. Result: little need for tape and its associated backup hassles.
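A ratio of this kind is plausible for backups because successive backup copies are largely identical. The sketch below is a generic illustration of chunk-level de-duplication, not the method used by eHarmony's gateways, which the article does not describe. It assumes fixed-size chunks identified by SHA-256 digest, and the sample data is synthetic.

```python
import hashlib

def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Logical bytes divided by bytes kept after storing each unique chunk once."""
    if not data:
        return 1.0
    unique_chunks = {}  # digest -> chunk length, one entry per distinct chunk
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        unique_chunks[hashlib.sha256(chunk).digest()] = len(chunk)
    stored_bytes = sum(unique_chunks.values())
    return len(data) / stored_bytes

# Synthetic example: 20 backup copies of the same 8 KiB of unchanged data.
one_backup = b"A" * 4096 + b"B" * 4096
backups = one_backup * 20

print(f"de-duplication ratio: {dedup_ratio(backups):.0f} to 1")
```

Real backup streams change a little between runs, so production systems typically see lower and more variable ratios than this idealized case.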
"This gives us months of online backups instead of weeks of tapes," said Douglas. "Thus, we are able to store all the data we need on disk rather than tape."
eHarmony is fortunate in that it doesn't have much in the way of archiving needs. Since it is not under the gun of any serious industry regulation — love, for now, remains relatively unregulated — it can keep just about everything it needs on disk. And with the existing system able to support growth of up to 1 PB, it can retain this approach for at least the next couple of years.
Network switches by Force10 Networks utilize iSCSI to extend storage to any machine over IP, providing a capacity of up to 630 ports and 900 Gbps.
"iSCSI certainly isn't slow," said Douglas. "Its performance meets all our needs outside of our database machines."
If eHarmony were to take one of its own compatibility profiles, it would surely be matched with startups and emerging technology players.
"Virtually every technology that eHarmony uses was created since 2000 by emerging companies," said Douglas. "Despite the obvious risk, this approach has worked very well for us."
He even took a moment to ridicule an unnamed "old school" storage provider. Describing the company only as a "well known vendor," he said he had asked its representatives to explain their technology. What the reps laid out, he said, was a complex architecture using plenty of RAID boxes. He was reluctant to add any more details.
"How they wanted to store our data seemed as clumsy as putting it all on memory chips," said Douglas. "Our total cost of ownership with emerging vendors has been excellent, as we have no people to deal with. Everything is self-managing."