Storage Users Speak: Data Lakes, DR, Big Data, Virtualization, the Cloud
To learn about data storage, it’s best to talk to actual users. That’s not an everyday opportunity: increasingly, users are either prevented from talking by overly cautious PR and legal departments, or frightened to say much for fear of getting into trouble. We did some reporting in the field; for this article, we focused on EMC users.
Eric Tomasella is Director of Infrastructure & Operations at EmblemHealth, the largest non-profit healthcare provider in New York State. He said his company received a rude awakening after the events of Superstorm Sandy a couple of years back. That event paralyzed Lower Manhattan and coastal areas of surrounding boroughs and New Jersey. Power was out for many days, including the EmblemHealth data center.
IT thought it was covered because the company had a disaster recovery (DR) site in New Jersey. But no one planned on that site flooding, too. In any case, employees were largely stranded at home and couldn’t get to the site, and the email system was down. Suddenly, the inadequacies of what had seemed like a perfectly good DR plan were brought clearly into focus. And it wasn’t just that IT suffered some embarrassment at the way things unfolded. The business itself suffered badly.
“Not being able to recover quickly cost us a lot of money,” said Tomasella.
Sandy drove a series of upgrades at the healthcare company. Plans included switching to more of a Virtual Desktop Infrastructure (VDI) environment so staff could log in remotely whenever needed. In addition, the company decided to pare down its hardware inventory via consolidation and virtualization. This led to a considerable reduction in the number of physical Windows and Linux servers, as well as older EMC CLARiiON frames being swapped over to EMC VNXs. In addition, EmblemHealth implemented Cisco UCS servers and VCE vBlock integrated systems when it created its own private cloud.
The company employs two vBlock boxes in an active-active arrangement with 2,500 users on each; between the two, all employees are covered. This cloud encompasses the VDI environment, supplemented by XtremIO to provide the flash-based horsepower to ensure desktop users, whether in-house or remote, experience no slowdowns of any sort.
“XtremIO solved any VDI performance issues,” said Tomasella. “It was easy to implement and enabled employees to use any device. Their desktop and files go wherever they are.”
He is happy with the results of the upgrade. Gone are hefty payments to a colocation provider for space and power. The new hardware, he said, takes up very little space and uses far less power, and there is no more throwing disk at performance issues. Flash now operates as the top storage tier in the cloud, while disk, mostly SATA, continues to serve the lower tiers. EmblemHealth recently completed a DR test, restoring 300 servers and its entire VDI environment in the process.
“Sandy made the business appreciate IT more and value what we provide,” said Tomasella. “We realized we needed to be talking to each other more and to work together. They gave us the full funding we needed to transform DR.”
Can Data Lakes Stop Electricity Theft?
Instead of a flood, the driver towards a more sophisticated IT infrastructure at Canadian utility BC Hydro was losses of $100 million in one year due to theft of electricity. Enterprising thieves stole power from neighbors, from substations and even had the guts to do so from high-voltage lines.
BC Hydro decided to act. It deployed nearly two million smart meters at homes and businesses, then fed that data into a Pivotal Greenplum database and an EMC data lake built on Isilon scale-out NAS storage, supplemented by a wealth of SAP data. Proprietary systems add geospatial data as a means of detecting discrepancies in voltage patterns, and SAS predictive analytics gives further insight into where the losses are coming from.
It takes a lot of storage to make all this happen. Ten Isilon nodes form the core of the data lake and enable the Hadoop cluster to sift through a massive amount of data; the smart meters alone churn out about 25 TB annually. Three racks of Pivotal appliances and Cisco UCS servers form the backbone of the analytics layer that sits on the Isilon foundation.
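To make the detection idea concrete, here is a minimal sketch of one common way utilities spot theft with smart meter data: compare the energy a feeder delivers against the sum of the meter readings downstream of it, and flag feeders whose unexplained loss exceeds normal technical losses. All names, thresholds, and data here are illustrative assumptions, not BC Hydro’s actual system.

```python
def flag_suspect_feeders(feeder_supply_kwh, meter_readings_kwh,
                         technical_loss_rate=0.05, tolerance=0.03):
    """Return feeder IDs whose unexplained energy loss exceeds tolerance.

    feeder_supply_kwh:   {feeder_id: kWh delivered into the feeder}
    meter_readings_kwh:  {feeder_id: [kWh billed at each downstream meter]}
    technical_loss_rate: expected line/transformer losses (assumed 5%)
    tolerance:           extra slack before a feeder is flagged (assumed 3%)
    """
    suspects = []
    for feeder, supplied in feeder_supply_kwh.items():
        if supplied <= 0:
            continue
        billed = sum(meter_readings_kwh.get(feeder, []))
        # Energy we expect to see billed after normal technical losses.
        expected = supplied * (1 - technical_loss_rate)
        # Fraction of supplied energy that is unaccounted for.
        unexplained = (expected - billed) / supplied
        if unexplained > tolerance:
            suspects.append(feeder)
    return suspects

# Example: feeder F2 delivers 1000 kWh but its meters bill only 800 kWh,
# a 15% unexplained loss, so it gets flagged; F1 balances and does not.
supply = {"F1": 1000.0, "F2": 1000.0}
meters = {"F1": [460.0, 490.0], "F2": [400.0, 400.0]}
print(flag_suspect_feeders(supply, meters))  # -> ['F2']
```

In practice this balance check is just a starting point; the article notes BC Hydro layers geospatial correlation and SAS predictive models on top to narrow down where in a flagged area the theft is occurring.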
“Analytics is helping us achieve our goal of a 75% reduction in electrical theft,” said Elizabeth Fletcher, Deputy Director of the Smart Metering & Infrastructure Program at BC Hydro.
As the company evolves its data lake, she noted some challenges. On the governance side, in particular, she said the business, not IT, needs to drive how the lake will be used.
“The business has to establish the priorities rather than IT deciding what it thinks it might need,” said Fletcher.
Public vs. Private Cloud
Public versus private cloud is a debate that will certainly continue for some time. But Eric Craig, CTO of Infrastructure and IT Operations at NBCUniversal, is firm in his commitment to the hybrid cloud concept, i.e., augmenting the company’s private cloud with the public cloud when necessary.
“The public cloud is not always the best option, but we use it for sudden bursts of traffic and peak loads while maintaining our private cloud for regular, more predictable traffic,” said Craig. “This way seems more cost effective and more manageable for us.”
This strategy, in tandem with a major push on server and storage virtualization, has allowed the media giant to remove several tons of hardware from the data center. That, said Craig, adds up to a $30 million drop in annual operational costs. The company’s IT is currently 70% virtualized.
One area where this is paying dividends is in the Universal theme parks, where the private cloud is being used to deliver more targeted content to visitors during their park experience.
“Our goal is to develop a rich private cloud and serve as the broker to the public cloud so business units don’t need to go direct to the public cloud for immediate needs,” said Craig.