One of the biggest problems with cloud storage is data movement. Transferring large quantities of data to and from the cloud takes time, which can impact performance.
This issue is the province of a batch of storage startups that address it in a wide variety of ways. Some seek to increase caching capabilities. Some boost processing at the edge of the network by bringing the processing to the data. Some accelerate transfer speeds. Some minimize the amount of data that has to be shunted around the globe for applications such as sync and collaboration. And some deal with the trials and tribulations of network file sharing across a large number of sites.
Lack of Speed
Barry Phillips, CMO at Panzura, said the biggest problem with the cloud is the lack of speed for data transfer. Whatever cloud you use, it is always some distance away, so if you want local performance, you need to cache data locally.
Panzura seeks to solve this problem for unstructured data in the cloud. An on-premises caching appliance fronts the cloud, letting organizations retire primary NAS and backup systems, and it works over a regular internet connection rather than a premium network link.
“We do well in global software development where you have dev in one location and test in another,” said Phillips.
In one example, he said a client in the media industry was taking ten hours to transmit 50 GB video files many times per day; that is now down to less than a minute. Another example is sending seismic data from one location to a big compute farm in the cloud and then sending the results back. In addition, for CAD applications, which can be very chatty and therefore consume time and network resources, Phillips said Panzura makes CAD files much faster to open.
Global deduplication is another part of what Panzura does. This cuts down the amount of data that has to be transferred. Global file locking is included, too. The basic value proposition is to replace a raft of file servers, NAS appliances and WAN optimization appliances with a combination of a Panzura appliance and Microsoft Azure.
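Panzura's implementation is proprietary, but the principle behind global deduplication is straightforward: split files into chunks, fingerprint each chunk, and only ship chunks the other side hasn't already seen. The minimal sketch below assumes fixed-size chunks and a simple in-memory index; the file name is hypothetical.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB fixed-size blocks for simplicity

def new_chunks(path, known_hashes):
    """Yield only the chunks whose content has not been seen before."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in known_hashes:
                known_hashes.add(digest)
                yield digest, chunk  # only previously unseen data crosses the WAN

# The set stands in for a global index shared by all sites and the cloud.
global_index = set()
to_send = sum(len(chunk) for _, chunk in new_chunks("project.dwg", global_index))
print(f"{to_send} bytes actually need to be transferred")
```

Production systems typically use variable-size (content-defined) chunking and a persistent, cluster-wide index, but the transfer savings come from the same idea.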
Sensor Data Overwhelm
One stealthy startup is Igneous, which has eleven patents to its name. The problems it tackles involve sensors and other devices that generate huge amounts of data, up to 1 TB per hour; just a few of those and a network becomes saturated. Igneous is designed to solve this problem, though the details remain sketchy.
“We have a zero-touch infrastructure for data that you can’t or won’t send to the cloud,” said Steve Pao, CMO of Igneous. “It provides automated and API-driven services that operate on your network behind your firewall.”
He noted the difficulty in moving bits to the cloud fast enough as machine data and Internet of Things (IoT) data proliferate. While humans tend to generate Word docs and emails galore, machines and sensors generate high-fidelity data and ever-larger files. Scientific equipment in biology, chemistry and physics now deals in huge files, and media and entertainment resolutions continue to increase.
“We are talking about really big data, how you move it around and how you analyze it,” said Pao.
Very large data sets, he added, have different workflows. Instead of moving the data to the computing power as in older paradigms, you need to move the compute to the data.
“Operations on the data are iterative so need a different compute paradigm,” said Pao. “Processing and compute have to live together in an unconstrained environment.”
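Igneous hasn't published how its services do this, but the shift Pao describes can be illustrated in a few lines: rather than hauling raw sensor output across the network for central analysis, a small piece of compute runs next to the data and only a summary travels. The sketch below is a toy illustration, not Igneous's API.

```python
def summarize_on_node(readings):
    """Runs on the node where the data lives; returns a handful of numbers
    instead of shipping the raw stream across the network."""
    return {
        "count": len(readings),
        "max": max(readings),
        "mean": sum(readings) / len(readings),
    }

# Stand-in for high-rate sensor output held on a local node.
readings = [0.1 * i for i in range(1_000_000)]
summary = summarize_on_node(readings)   # only this small dict travels back
print(summary)
```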
Big File Transfer
The volume and size of modern files are overwhelming existing network infrastructure. For industries like media, entertainment, and oil and gas, it’s a real challenge to send big files around the world rapidly.
“We provide solutions to move very large data files quickly and securely across the network,” said Doug Davis, CEO of BitSpeed.
He gave the example of it taking 22 hours to move a TB of data six miles on a GigE network; using his product, that is down to less than an hour. This is achieved with an appliance installed at the client site and standard Transmission Control Protocol (TCP). Davis said multiple TCP streams are employed in parallel, up to the capacity of the line.
“Moving large movie files from coast to coast in the U.S. became ten times faster for one media customer, yet they were already using state-of-the-art acceleration technology and 10GigE,” said Davis.
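BitSpeed's protocol is proprietary, so the following is only a rough sketch of the general idea of filling a link with several parallel TCP streams, using HTTP range requests as a stand-in; the URL, file name and stream count are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request

URL = "https://example.com/big-file.bin"   # hypothetical source
STREAMS = 8                                # tune to the line's capability

def total_size(url):
    """Ask the server how big the file is."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        return int(resp.headers["Content-Length"])

def fetch_range(url, start, end):
    """Fetch one byte range over its own TCP connection."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return start, resp.read()

size = total_size(URL)
step = size // STREAMS
ranges = [(i * step, size - 1 if i == STREAMS - 1 else (i + 1) * step - 1)
          for i in range(STREAMS)]

# Run the streams side by side, then reassemble the pieces in order.
with ThreadPoolExecutor(max_workers=STREAMS) as pool:
    parts = sorted(pool.map(lambda r: fetch_range(URL, *r), ranges))

with open("big-file.bin", "wb") as out:
    for _, data in parts:
        out.write(data)
```

A single TCP stream over a high-latency link is often limited by its window size, which is why running several streams side by side can get much closer to the line rate.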
Data Dispersal
Talon is all about data consolidation, centralization and collaboration. It targets pain points such as inefficient transfer of documents and areas of potential insecurity such as Dropbox and email. Talon CloudFAST works closely with Microsoft Azure.
The company doesn’t own the data center infrastructure. A provider such as Azure, EMC or NetApp supplies the backend, and Talon supplies the software. Talon focuses on the access problem: typically deployed as a Windows VM, its software pools data from remote sites into one central location and locks it there, then provides high-performance global file sharing.
“We do not do replication of everything between the central site and remote offices connected to it,” said Shirish Phatak, CEO of Talon. “Data is only synced when you request it.”
This approach enables Talon to use much smaller caches than competing solutions. Thus it can scale easily, as it doesn’t burden the network with a lot of replication traffic. Once a document is streamed from HQ to the local site, it is available locally and can be used like a local file.
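CloudFAST's internals aren't public, but the model Phatak describes, a small local cache that fetches and locks files only on request, looks roughly like the sketch below, where `central_store` is a hypothetical client for the Azure-hosted central file share.

```python
import os

class BranchCache:
    """Fetch-on-request cache for a branch office."""

    def __init__(self, central_store, cache_dir):
        self.store = central_store
        self.cache_dir = cache_dir

    def open_for_read(self, name):
        local = os.path.join(self.cache_dir, name)
        if not os.path.exists(local):      # stream from HQ only on first access
            self.store.download(name, local)
        return open(local, "rb")           # afterwards it behaves like a local file

    def open_for_write(self, name):
        self.store.lock(name)              # global lock prevents concurrent edits
        local = os.path.join(self.cache_dir, name)
        if not os.path.exists(local):
            self.store.download(name, local)
        return open(local, "r+b")
```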
Network Latency
It’s all very well to dump everything into cloud file-sharing systems, but what do you do if the network slows to a crawl? That is one of the cloud storage problems addressed by CTERA. Jeff Denworth, senior vice president of marketing, said it is all about unifying file services across the enterprise. “We secure and accelerate enterprise file sharing and help move people from NAS to the cloud,” said Denworth. “We provide this as a secure platform within your own firewall.”
This platform provides a cloud file system that you deploy where you want. Known as the CTERA Portal, it includes global dedupe, multi-tenancy, cloud orchestration, file versioning, central management, data loss prevention, encryption and authentication. It can be used for disk-to-disk-to-cloud backup, too, and has disaster recovery (DR) built in. Organizations can deploy the software where they want direct access to the cloud, or use cloud storage gateway appliances where offices must persist data locally to overcome unpredictable WAN access.
“We can move organizations to a cloud style of data management and collaboration without sacrificing edge performance,” said Denworth. “If the WAN goes down, the data remains there stored locally and it catches up to sync later when network conditions improve.”
Denworth added that CTERA gateways have a current threshold of 60 TB per site and that it doesn’t really play in markets with larger single-site requirements. It specializes in distributed environments (hundreds to thousands of locations). That said, a lot of traffic is taken off the network via dedupe.
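CTERA's gateway logic isn't public either, but the "work locally, catch up later" behavior Denworth describes boils down to a write-back queue: every change lands on local disk first and is drained to the cloud whenever the WAN is available. The `cloud` client and `online()` check below are assumptions for illustration.

```python
from collections import deque

pending = deque()        # local changes that have not yet reached the cloud

def save(path, data, cloud, online):
    """Write locally first, then try to push the backlog to the cloud."""
    with open(path, "wb") as f:
        f.write(data)
    pending.append(path)
    flush(cloud, online)

def flush(cloud, online):
    """Drain the backlog only while the WAN link is up."""
    while pending and online():
        cloud.upload(pending[0])
        pending.popleft()
```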
Clunky Data Movement
One final mention: Agylstor likes to maintain a veil of mystery around its technology, but it has to do with the clunky nature of data movement for large volumes of data. Think about this for a minute: if you want all your data back from the cloud, you can try to pull a petabyte over the internet. With a 10 Mb/s line, that works out to roughly 25 years at full line rate. Even a 10 GigE line would need more than a week at its theoretical maximum, and sustained real-world throughput is usually far lower. That’s why the big phone companies like Verizon can charge a fortune to lease extra capacity and faster fiber connections to move data around. Let’s see what Agylstor comes up with to address this issue.
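For perspective, here is the back-of-envelope calculation of nominal transfer times for a petabyte, ignoring protocol overhead and real-world congestion:

```python
# 1 PB = 10**15 bytes = 8 * 10**15 bits
PB_BITS = 8 * 10**15

for label, bits_per_sec in [("10 Mb/s", 10**7), ("1 Gb/s", 10**9), ("10 Gb/s", 10**10)]:
    seconds = PB_BITS / bits_per_sec
    print(f"{label}: {seconds / 86_400:.1f} days ({seconds / (86_400 * 365):.1f} years)")

# 10 Mb/s -> ~9,259 days (~25.4 years)
# 1 Gb/s  -> ~92.6 days
# 10 Gb/s -> ~9.3 days
```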