Software-defined storage (SDS) was supposed to be the next great thing. But it hasn’t yet gained the traction that many expected.
That said, the term isn’t going away. Most vendors still stress it in their press releases. Many experts believe that it is the trend of the future. “The world is moving to software-defined,” said Jeremy Burton, chief marketing officer of Dell EMC.
So which factors are driving the trend toward SDS? Experts offered 10 tips about where software-defined storage can shine.
1. Economies of Scale
Varun Chhabra, senior director, product marketing, Dell EMC Storage, said the progress of SDS in the enterprise is mainly at the high end of the market.
“Large customers are typically more interested as they are willing to bet on the standardization of hardware,” he said. “They have the scale to extract value from software-defined storage.”
2. Block Storage
Chhabra added that most interest in SDS is on the block side. Customers who run databases and have experience running a traditional SAN are more willing to roll out Dell EMC’s software-defined ScaleIO fabric.
“It’s easy for them to see how it works, and they don’t have to change their processes,” said Chhabra. “That’s why block storage is in the vanguard of SDS acceptance.”
Meanwhile, as two business units within Dell Technologies, Dell EMC and Dell have collaborated on ScaleIO Ready Nodes. This combo of ScaleIO software with pre-configured Dell PowerEdge 14th generation servers (including NVMe drives) is said to simplify deployment of an SDS infrastructure.
Once large enterprises adopt software-defined storage on the block side, they then are more willing to introduce it as object storage, said Chhabra. Updates to the Dell EMC Elastic Cloud Storage (ECS) platform for object storage include the ECS Dedicated Cloud Service, which enables hybrid deployment models for ECS, as well as ECS.Next, which features enhanced data protection, management and analytics capabilities. All come with SDS capabilities.
3. Private Cloud
Dell EMC is betting on the private cloud as opposed to the public cloud. That makes sense given that the company’s software and hardware offerings will power the private cloud.
Michael Dell, CEO of Dell Technologies, goes so far as to say that companies relying solely on the public cloud will be uncompetitive, because the private cloud will be software defined, which enables automation. That’s why the company has certified much of its SDS product line to run on the new Dell PowerEdge all-flash servers.
“Dell 14g PowerEdge Servers give you greater compute and IO capability, as well as the density you need, NVMe and 25 Gig Ethernet on board,” said Greg Schulz, an analyst at StorageIO Group.
4. Scale-Out NAS
RozoFS is a recent entrant in the market for SDS scale-out NAS. It was designed from the ground up to be software defined. It already has customers running its SDS software in production on Cisco, Dell, HP, Lenovo, Quanta and Supermicro servers. It is said to be scalable to tens of PBs and 256 servers.
“RozoFS’s SDS architecture enables it to deliver very high performance on all workloads (random, sequential),” said Michel Courtoy, chief operating officer, Rozo Systems. “With the widespread availability of SSD-based servers and flash arrays, SDS solutions need to deliver high performance or they become the bottleneck in scale-out storage solutions.”
To deliver high availability, the software leverages erasure coding rather than replication, which imposes a high capacity overhead. Traditional storage solutions rely on Reed-Solomon erasure coding, which requires complex matrix inversions and is therefore compute intensive and slow. RozoFS is instead based on an erasure coding technique known as the Mojette Transform, which is said to be fast enough to keep pace with flash-based storage. Courtoy added that this has enabled media and entertainment 4K post-production workflows to run on x86 storage servers using hard-disk drives (HDDs).
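To see why erasure coding beats replication on capacity, consider the simplest possible code: one XOR parity block over k data blocks, which tolerates the loss of any single block at 1/k overhead, versus 200 percent overhead for triple replication. The sketch below is purely illustrative; it is not the Mojette Transform or a production Reed-Solomon code, just the underlying idea.

```python
# Minimal single-parity erasure-coding sketch (illustrative only; RozoFS's
# Mojette Transform and production Reed-Solomon codes are far more general).
# k data blocks + 1 parity block survive any single block loss at 1/k
# capacity overhead, versus 200% overhead for 3x replication.

def xor_blocks(blocks):
    """XOR equal-length byte strings together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

def encode(data_blocks):
    """Return the data blocks plus one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def recover(stripe, lost_index):
    """Rebuild the block at lost_index from the surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)

stripe = encode([b"AAAA", b"BBBB", b"CCCC", b"DDDD"])  # k=4: 25% overhead
assert recover(stripe, 1) == b"BBBB"                   # lost block rebuilt
```

Real codes such as Reed-Solomon generalize this to survive multiple simultaneous failures, which is where the expensive matrix arithmetic comes in.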
5. Cloud Hydration
Cloud hydration is the term for the common problem of getting data to the cloud. Some experts are projecting that by 2020 the bulk of data center traffic and enterprise workloads will be based in the cloud – yet getting it all there is often costly, time consuming and painful.
“Published reports cite that it would take 120 days to migrate 100TB of data using a dedicated 100Mbps connection,” said Kevin Liebl, vice president of marketing, Zadara Storage.
The new Zadara Storage Cloud Hydration service is part of its enterprise storage-as-a-service (STaaS) through the Zadara Storage Cloud. It can be deployed at any location (cloud, on-premises or hybrid), supporting any data type (block, file or object) and connecting to any protocol (FC, iSCSI, iSER, NFS, CIFS, S3, Swift).
6. Multiple Clouds
Those who put all their eggs in one cloud basket may live to regret it if an outage occurs. That’s why Liebl says more businesses are looking for multi-cloud capabilities.
“Having a good scale-out, software-defined storage cloud that’s natively multi-cloud preserves their options,” he said. “Whenever the next outage hits, IT managers toggle a setting on a dashboard and switch their data and applications to another cloud, and they continue unaffected.”
Avinash Lakshman, CEO and founder of Hedvig, is another believer in a multi-cloud strategy.
“Costly outages like Amazon’s S3 blip back in February made enterprises realize they can’t put all of their eggs in a single cloud basket,” he said.
Leveraging multiple clouds means addressing the way data is distributed and protected. Automation tools are making application portability a reality (using VMs or containers), but data portability is another matter entirely.
“Trying to move your app across a cloud boundary is often like filling a sandbox with a teaspoon,” said Lakshman. “You need technology that handles that data management for you. Many software-defined storage technologies offer a distributed approach for this data storage, management and protection.”
An example is Hedvig Distributed Storage Platform 2.0. It provides enterprise data centers with new multi-workload, multi-cloud and multi-tier capabilities. It includes a programmable data management tier that enables data to span hybrid- and multi-cloud architectures. This is said to solve the problem of data portability with enhancements that provide data locality, availability, and replication features across any public cloud.
7. Flexible Management
Everything goes in cycles. Direct attached storage (DAS) emerged in the beginning. But then came shared storage in the form of a storage-area network (SAN). And now the pendulum is swinging back towards a DAS approach, but with a software-defined twist.
“Traditional models on how storage is provisioned and consumed are no longer satisfactory in many cases,” said David Hill, an analyst for Mesabi Group. “Enter software-defined storage, which decouples the storage controller software that manages traditional storage array systems from the underlying physical storage. The result is a software-based model that increases deployment flexibility, enabling customers to choose to use the decoupled software with virtually any heterogeneous storage platform rather than being locked into a storage system.”
This storage management flexibility extends even further. SDS can also be used in appliance-based and service-based (cloud) deployment models. Instead of being limited to scale-up block and file solutions, SDS supports scale-up and scale-out as well as choices of block, file and object storage. This ability is fueling SDS growth.
8. Finding Unity
A challenge for storage is that it can have too many elements: file storage, block storage, direct attached, deduplication appliances, backup software, replication and DR systems, and more. Software-defined architectures are a smart way to unify file and block services and integrate several data services as well.
Case in point: Nexsan Unity offers one unit with both file (NAS) and block services, along with enterprise file sync and share, n-Way sync, secure active archiving and data reduction. The new Unity systems offer up to 40 percent performance improvement over previous models, plus more flexibility and lower costs. New all-flash configurations are available. These boxes utilize the Intel Xeon Processor E5 v4 family and have more memory to enable more IOPS and lower latency. They are said to deliver an effective price of $1 per usable GB or less. The all-flash configurations support 1.92 TB, 3.84 TB and 7.68 TB SSDs.
To avoid confusion: Dell EMC also has an all-flash product line called Unity, and there is currently a trademark dispute between Dell EMC and Nexsan over the name.
A novel way to achieve the goals of software-defined storage is to disaggregate storage from compute resources. For example, DriveScale does this through a SAS-to-Ethernet bridge, which essentially puts the commodity drives in a SAS JBOD onto the Ethernet fabric. The company also advocates increasing the number of Ethernet ports used in a rack to boost the “north-south” bandwidth available for disk I/O traffic. The user then defines servers and clusters of servers through an API or GUI, and physical drives are assigned to servers based on these definitions. The assignments can be changed at will.
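The define-and-assign workflow above amounts to composing logical servers from a shared pool of fabric-attached drives. The sketch below models that idea in a few lines of Python; it is not DriveScale’s actual API, just a hypothetical illustration of software-composed storage.

```python
# Illustrative model of disaggregated (software-composed) storage: physical
# drives on the Ethernet fabric are assigned to logical server definitions
# and can be reassigned without touching hardware. This is NOT DriveScale's
# actual API -- the class and method names here are hypothetical.

class Fabric:
    def __init__(self, drive_ids):
        self.unassigned = set(drive_ids)   # free drives on the fabric
        self.assignments = {}              # server name -> set of drive ids

    def define_server(self, name, drive_count):
        """Compose a logical server from free drives on the fabric."""
        if drive_count > len(self.unassigned):
            raise ValueError("not enough free drives on the fabric")
        drives = {self.unassigned.pop() for _ in range(drive_count)}
        self.assignments[name] = drives
        return drives

    def release(self, name):
        """Return a server's drives to the free pool for reassignment."""
        self.unassigned |= self.assignments.pop(name)

fabric = Fabric([f"jbod0:slot{i}" for i in range(24)])
fabric.define_server("hadoop-worker-1", 8)
fabric.define_server("hadoop-worker-2", 8)
fabric.release("hadoop-worker-1")      # reassignment at will
fabric.define_server("cassandra-1", 12)
```

The point of the model is that drive-to-server bindings live in software, so rebalancing compute and storage is a metadata operation rather than a hardware change.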
“This is quite distinct from typical software-defined storage, which accepts the hardware as it is and puts an abstraction layer on top of the hardware so that applications don't have to be aware of the configurations of the hardware,” said Gene Banman, CEO of DriveScale. “Our approach changes the configuration of the underlying hardware to properly balance compute, networking and storage resources, depending on workflows.”
DriveScale is used with “bare metal” applications that run on large numbers of servers, such as Hadoop, Spark and Cassandra. There is no abstraction layer. Benefits are said to include improved TCO compared to commodity servers with direct-attached drives, better utilization of available hardware and ease of use.
Ross Turk, director of product marketing, storage and big data at Red Hat, thinks we are going to see a lot more variety in approaches to hyperconverged deployment.
“It doesn't always make sense to co-locate services — especially when scaling is uneven — but it's really great to be able to,” he said. “As software-defined storage becomes more mainstream, we are all becoming crafty about the kinds of environments it can be deployed in — small and big.”
Red Hat Ceph Storage 2.2 is a unified, distributed storage platform designed to provide everything OpenStack users need. It runs on the Red Hat OpenStack Platform as scalable infrastructure-as-a-service (IaaS) that powers private and public clouds. It introduces the ability to co-locate Ceph’s Object Storage Daemon (OSD) with the OpenStack Nova compute service on the same node. That means users can shrink their footprint, deploying services with less hardware, while lowering costs and maximizing utilization.