I am not the first nor will I be the last to write about picking the right cloud storage vendor for your company. Clearly, it’s a popular topic.
There are likely a few reasons that everyone wants to address how you pick a cloud provider. The obvious one is that vendors market their products by paying writers to produce articles about cloud technology. Needless to say, there are lots of articles, some good and some bad, about choosing the correct cloud storage and applications environment.
My approach, as it usually is, is not to discuss products but to discuss the requirements. That way, readers can pick and choose what is important to their operation. No vendor is paying me to write about their cloud.
In the 1990s there was a saying about RAID: you can have it cheap, you can have it reliable, you can have it fast. Pick any two.
I think that the same could be said at this point for the cloud market, except that the list is different, longer and far more complex, but so is the technology. This is the list of issues that must be addressed:
(They are in no specific order; their relative importance will depend on your requirements.)
• Speed of data access and QoS (quality of service)
• Amount of data to be stored
• Cost, including cost to move to another vendor
• Data integrity and provenance
• Availability for your applications and data
I think that each of these areas is important in helping you better understand whether you’re going to make the move to the cloud.
One thing that I cannot emphasize enough is to get what the vendor tells you (or says in marketing literature) in writing as part of the contract.
I have firsthand experience with cloud vendors putting one thing in their marketing literature about reliability and integrity and refusing to put the marketing features into a contract. If it’s not in writing in the contract, it doesn’t matter what the marketing claims are, unless you want to spend a protracted time in court. Not worth the cost of lawyers.
Speed of Data Access and QoS
Depending on your business, the speed of access is likely going to be an issue. But equally important is the quality of service.
Let’s say you are in a retail business, you depend on Black Friday for your profitability, and that day is your peak usage. Getting the required performance 99.99% of the time might sound like a good deal, but that leaves about 52 minutes per year when you will not get the speed you wanted or the QoS you paid for and need.
I suspect that many businesses will need their peak performance on that day, and from what I understand, large retailers have agreements for credit card approval codes that address exactly this type of problem.
If you have crunch times, you need to make sure that you specify what your needs are during those times and get something in writing that you will be able to live with. If the vendor will not put it in writing, walk away.
You of course need to be willing to pay the vendor to meet this requirement. Nothing is free. And if they do not meet the requirement and your business is impacted, what are the penalties? Something to think about.
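To put a percentage like the 99.99% figure above into concrete terms, it helps to convert it into minutes per year. The following is a minimal sketch; the function name is made up for illustration, and it assumes a 365-day year with the shortfall spread over the whole year rather than concentrated on your peak day.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def shortfall_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes per year during which the service level may be missed.

    availability_percent is the figure from the SLA, e.g. 99.99.
    """
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    # Compare a few common service levels side by side.
    for pct in (99.9, 99.99, 99.999):
        minutes = shortfall_minutes_per_year(pct)
        print(f"{pct}% -> {minutes:.1f} minutes per year")
```

The number by itself says nothing about when those minutes fall. If they all land on your busiest day, the yearly average is little comfort, which is why the contract language about crunch times matters.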
Amount of Data That Needs to Be Stored
The amount of data you need to store will likely be the largest cost item.
You need to run the numbers and make sure that the cost is in line with your expectations. Does the cost change significantly if you need more storage for a short period of time? Does the cost change if you increase your access rate significantly?
All of these are reasonable questions. Make sure you not only get the pricing on these items but also understand the limits. What if you need 2x the access rate? Can the vendor support that whenever you want? Do you have to pay for the potential access rate or size change even when you are not using it?
Some of these areas are well documented by some vendors and some are not. Get the costs and the cost ranges before you sign up, or you might get some unwanted surprises.
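One way to run the numbers is a small cost model that you can feed with the figures from each vendor’s quote. The sketch below is only an illustration: the `PriceSheet` fields, the three-part cost breakdown, and every price in the example are placeholders I made up, not any vendor’s actual pricing or billing structure.

```python
from dataclasses import dataclass


@dataclass
class PriceSheet:
    """Hypothetical per-unit prices; replace with the figures in your contract."""
    storage_per_tb_month: float  # cost to keep 1 TB stored for one month
    request_per_million: float   # cost per million access requests
    egress_per_tb: float         # cost to move 1 TB out of the cloud


def monthly_cost(prices: PriceSheet, stored_tb: float,
                 million_requests: float, egress_tb: float) -> float:
    """Sum the storage, access, and data-out charges for one month."""
    return (prices.storage_per_tb_month * stored_tb
            + prices.request_per_million * million_requests
            + prices.egress_per_tb * egress_tb)


if __name__ == "__main__":
    # Placeholder prices, for illustration only.
    quote = PriceSheet(storage_per_tb_month=20.0,
                       request_per_million=0.5,
                       egress_per_tb=100.0)

    normal = monthly_cost(quote, stored_tb=100, million_requests=50, egress_tb=2)
    # A crunch month: 2x the access rate and some extra short-term storage.
    peak = monthly_cost(quote, stored_tb=120, million_requests=100, egress_tb=4)

    print(f"normal month: {normal:.2f}")
    print(f"peak month:   {peak:.2f}")
```

Comparing a normal month with a peak month for each vendor shows quickly whether a short-term increase in storage or access rate changes the bill significantly. Real price sheets often have tiers and minimums that a flat model like this does not capture.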
The Cost to Move to Another Vendor
Cloud is an emerging technology, so migration from one vendor to another is more of an unknown. The cost and time to move from, say, RAID vendor X to RAID vendor Y, and of course the potential methods, are pretty well known. But it took many years for the technology to mature to where it is today.
How do you move from cloud vendor A to B? Cloud vendor B could say “Okay we can do this,” but at what cost?
Cloud vendor A might have some reservations about cloud vendor B accessing the data in its cloud. Additionally, what is the fine print in the contract? Some vendors charge significant fees to move data out of their cloud if you want to do it in a reasonable amount of time.
Some of the issues might be out of both your staff’s control and the vendor’s control. For example, what about the time required to move the data? Moving 100 TB of data between vendors might take a very long time if you have an agreement that says you are only going to access, say, 20 GiB of data per day.
At that rate you get a TB roughly every 50 days and get to move your data in roughly 5,000 days, well over a decade. Not very practical, is it?
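The arithmetic for a capped migration is easy to check for your own numbers. This is a minimal sketch with a made-up function name; it assumes the daily access cap is the only limit and that you use all of it every day.

```python
GIB = 2**30   # binary gigabyte, in bytes
TB = 10**12   # decimal terabyte, in bytes
TIB = 2**40   # binary terabyte, in bytes


def days_to_migrate(total_bytes: int, daily_cap_bytes: int) -> int:
    """Return the whole days needed to move total_bytes under a daily cap."""
    if total_bytes < 0:
        raise ValueError("total_bytes must not be negative")
    if daily_cap_bytes <= 0:
        raise ValueError("daily_cap_bytes must be positive")
    # Integer ceiling division: a partly used final day still counts as a day.
    return -(-total_bytes // daily_cap_bytes)


if __name__ == "__main__":
    cap = 20 * GIB
    print("100 TB :", days_to_migrate(100 * TB, cap), "days")
    print("100 TiB:", days_to_migrate(100 * TIB, cap), "days")
```

For 100 TB at 20 GiB per day this comes to 4,657 days, or 5,120 days if the 100 TB is really 100 TiB. Either way the answer is measured in years, so the access limits in the contract deserve as much attention as the price.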
When your data store gets large, moving your data around can get very time consuming. Remember that an OC-192 channel, which is not cheap, gives you only about 800 MiB/sec of bandwidth, about the same speed as a single FC-8 or 6 Gbit SAS port.
The performance of your local network and your local RAID controllers is going to be far faster than moving data around clouds between vendors. This is because WAN network performance has not kept up with the performance of local storage – and this won’t change anytime soon.
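Even without a contractual cap, the link itself sets a floor on migration time. The sketch below estimates that floor from a sustained throughput figure. The function name and the `efficiency` parameter are my own additions for illustration; `efficiency` is a rough allowance for protocol overhead and sharing the link with other traffic, not a measured value.

```python
MIB = 2**20   # binary megabyte, in bytes
TB = 10**12   # decimal terabyte, in bytes


def transfer_hours(total_bytes: float, throughput_bytes_per_sec: float,
                   efficiency: float = 1.0) -> float:
    """Return the hours needed to move total_bytes over a link.

    efficiency is the fraction (0 < efficiency <= 1) of the nominal
    throughput you expect to sustain; it is an assumption, not a measurement.
    """
    if throughput_bytes_per_sec <= 0:
        raise ValueError("throughput_bytes_per_sec must be positive")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the range (0, 1]")
    seconds = total_bytes / (throughput_bytes_per_sec * efficiency)
    return seconds / 3600.0


if __name__ == "__main__":
    link = 800 * MIB  # the per-second figure quoted above for OC-192
    for eff in (1.0, 0.5):
        hours = transfer_hours(100 * TB, link, efficiency=eff)
        print(f"100 TB at {eff:.0%} of 800 MiB/sec: {hours:.1f} hours")
```

At the full 800 MiB/sec, 100 TB takes a little over 33 hours of uninterrupted transfer, and that assumes you have the entire expensive channel to yourself. Halve the sustained rate and the time doubles, before any vendor-imposed limits come into play.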
Data Integrity and Provenance
If you have read my column over the last 10+ years, you know that I have consistently talked about the need for end-to-end data integrity.