What This Product Would Solve
Assuming that the market analysis is valid and that customers' pain points are severe enough for them to consider purchasing it, the product I would create would be a SAN/NAS hybrid that combines the best of both worlds and adds significant new features.
Many NAS limitations are based on TCP/IP overhead, and NAS does not allow for centralized control. The only way to centrally control a heterogeneous shared file system is to move most of the functionality to a single unit, as you cannot control an end-to-end security policy from one host in a pool of heterogeneous machines.
So, for the data-centric world I think is coming, the only way to manage the data is to create a single machine with a new DMA-based protocol that looks like NFS in that it requires no changes to the user application, but scales more like a locally attached RAID communicating without TCP/IP. This new protocol would have to support:
- High performance and scalability (i.e. low overhead)
- DMA communication of the data to the host
- No application changes (POSIX standards and read/write/open system calls)
- WAN and SAN access
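To make the "no application changes" requirement concrete, here is a minimal Python sketch of application code that uses only the ordinary open/read/write/close calls. The function name copy_prefix and the chunk size are hypothetical, chosen for illustration. The point is that nothing in the code depends on whether the path lives on local disk, an NFS mount, or the kind of device described here.

```python
import os


def copy_prefix(src_path, dst_path, nbytes, chunk=64 * 1024):
    """Copy up to nbytes from src_path to dst_path.

    Uses only open/read/write/close, so the code is the same no matter
    which file system or transport sits underneath the path.
    """
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    copied = 0
    try:
        while copied < nbytes:
            buf = os.read(src, min(chunk, nbytes - copied))
            if not buf:
                break  # end of file reached before nbytes
            view = memoryview(buf)
            while view:
                # write() may be partial, so loop until the chunk is out
                written = os.write(dst, view)
                view = view[written:]
            copied += len(buf)
    finally:
        os.close(src)
        os.close(dst)
    return copied
```

If the protocol met this requirement, a program like this would run unmodified against the new device, and any DMA transfer would happen below the system call layer.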
Since my data is now centralized, security, replication, data encryption, HSM, backup, and disaster recovery policies can be implemented more easily. Another advantage is that I would be free from having to write and maintain tools for each OS, OS release, vendor, etc.
The new box would have a tight coupling between the file system and the reliable storage. I might have RAID 1-like functions for small random access files and RAID 5-like functions for larger, sequentially accessed files. The file system could understand the topology of the file in question and read ahead based on access patterns like reading the file backward, even though the file might not be sequentially allocated. Tight coupling between the cache and the data would improve scaling and reduce latency and costs.
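As a rough illustration of the two ideas above, the following Python sketch picks a RAID 1-like or RAID 5-like layout per file and detects forward or backward access so it can suggest blocks to read ahead. Every name and threshold here (choose_layout, ReadAheadDetector, the 1 MiB cutoff, the 50% sequential ratio, the four-read window) is a hypothetical assumption rather than part of any existing product. The block numbers are logical offsets within the file, which is why the detection works even if the file is not sequentially allocated on disk.

```python
from collections import deque

SMALL_FILE_BYTES = 1024 * 1024  # assumed cutoff for a "small" file


def choose_layout(size_bytes, sequential_fraction):
    """Return "parity" (RAID 5-like) for large, mostly sequential files.

    Everything else gets "mirror" (RAID 1-like). How to treat mixed
    cases is an assumption made for this sketch.
    """
    if size_bytes >= SMALL_FILE_BYTES and sequential_fraction >= 0.5:
        return "parity"
    return "mirror"


class ReadAheadDetector:
    """Track recent logical block reads of one file and guess direction."""

    def __init__(self, window=4):
        self.history = deque(maxlen=window)

    def record(self, block):
        self.history.append(block)

    def direction(self):
        """Return 1 for forward, -1 for backward, 0 for no clear pattern."""
        if len(self.history) < self.history.maxlen:
            return 0
        blocks = list(self.history)
        steps = [b - a for a, b in zip(blocks, blocks[1:])]
        if all(s == 1 for s in steps):
            return 1
        if all(s == -1 for s in steps):
            return -1
        return 0

    def prefetch(self, count=2):
        """Return the next logical blocks worth reading ahead, if any."""
        d = self.direction()
        if d == 0:
            return []
        last = self.history[-1]
        candidates = [last + d * i for i in range(1, count + 1)]
        return [b for b in candidates if b >= 0]


detector = ReadAheadDetector()
for block in [9, 8, 7, 6]:  # an application reading the file backward
    detector.record(block)
print(detector.direction(), detector.prefetch())
```

In a tightly coupled design, the file system would translate the suggested logical blocks into physical locations and pull them into cache before the application asks for them.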
Ah, cost. That’s the key. What would the return on investment (ROI) be for this new data-centric device? Well, that’s where my dream ended. We may never know if this box would work, what the ROI would be, and whether or not people would actually buy it, but I do believe it meets the requirements of the market.
Can it be built? I think it can. Will it be built? I don’t know, but it sure would solve a bunch of problems if done correctly.
Please feel free to send any comments, feedback, and/or suggestions for future articles to Henry Newman.