Is Key-Value Data Storage in your Future? - Page 2
A simple example of the difficulty in using key-value storage is seeking. POSIX IO functions allow you to seek to a specific point (offset) in a file. There is no corresponding key-value function that moves a file pointer to a specified position in a file. With key-value storage you either get the entire value associated with the key or you get nothing. In the case of the Kinetic drive you get a value of up to 1 MiB or you get nothing. You can't seek to a specific spot in a file and then perform an IO operation there.
This is just one example of the issues surrounding the use of key-value storage. You either have to rewrite your application against a key-value-specific IO library or you have to write a POSIX-compliant (or close enough) library that talks to the key-value storage.
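To make the second option a little more concrete, here is a minimal sketch of a read-only, file-like wrapper that emulates seek() on top of whole-value gets. It assumes a file is split into fixed-size chunks, each stored under its own key. The DictStore class, the key naming scheme, and the KVFile class are all hypothetical stand-ins for illustration and are not part of any real Kinetic library.

```python
import io

CHUNK_SIZE = 1024 * 1024  # 1 MiB, the per-value limit cited above for Kinetic


class DictStore:
    """Hypothetical in-memory stand-in for a key-value device (put/get only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = bytes(value)

    def get(self, key):
        return self._data.get(key)  # whole value or None, never a partial read


def chunk_key(name, index):
    # Hypothetical key scheme: "<file name>/<zero-padded chunk index>"
    return f"{name}/{index:08d}"


def store_file(store, name, data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and put each one under its own key."""
    for index, start in enumerate(range(0, len(data), chunk_size)):
        store.put(chunk_key(name, index), data[start:start + chunk_size])
    store.put(f"{name}/size", str(len(data)).encode())


class KVFile:
    """Read-only file-like view that emulates seek() on top of get()."""

    def __init__(self, store, name, chunk_size=CHUNK_SIZE):
        self.store, self.name, self.chunk_size = store, name, chunk_size
        self.size = int(store.get(f"{name}/size"))
        self.pos = 0

    def seek(self, offset, whence=io.SEEK_SET):
        base = {io.SEEK_SET: 0, io.SEEK_CUR: self.pos, io.SEEK_END: self.size}[whence]
        self.pos = max(0, min(self.size, base + offset))  # clamp to the file bounds
        return self.pos

    def read(self, n=-1):
        end = self.size if n < 0 else min(self.size, self.pos + n)
        out = bytearray()
        while self.pos < end:
            index, within = divmod(self.pos, self.chunk_size)
            chunk = self.store.get(chunk_key(self.name, index))  # fetch the whole chunk
            take = chunk[within:within + (end - self.pos)]
            if not take:  # missing or short chunk: stop rather than loop forever
                break
            out += take
            self.pos += len(take)
        return bytes(out)
```

Note that even a 10-byte read pulls at least one full chunk from the store, and a read that straddles a chunk boundary pulls two. That overhead is the price of layering byte offsets over whole-value gets.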
Another option is to write the IO library so that it uses an intermediate file system for seeks and other operations that key-value storage can't provide. When a file is opened, created, or otherwise accessed, a POSIX-compliant intermediate file system performs all of the operations required by the application. Then at certain points the data in the intermediate file system is "flushed" or copied to the key-value storage. Making sure the data in the key-value storage is consistent with the intermediate file system takes some careful design and coding. This isn't the easiest solution but it's somewhere between a non-POSIX IO library and a POSIX-compliant library.
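A rough sketch of the staging idea might look like the following. It assumes a local directory plays the role of the intermediate file system and that a flush rewrites the whole value in a single put. The MemoryStore and StagedFile classes are hypothetical, and the sketch ignores per-value size limits, concurrent writers, and crash recovery, which is exactly where the careful design work comes in.

```python
import os
import tempfile


class MemoryStore:
    """Hypothetical in-memory stand-in for a key-value device (put/get only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = bytes(value)

    def get(self, key):
        return self._data.get(key)


class StagedFile:
    """Stage a value in a local POSIX file; copy it to the store on flush."""

    def __init__(self, store, key, staging_dir):
        self.store, self.key = store, key
        self.path = os.path.join(staging_dir, key.replace("/", "_"))
        # Seed the staging file with the current value, if there is one.
        with open(self.path, "wb") as f:
            f.write(store.get(key) or b"")
        # The application gets a normal file object: seek, partial writes, etc.
        self.file = open(self.path, "r+b")

    def flush_to_store(self):
        """Push the staged contents to the store as one whole-value put."""
        self.file.flush()
        with open(self.path, "rb") as f:
            self.store.put(self.key, f.read())

    def close(self):
        self.flush_to_store()
        self.file.close()
        os.remove(self.path)


# Usage: overwrite part of a value in place, then flush it back.
store = MemoryStore()
store.put("report", b"hello world")
with tempfile.TemporaryDirectory() as staging:
    staged = StagedFile(store, "report", staging)
    staged.file.seek(6)           # handled by the local file system
    staged.file.write(b"WORLD")   # the store still holds the old value here
    staged.close()                # the flush makes the store consistent again
```

Between the write and the flush, the store and the staging file disagree. Deciding when to flush, and what happens if the machine dies in that window, is the consistency problem described above.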
One area where key-value storage can work well is archive storage. Archive storage is where you park data that is rarely used but that you want to keep around for a period of time. This usage pattern means that applications are not likely to interact directly with archive storage, which removes the need for POSIX compatibility or a rewrite of the application. Really you just need tools to "put" the data into the archive and "get" it from the archive when you need it. These two operations, "put" and "get", map very well to key-value storage.
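As an illustration of how small such tools can be, here is a sketch of a "put" and a "get" for whole files. It assumes each file fits in a single value and stores a SHA-256 digest under a companion key so that retrieval can be verified. The MemoryStore class, the function names, and the ".sha256" key suffix are hypothetical choices for this example only.

```python
import hashlib
from pathlib import Path


class MemoryStore:
    """Hypothetical in-memory stand-in for a key-value archive (put/get only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = bytes(value)

    def get(self, key):
        return self._data.get(key)


def archive_put(store, key, source_path):
    """Copy a whole file into the archive, plus a digest under a companion key."""
    data = Path(source_path).read_bytes()
    store.put(key, data)
    store.put(key + ".sha256", hashlib.sha256(data).hexdigest().encode())
    return len(data)


def archive_get(store, key, dest_path):
    """Fetch a whole file from the archive and verify it before writing it out."""
    data = store.get(key)
    if data is None:
        raise KeyError(key)
    expected = store.get(key + ".sha256")
    actual = hashlib.sha256(data).hexdigest().encode()
    if expected is not None and actual != expected:
        raise ValueError(f"checksum mismatch for {key}")
    Path(dest_path).write_bytes(data)
    return len(data)
```

Neither function needs a seek, a file pointer, or a partial write, which is why the archive use case sidesteps the POSIX mapping problem entirely.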
As I indicated earlier, it's not easy to create a POSIX-compliant file system using key-value storage because of the difficulty in mapping all of the POSIX IO functions to the simple functions of key-value storage. However, that doesn't mean it's impossible - merely difficult. One way to achieve this is to take existing object-based file systems and adapt them to key-value storage. Ceph is a good example of this.
Ceph can present storage to clients as object storage, as block storage, or as a file system. The storage layer inside Ceph that backs all three is object based. Version 0.80 (Firefly) of Ceph had some experimental support for a key-value OSD (Object Storage Device). So it is definitely possible to create a key-value storage solution that isn't just an archive. It doesn't have to have the limited performance of archive storage either. Other file systems with good performance that could be adapted to use key-value storage are Lustre and GlusterFS. Performance does not have to be a limiting factor for key-value storage.
Key-value Storage: Summary
Key-value storage is quickly becoming a popular storage technology. The key-value pair is a fundamental data representation in many computer languages and is used in a great number of database tools. It's a very convenient mechanism for storing data. The large capacity but lower performance storage world is abuzz with key-value storage concepts. But key-value storage isn't limited to archive or lower performing storage. On the contrary, it can be used for faster performing storage, as exemplified by Ceph.
As an example of what you can do with key-value storage and how simple it can be, Seagate has created a new storage drive called Kinetic that you address using REST-like commands such as get, put, and delete. A simple open-source library then allows you to develop IO libraries so that applications can perform IO to and from the drives. Some object storage solutions such as Swift have already been ported to use the Kinetic drives. Ceph is also developing a version that can use Kinetic drives. Other object-based storage systems such as Lustre and Gluster could theoretically use this technology as well.
Keep an eye on key-value storage. It could be coming to a file system near you.