Storage security is one of those big issues that keep IT managers up at night, worrying about whether their data is safe from bad guys hoping to steal everything from credit card and Social Security numbers to bank accounts, corporate secrets or even celebrity passport files.
In a perfect world, everyone wants information to be encrypted from the time it is created until the time it is destroyed. They also want data locked down so that every file can be accessed only by those who are authorized. It would be nice if all systems supported biometric access that no one could spoof the way you see in spy movies. Of course, none of this is really possible without extraordinary work.
But with data breaches growing as fast as data itself, it’s a good time to explore some of the larger challenges to supporting total data security. Of course, exploring and having an end-to-end solution are two different things, but if we get an idea of the extent and complexity of the problem, we at least have a start.
Key Management
Key management is always at the top of the list of challenges, and the reason is simple: It is really hard to have consistent and secure key management across files, storage networks, disk drives, tapes and all of the other potential devices. Key management requirements differ between industries and governments. Different organizations, and even groups within an organization, require different levels of security, and each level requires different types of security policies.
Some places might want a model where a quorum of people is required to change keys (you had better not have a quorum of those people in the same place at one time, or control of your data might be lost). Other places give a single person (not a good idea) or multiple people the same key control privileges.
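To make the quorum model concrete, here is a minimal sketch of a k-of-n approval check for a key change. The class name, the key identifier, the custodian names and the 3-of-5 policy are all hypothetical and chosen only for illustration; a real key manager would also authenticate each custodian and log every approval.

```python
from dataclasses import dataclass, field


@dataclass
class KeyChangeRequest:
    """Hypothetical request to change a key, gated by a quorum of custodians."""
    key_id: str
    custodians: frozenset   # people allowed to approve key changes
    quorum: int             # distinct approvals required
    approvals: set = field(default_factory=set)

    def approve(self, custodian: str) -> None:
        # Only named custodians may approve; anyone else is rejected.
        if custodian not in self.custodians:
            raise PermissionError(f"{custodian} is not a key custodian")
        # A set ignores repeats, so one person cannot approve twice.
        self.approvals.add(custodian)

    def is_authorized(self) -> bool:
        return len(self.approvals) >= self.quorum


# Assumed policy for this example: any 3 of 5 custodians.
request = KeyChangeRequest(
    key_id="tape-pool-7",
    custodians=frozenset({"alice", "bob", "carol", "dave", "erin"}),
    quorum=3,
)
request.approve("alice")
request.approve("bob")
request.approve("alice")            # repeat approval, not counted again
after_two = request.is_authorized()
request.approve("carol")
after_three = request.is_authorized()
print(after_two, after_three)
```

The single-person model is the same check with the quorum set to one, which is exactly why it is not a good idea.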
You also have the issue of levels of security. Right now, you have different key management frameworks for tapes and disk drives, not to mention other things in the data path. There are groups working on key management that are trying to round up and herd the cats. The IEEE (https://siswg.net) is one of the groups, and last year at the IEEE Mass Storage Conference they held a key management summit. Key management is a problem that is going to be with us for a while.
Performance Issues
One of the biggest challenges with encryption is the performance hit. Naturally, the cost depends on where you encrypt and what method you use. For example, there is no performance cost for encrypting data with LTO-4 encryption hardware or with the Seagate Cheetah drive, and the reason is pretty simple: Both devices use hardware encryption that can run at the rate of the devices.
Some devices encrypt inline in the storage network. They might run at the full rate of the attached devices, but that might not be good enough for tape. If you encrypt before you compress, you eliminate the possibility of compression, because encrypted data looks random; you need to compress first, then encrypt. The question then arises whether the inline device can run at full rate with both compression and encryption for tape drives. With tape drives getting faster and faster, this is often difficult to do at a reasonable cost.
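The ordering problem is easy to see in a small experiment. The sketch below uses Python's zlib for compression and a toy XOR stream cipher built from SHA-256 as a stand-in for real encryption. The cipher, the key and the sample data are assumptions made for illustration; the toy cipher is not secure and is not what any tape drive or inline appliance uses.

```python
import hashlib
import zlib


def toy_stream_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Illustration only, NOT secure."""
    out = bytearray(len(data))
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        keystream = hashlib.sha256(key + counter).digest()
        for i, byte in enumerate(data[start:start + 32]):
            out[start + i] = byte ^ keystream[i]
    return bytes(out)


key = b"hypothetical-demo-key"
# Highly repetitive sample data, the kind that compresses well.
plaintext = b"backup record 0001: status=OK; " * 2000

# Right order: compress first, then encrypt the smaller result.
compress_then_encrypt = toy_stream_encrypt(key, zlib.compress(plaintext))

# Wrong order: the ciphertext looks random, so zlib finds nothing to squeeze.
encrypt_then_compress = zlib.compress(toy_stream_encrypt(key, plaintext))

print("original bytes:       ", len(plaintext))
print("compress then encrypt:", len(compress_then_encrypt))
print("encrypt then compress:", len(encrypt_then_compress))

# XOR with the same keystream reverses the toy cipher, so the data round-trips.
restored = zlib.decompress(toy_stream_encrypt(key, compress_then_encrypt))
```

With data like this, the compress-first result is a small fraction of the original, while the encrypt-first result is no smaller than the input. That is why an inline device has to handle both steps, in that order, at tape speed.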
Now what about the server side? I am aware of a number of HSM applications that can do a checksum of the file when reading or writing data to tape. I have also seen that when this checksum is turned on, the tape performance drops significantly.
We all know that calculating a checksum is far less complex than running an encryption algorithm. Complex encryption algorithms are computationally intensive, and commodity CPUs are not likely the best place to run them. They are best suited to specialized hardware such as an ASIC or some of the faster FPGAs. CPUs are getting faster and faster, but memory bandwidth is not growing at the same rate as CPU performance. Systems that do a lot of I/O (databases, video and audio applications and backup systems) must move data in and out of memory to encrypt and decrypt it, and they will not likely be able to keep up with that load while still meeting their I/O requirements.
Standards
While there may be a standards group working on key management, that does not address, much less solve, the problem. There is no group trying to address the security and key management needs for the whole data path.
Since the encryption framework needs to be able to define encryption within the file system as the minimum starting point (I am aware of some groups that want to encrypt the data in memory and decrypt it before processing), a framework needs to be created that supports multiple levels of encryption for each file within a file system and allows provenance to be tracked and maintained for the file throughout its life.
Along with this type of maintenance, other things about long-term ownership must be addressed. For example, if the file is encrypted by a user but can be used by other users, how can the other users be authenticated? What happens to the file if the original person that created the file departs under less than ideal circumstances? How can the file be decrypted and given to another user for ownership? Who has that responsibility, and what security mechanisms are available to ensure that the file’s ownership is moved without giving access to the people doing the move? In an encrypted world, just because you have the root password does not mean that you can see everything.
These are just some of the problems that are going to happen with file systems. I am sure that you can imagine in a totally secure world that there are going to be problems with the storage network, the storage controller and storage devices. The management framework is a difficult problem.
All Together Now
Encryption is a difficult problem, given the performance issues and the management complexity. For encryption to work well and be manageable, it is going to require more than just addressing the key management problem, though that would be a great first step. It is going to require having a standards-based framework from the user down to the device. Different standards bodies will need to work together. That means that groups such as The OpenGroup (POSIX), IETF, ANSI T10, T11, T13 and others are going to have to all play nicely together in the SAN box, and if the future of storage is now FCoE, we have to add Ethernet to the mix.
Encryption today is addressing only point problems: Encrypt the tapes so we can ship them without worrying about the data; encrypt the disk drives so that if they are removed they cannot be read. These solutions solve real-world problems where people have lost data and it has cost a great deal. They would not come close to addressing the problem TJX had, where its systems were hacked and it lost millions of dollars.
The way I see it, we have two short-term major problems to overcome before encryption can begin to be implemented end-to-end. First, we need to figure out how to encrypt in the host without using the CPU, as current CPU technology is not fast enough to encrypt and decrypt at line rate while still providing enough computation capability for the real application.
And second, we need end-to-end standards that address encryption and key management for everything in the data path and everywhere the data could move. That means things like NFS, FTP, sockets and other data movement methods have to work. This is a tall order for the standards community and for the vendors that have to implement those standards.
I think we’re a long way off from a data secure world. Yes, some vendors will have vendor-specific solutions that work, but if we’re going to ever have a standards-based framework, we’re going to have to get everyone moving in the same direction.
Henry Newman, a regular Enterprise Storage Forum contributor, is an industry consultant with 28 years of experience in high-performance computing and storage.