Online Data Migration Performance: Preparing a Game Plan Page 2
An old approach to performing backups and data relocations is to do them at night, while the system is idle. As previously discussed, this does not help with many current applications, such as e-business, that require continuous operation and adaptation to quickly changing system and workload conditions. Bringing the whole system (or parts of it) offline is often impractical because of the substantial enterprise costs it incurs. Perhaps surprisingly, true online migration and backup are still in their infancy. However, existing logical volume managers, such as the HP-UX Logical Volume Manager (LVM) and the Veritas Volume Manager (VxVM), have long been able to provide continuing access to data while it is being migrated. This is achieved by creating a mirror of the data to be moved, with the new replica placed where the data is to end up. The mirror is then silvered (the replicas are made consistent by having the primary copy bring the new copy up to date; the term alludes to the silver applied to the back of a real mirror), after which the original copy can be disconnected and discarded. An online data migration architecture (ODMA) uses this trick, too. At present, however, no existing solution bounds the impact of an in-progress migration on client applications in terms that relate to their performance goals. Although VxVM provides a parameter (vol_default_iodelay) that is used to throttle I/O operations for silvering, it is applied regardless of the state of the client application.
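The mirror-and-silver procedure described above can be illustrated with a toy model. The following is only a sketch under simplifying assumptions: a volume is a Python list of blocks, silvering proceeds in block order, and the class and method names (MirroredVolume, silver_step, detach_source) are hypothetical and do not correspond to any LVM or VxVM interface.

```python
class MirroredVolume:
    """Toy model of a volume that is migrated by mirroring (one list slot = one block)."""

    def __init__(self, blocks):
        self.source = list(blocks)          # original copy, still serving clients
        self.target = [None] * len(blocks)  # new replica at the destination
        self.synced = 0                     # blocks [0, synced) are already silvered

    def write(self, index, value):
        """Client write during migration: always applied to the source, and
        mirrored to the target if that block has already been copied."""
        self.source[index] = value
        if index < self.synced:
            self.target[index] = value

    def read(self, index):
        """Client reads are served from the source while the mirror is silvered."""
        return self.source[index]

    def silver_step(self, count=1):
        """Copy the next `count` blocks to the target; return True once all are copied."""
        end = min(self.synced + count, len(self.source))
        self.target[self.synced:end] = self.source[self.synced:end]
        self.synced = end
        return self.synced == len(self.source)

    def detach_source(self):
        """Return the migrated data once both replicas are consistent."""
        if self.synced != len(self.source):
            raise RuntimeError("mirror is not fully silvered")
        return list(self.target)
```

In this model, a write to a block that has not been copied yet touches only the source and is picked up later by silver_step, while a write to an already-copied block goes to both replicas, so the two copies agree when the source is detached. How many blocks each silver_step call copies is exactly the knob that a throttle would control.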
High-end disk arrays provide restricted support for online data migration: the source and destination devices must be identical Logical Units (LUs) within the same array, and only global, device-level QoS guarantees, such as bounds on disk utilization, are supported. Some commercial video servers can re-stripe data online when disks fail or are added, and provide guarantees for the specific case of highly sequential, predictable multimedia workloads. An ODMA makes no assumptions about the nature of the foreground workloads or about the devices that comprise the SAN subsystem. It provides device-independent, application-level QoS guarantees. Existing storage management products can detect performance hot spots in the SAN and notify system administrators about them, but it is still up to humans to decide how best to solve the problem. In particular, there is no automatic throttling system that might address the root cause once it has been identified.
Although an ODMA eagerly uses excess system resources in order to minimize the duration of the migration, it is in principle possible to achieve zero impact on the foreground load by applying idleness-detection techniques, so that data is migrated only when the foreground load has temporarily stopped. An ODMA, by contrast, provides performance guarantees to applications (i.e., the "important tasks") by directly monitoring and controlling their performance.
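As a rough illustration of the idleness-detection alternative, the sketch below allows migration I/O only after the foreground load has been silent for a fixed interval. The timeout-based rule, the IdleDetector class, and the idle_threshold_s parameter are assumptions made for this example; real idleness detectors may use more elaborate predictors.

```python
class IdleDetector:
    """Timeout-based idleness detector (hypothetical helper for illustration)."""

    def __init__(self, idle_threshold_s):
        self.idle_threshold_s = idle_threshold_s  # required quiet time, in seconds
        self.last_foreground_io = None            # timestamp of the latest foreground I/O

    def record_foreground_io(self, timestamp_s):
        """Call whenever a client (foreground) I/O is observed."""
        self.last_foreground_io = timestamp_s

    def may_migrate(self, now_s):
        """Migration may proceed only if no foreground I/O has been seen
        for at least idle_threshold_s seconds."""
        if self.last_foreground_io is None:
            return True
        return now_s - self.last_foreground_io >= self.idle_threshold_s
```

A migration loop would call may_migrate before issuing each batch of copy I/Os. Under a workload that never pauses, such a detector never grants permission and the migration makes no progress, which is the trade-off against an approach that eagerly uses spare resources.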
There has been substantial work on fair scheduling techniques since their inception. In principle, it would be possible to schedule migration and foreground I/Os at the volume-manager level without relying on an external feedback loop. However, real-world workloads are complicated and have multiple, nontrivial properties such as sequentiality, temporal locality, self-similarity, and burstiness. How to assign relative priorities to migration and foreground I/Os under these conditions is an open problem.
For example, a simple 1-out-of-n scheme may work if the foreground load consists of random I/Os, but may cause much higher interference than expected if the foreground I/Os are highly sequential. Furthermore, any non-adaptive scheme is unlikely to succeed: application behaviors vary greatly over time, and failures and capacity additions occur very frequently in real systems. Fair scheduling based on dynamic priorities has worked reasonably well for CPU cycles, but priority computation remains an ad hoc craft, and the mechanical properties of disks, plus the presence of large caches, result in strongly nonlinear behaviors that invalidate all but the most sophisticated latency predictions.
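To make the 1-out-of-n idea concrete, here is a minimal sketch of a static interleaving scheduler. It assumes one particular reading of the scheme, namely that every n-th dispatched I/O is a migration I/O while both queues have work; the function name one_out_of_n and the list-based queues are hypothetical.

```python
from collections import deque


def one_out_of_n(foreground, migration, n):
    """Yield I/Os so that every n-th dispatched I/O is a migration I/O
    while both queues have work; drain whichever queue remains afterwards."""
    if n < 2:
        raise ValueError("n must be at least 2")
    fg, mig = deque(foreground), deque(migration)
    slot = 0
    while fg or mig:
        slot += 1
        # Migration gets the slot on every n-th turn, or whenever the
        # foreground queue is empty (the scheduler is work-conserving).
        take_migration = mig and (slot % n == 0 or not fg)
        yield mig.popleft() if take_migration else fg.popleft()
```

For instance, with five foreground I/Os f1..f5, three migration I/Os m1..m3, and n = 3, the dispatch order is f1, f2, m1, f3, f4, m2, f5, m3. Note that the scheduler is fixed in advance and knows nothing about sequentiality or caching, so it cannot react when the interleaved migration I/Os disturb a sequential foreground stream.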
Recently, control theory has been explored in several computer system projects. For example, control theory was used to develop a feedback control loop that guarantees the desired network packet rate in a distributed visual tracking system. Control theory was also applied to analyze a congestion control algorithm on IP routers. While these works apply control theory to computing systems, they focus on managing network bandwidth rather than the performance of end servers.
Feedback control architectures have also been developed for Web servers and e-mail servers. In the area of CPU scheduling, a feedback-based CPU scheduler was developed that synchronizes the progress of the consumer and supplier processes of buffers. Scheduling algorithms based on feedback control were also developed to provide deadline miss ratio guarantees to real-time applications with unpredictable workloads. Although these approaches show clear promise, they do not guarantee I/O latencies to applications, nor do they address the SAN subsystem, which is the focus of an ODMA.
Finally, a feedback-based Web cache manager can achieve differentiated cache hit ratios by adaptively allocating storage space to user classes. However, this work also does not address I/O latency or data migration in SANs.
Summary and Conclusions
The focus in this article has been on providing latency guarantees, because bounds on latency are considerably harder to enforce than bounds on throughput: a technique that can bound latency would have little difficulty bounding throughput. The primary beneficiaries of QoS guarantees are customer-facing applications, for which latency is a primary criterion.
The main contribution of this article is a novel, control-theoretic approach to achieving these requirements. An ODMA adaptively tries to consume as much as possible of the system resources left unused by client applications, while statistically avoiding QoS violations. It does so by dynamically adjusting the speed of data migration to maintain the desired QoS goals while maximizing the achieved data migration rate, using periodic measurements of the SAN's performance as perceived by the client applications. It guarantees that the average I/O latency throughout the execution of a migration will be bounded by a pre-specified QoS goal. If desired, it could be extended straightforwardly to also bound the number of sampling periods during which the QoS goal is violated. In practice, however, it did a reasonably effective job of limiting such violations without explicitly including this requirement, and adding the requirement would likely reduce the achieved data migration rate, possibly more than would be beneficial.
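The feedback idea summarized above (measure client latency each sampling period, then raise or lower the migration speed) can be sketched as a simple integral controller. This is not the controller described in the underlying research; the control law, the gain value, and all names (MigrationRateController, latency_goal_ms, max_rate) are assumptions chosen for illustration.

```python
class MigrationRateController:
    """Simple integral controller for migration speed (illustrative sketch only)."""

    def __init__(self, latency_goal_ms, gain, max_rate):
        self.latency_goal_ms = latency_goal_ms  # QoS goal for client I/O latency
        self.gain = gain          # hypothetical tuning knob: rate change per ms of error
        self.max_rate = max_rate  # upper limit on migration I/Os per second
        self.rate = 0.0           # current migration rate, starts fully throttled
        self.periods = 0          # sampling periods observed so far
        self.violations = 0       # periods in which the latency goal was exceeded

    def update(self, measured_latency_ms):
        """Feed one sampling period's average client latency; return the new rate."""
        self.periods += 1
        if measured_latency_ms > self.latency_goal_ms:
            self.violations += 1
        # Positive error means spare headroom: speed up. Negative: slow down.
        error = self.latency_goal_ms - measured_latency_ms
        self.rate = min(self.max_rate, max(0.0, self.rate + self.gain * error))
        return self.rate

    def violation_fraction(self):
        """Fraction of sampling periods that violated the latency goal."""
        return self.violations / self.periods if self.periods else 0.0
```

With a goal of 10 ms and a gain of 5, measured latencies of 6, 8, 14, and 20 ms produce rates of 20, 30, 10, and 0 in turn: the controller speeds up while there is headroom and backs off, eventually to a full stop, when clients are hurt. The violation_fraction counter only reports violations; bounding them would require a different control loop.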
Finally, potential future work items in online data migration performance include a more general implementation that interacts with performance monitoring tools. This also includes developing a low-overhead mechanism for finer-grain control of the migration speed, making the controller self-tuning to handle different categories of workloads, and implementing a new control loop that can simultaneously bound latencies and violation fractions.
[The preceding article is based on material provided by ongoing research at the Department of Computer Science, University of Virginia and the Storage and Content Distribution Department, Hewlett-Packard Laboratories, 1501 Page Mill Road, Palo Alto, CA 94304 on the "Aqueduct" online data migration tool. Additional information for this article was also based on material contained in the following white paper:
"Aqueduct: online data migration with performance guarantees" by Chenyang Lu, Guillermo A. Alvarez and John Wilkes available at: http://www.hpl.hp.com/personal/John_Wilkes/papers/FAST2002-aqueduct.pdf]
John Vacca is an information technology consultant and internationally known author based in Pomeroy, Ohio. Since 1982 Vacca has authored 39 books and more than 480 articles in the areas of advanced storage, computer security and aerospace technology. Vacca was also a configuration management specialist, computer specialist, and the computer security official for NASA's space station program (Freedom) and the International Space Station Program, from 1988 until his early retirement from NASA in 1995.