Computer Science Department
School of Computer Science, Carnegie Mellon University


A Practical Approach to Replication
of Abstract Data Objects

Joshua J. Bloch

May 1990

Ph.D. Thesis


Keywords: Replication, data structures, algorithms, distributed systems, weighted voting, fault tolerance, highly available systems, optimistic timestamps, optimistic two-stage protocols, replicated sparse memory, transaction systems

There is a great need for computer systems that remain available with high probability at all times. Highly available systems can be implemented on networks of general purpose computers by replicating data: storing the data redundantly at two or more of the nodes comprising the system. Some replication protocol is necessary to control access to the replicas. In essence, the replication protocol orchestrates the replicas to form a single distributed data object. If a replicated data object is to be used in an application where data consistency is required, the replicated object must display the same semantics as its serially accessed, single-site counterpart. It is difficult to design replication protocols that combine one-copy serializability with high = performance.

This dissertation describes an architecture that provides efficient, easy-to-use replicated implementations for a wide variety of useful data types, including directories, record files with secondary indices on selected fields, and priority queues. The data objects display single-copy serial semantics and provide high availability and concurrency. The architecture is relatively easy to implement as it derives its recovery and concurrency control properties from the support of an underlying distributed transaction system. A fairly complete prototype implementation of the architecture was built on top of the Camelot system. Experiments were performed to evaluate its performance.

The heart of the architecture is a family of efficient replication protocols that implement a class of table-like data objects called replicated sparse memories or RSMs. The replication protocols are based on Gifford's weighted voting technique. An underlying structural property of the RSM that allows efficient implementation of all its operations is proven. Simulation results are presented that suggest RSMs are time and space efficient in a wide variety of configurations. A Markov model of the RSM is constructed and analyzed. The analysis indicates that RSMs are time and space efficient in all configurations for all random operation mixes.

This dissertation introduces the concept of optimistic two-stage protocols, a new technique for reducing communication costs in a broad class of distributed algorithms. The architecture makes heavy use of optimistic two-stage protocols. In particular, optimistic timestamps are used to speed up blind writes.

180 pages

Return to: SCS Technical Report Collection
School of Computer Science homepage

This page maintained by