CMU-CS-99-111
Computer Science Department
School of Computer Science, Carnegie Mellon University
Scalable Concurrency Control and Recovery for Shared Storage Arrays
Khalil Amiri*, Garth Gibson, Richard Golding**
February 1999
Keywords: Input/output devices, concurrency, storage management,
reliability, distributed systems
Shared storage arrays enable thousands of storage devices to be shared
and directly accessed by end hosts over switched system-area networks,
promising databases and filesystems highly scalable, reliable
storage. In such systems, however, concurrent host I/Os can span
multiple shared devices and access overlapping ranges, potentially
leading to inconsistencies in redundancy codes and in data read by
end hosts. In order to enable existing applications to run unmodified
and simplify the development of future ones, we desire a shared
storage array to provide the illusion of a single controller without
the scalability bottleneck and single point of failure of an actual
single controller. In this paper, we show how rapidly increasing
storage device intelligence coupled with storage's special
characteristics can be successfully exploited to arrive at a high
performance solution to this storage management problem. In
particular, we examine four concurrency control schemes and specialize
them to shared storage arrays: two centralized schemes, simple server
locking and server locking with leased callbacks; and two distributed
schemes based on device participation, distributed locking using
storage-device-embedded lock servers and timestamp ordering using
loosely synchronized clocks. Simulation results show that both
centralized locking schemes suffer from scalability
limitations. Moreover, callback locking is particularly suspect if
applications do not have much inherent locality and if the storage
system introduces false sharing. Distributed concurrency control with
device support is attractive as it scales control capacity with
storage and performance capacity and offers the opportunity to
piggyback lock/ordering messages on operation requests, eliminating
message latency costs. Simulations show that both storage-optimized
device-based protocols exhibit close to ideal scaling, achieving
90-95% of the throughput possible under totally unprotected
operation. Furthermore, timestamp ordering uses fewer network
resources, is free from deadlocks, and has performance advantages under
high load. We show how timestamp ordering can be extended with careful
operation history recording to ensure efficient failure recovery
without inducing I/Os under normal operation. This brings the overhead
of concurrency control and recovery to a negligible few percent,
thereby realizing the scalability potential of the shared array I/O
architecture.
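The device-based timestamp ordering scheme summarized above can be sketched roughly as follows. This is a minimal illustration under assumed names, not the paper's implementation: each device tracks the largest read and write timestamps it has accepted per block and rejects requests that arrive out of timestamp order, while hosts stamp each multi-device operation once using their loosely synchronized clocks so every device orders overlapping I/Os identically.

```python
import time


class TimestampOrderingDevice:
    """Sketch of one storage device acting as its own ordering authority.

    Class and method names are illustrative assumptions; the real
    protocol also handles redundancy-code updates and recovery.
    """

    def __init__(self):
        # Per-block high-water marks: largest accepted read/write timestamps.
        self.read_ts = {}   # block -> largest timestamp of an accepted read
        self.write_ts = {}  # block -> largest timestamp of an accepted write

    def read(self, block, ts):
        # A read stamped ts must not observe a block that a logically
        # later write has already updated.
        if ts < self.write_ts.get(block, 0):
            return False  # reject; host retries with a fresh timestamp
        self.read_ts[block] = max(self.read_ts.get(block, 0), ts)
        return True

    def write(self, block, ts):
        # A write stamped ts must not clobber data that a logically
        # later read or write has already observed or produced.
        if ts < self.read_ts.get(block, 0) or ts < self.write_ts.get(block, 0):
            return False
        self.write_ts[block] = ts
        return True


def host_timestamp():
    # With loosely synchronized clocks, a host stamps an operation once
    # and presents that same timestamp at every device the I/O spans,
    # so no separate lock messages are needed: the ordering check can
    # piggyback on the operation request itself.
    return time.time()
```

Because the accept/reject check travels with the operation request, there is no extra lock-message round trip, which is the latency advantage the abstract attributes to the device-based protocols.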
24 pages
*Department of Electrical and Computer Engineering, Carnegie Mellon University
**Hewlett-Packard Laboratories