CMU-CS-13-126 Computer Science Department School of Computer Science, Carnegie Mellon University
Algorithmic Engineering Towards Bin Fan December 2013 Ph.D. Thesis
First, to improve the worst-case throughput of a FAWN-KV system, we propose a randomized load balancing scheme that can fully utilize all the nodes regardless of their query distribution. We analytically prove and empirically demonstrate that deploying a very small but ex- tremely fast load balancer at FAWN-KV can effectively prevent uneven or dynamic workloads creating hotspots on individual nodes. Moreover, our analysis provides service designers a mathematically tractable ap- proach to estimate the worst-case throughput and also avoid drastic over- provisioning in similar distributed key-value systems. Second, to implement the high-speed load balancer and also to improve the space efficiency of individual key-value nodes, we propose novel data structures and algorithms, including the cuckoo filter, a Bloom filter replacement that is high-speed, highly compact and delete-supporting, and optimistic cuckoo hashing, a fast and space-efficient hashing scheme that scales on multiple CPUs. Both algorithms are built upon conventional cuckoo hashing but are optimized for our target architectures and workloads. Using them as building blocks, we design and implement MemC3 to serve transient data from DRAM with high throughput and low-latency retrievals, and SILT to provide cost-effective access to persis- tent data on flash storage with extremely small memory footprint (e.g., 0.7 bytes per entry).
134 pages | |
Return to:
SCS Technical Report Collection This page maintained by reports@cs.cmu.edu |