|   | CMU-CS-14-143R Computer Science Department School of Computer Science, Carnegie Mellon University 
 
 
Queueing with Redundant Requests: 
Kristen Gardner, Samuel Zbarsky*, Sherwin Doroudi**, January 2015 
Supercedes CMU-CS-14-143 
 Recent computer systems research has proposed using redundant requests to reduce latency. The idea is to run a single request on multiple servers and only wait for the first completion (discarding all remaining instances of the request). However no exact analysis of systems with redundancy exists. This paper presents the first exact analysis of systems with redundancy. We allow for any number of classes of redundant requests, any number of classes of non-redundant requests, any degree of redundancy, and any number of heterogeneous servers. In all cases we derive the limiting distribution on the state of the system. In small (two or three server) systems, we derive simple forms for the distribution of response time of both the redundant classes and non-redundant classes, and we quantify the "gain"" to redundant classes and "pain" to non-redundant classes caused by redundancy. We find some surprising results. First, in many cases the response time of the redundant class follows a simple Exponential distribution and that of the non-redundant class follows a Generalized Hyperexponential. Second, once a class is fully redundant, it becomes "immune" to any pain caused by other classes becoming redundant. We also compare redundancy with other approaches for reducing latency, such as optimal probabilistic splitting of a class among servers (Opt-Split) and Join-the-Shortest-Queue (JSQ) routing of a class. We find that redundancy outperforms JSQ and Opt-Split with respect to overall response time, making it an attractive solution. 
 
29 pages  
 
 | 
| 
    Return to: 
	SCS Technical Report Collection This page maintained by reports@cs.cmu.edu | |