Computer Science Department
School of Computer Science, Carnegie Mellon University
Analyzing Response Time in the Redundnacy-d System
Kristen Gardner, Samuel Zbarsky*, Mark Velednitsky**
Redundancy is an important strategy for reducing response time in multi-server queueing systems that has been used in a variety of settings, but only recently has begun to be studied analytically. The idea behind redundancy is that customers can greatly reduce their response time by waiting in multiple queues at the same time, thereby experiencing the minimum time across queues. Redundancy has been shown to produce significant response time improvements in applications ranging from organ transplant waitlists to Google's BigTable service. However, despite the growing body of theoretical and empirical work on the benefits of redundancy, there is little work addressing the questions of how many copies one needs to make in order to achieve a response time benefit, and the magnitude of the potential gains.
In this paper we propose a model to evaluate these questions. Our model is called the Redundancy-d system; each incoming job makes copies at a constant number of servers, d, chosen at random. We derive the first exact expressions for mean response time in Redundancy-d systems with any finite number of servers, as well as expressions for the distribution of response time which are exact as the number of servers approaches infinity, provided an asymptotic independence assumption holds. Using our analysis, we show that the biggest response time improvement comes from having each job wait in only d = 2 queues.