|   | CMU-CS-23-128 Computer Science Department
 School of Computer Science, Carnegie Mellon University
 
    
     
 CMU-CS-23-128
 
Optimal Scheduling in Multiserver Queues 
Isaac Grosof 
Ph.D. Thesis 
July 2023  
CMU-CS-23-128.pdf
 
Keywords: 
Queueing theory; scheduling; optimal scheduling; response time; multiserver; tails; M/G/1; M/G/k; dispatching; Shortest Remaining Processing Time; SRPT; First-Come First-Served; FCFS; Nudge; Guardrails; work; Multiserver-job; ServerFilling; ServerFilling-SRPT; DivisorFilling; WCFS; Finite Skip; RESET; MARC; Gittins index; heavy-traffic; stochastic improvement; tail probability
 
 
Scheduling theory is a key tool for reducing latency (i.e. response time) in queueing systems. Scheduling, i.e. choosing the order in which to serve jobs, can reduce response time by an order of magnitude with no additional resources. Scheduling theory is well-developed in single-server systems, where one job is processed at a time. However, little is known about scheduling in multiserver systems, where many jobs are processed at once. Results are especially limited in stochastic multiserver scheduling theory. Today's datacenters have thousands of servers, and scheduling theory is unable to analyze such systems. 
This thesis proves the first optimality results and first closed-form bounds on
mean response time for scheduling policies in stochastic multiserver models which reflect the behavior of modern computing systems. The thesis explores three themes: 
 
I start by studying one-server-per-job multiserver models, and prove the first
results on optimal scheduling in that setting. Optimality results are proven for
both a central-queue model and a dispatching model. I invent a novel class of
dispatching policies, guardrails, to achieve these results.
Next, I study the multiserver-job (MSJ) model, where different jobs require
different amounts of resources to be served. I prove the first characterization
of mean response time for any scheduling policy in the MSJ model, as well as
the first optimality results. I invent novel scheduling policies, ServerFilling and
ServerFilling-SRPT, to achieve these results.
Finally, I study the effects of scheduling on the tail of response time, rather than
mean response time. The prior state-of-the-art for scheduling for the tail was
First-Come First-Served, which was conjectured to achieve optimal asymptotic
tail of response time. I invent a novel scheduling policy, Nudge, which I prove
to be the first policy to outperform FCFS's asymptotic tail of response time.
 
325 pages
Thesis Committee:Mor Harchol-Balter (Chair)
 Alan Scheller-Wolf
 Anupam Gupta
 Weina Wang
 Michael Mitzenmacher (Harvard University)
 
Srinivasan Seshan, Head, Computer Science DepartmentMartial Hebert, Dean, School of Computer Science
 
 |