Computer Science Department
School of Computer Science, Carnegie Mellon University
Scheduling Solutions for Coping with Transient Overload
Nikhil Bansal, Mor Harchol-Balter
This report supercedes CMU-CS-00-179.
In this paper we investigate system behavior under transient overload. We find that the poor behavior of systems under transient overload can at least partly be attributed to the scheduling policy traditionally used in systems. The traditionally-used scheduling policy is Processor-Sharing, or time-sharing, (PS). We derive analytical approximations as well as simulation results for the performance of a single PS queue under transient overload. Simulation and analysis agree. The number of jobs in the system and the system response times grows quite rapidly during overload, and even when the overload period ends, recuperation is very slow.
We propose a new solution for coping with transient overload: SRPT scheduling of jobs (Shortest Remaining Processing Time). We derive analytical approximations for the performance of a single SRPT queue under transient overload, and validate those approximations with simulation.
We evaluate our PS and SRPT approximations under a realistic job size distribution, a Bounded Pareto with a heavy-tailed property. We find that SRPT performs an order of magnitude better with respect to mean response time and mean queue length.
While SRPT might not seem like the best choice for large jobs, particularly under overload, it turns out that under our realistic workload big jobs do not perform worse under SRPT as compared with PS in expectation. We give intuition for this. Finally we pose some interesting open questions on the topic of starvation of large jobs.