Computer Science Department
School of Computer Science, Carnegie Mellon University


Scheduling Solutions for Coping with Transient Overload

Nikhil Bansal, Mor Harchol-Balter

May 2001

This report supercedes CMU-CS-00-179.

Keywords: Overload, HIGH/LOW, scheduling, processor sharing, shortest processing remaining time, time-sharing, SRPT, priority, unfairness, starvation, heavy-tailed behavior, high variance

For most computer systems, even short periods of overload degrade performance significantly. The number of jobs in the system quickly grows, often exceeding the capacity of the system within just seconds, and response times explode.

In this paper we investigate system behavior under transient overload. We find that the poor behavior of systems under transient overload can at least partly be attributed to the scheduling policy traditionally used in systems. The traditionally-used scheduling policy is Processor-Sharing, or time-sharing, (PS). We derive analytical approximations as well as simulation results for the performance of a single PS queue under transient overload. Simulation and analysis agree. The number of jobs in the system and the system response times grows quite rapidly during overload, and even when the overload period ends, recuperation is very slow.

We propose a new solution for coping with transient overload: SRPT scheduling of jobs (Shortest Remaining Processing Time). We derive analytical approximations for the performance of a single SRPT queue under transient overload, and validate those approximations with simulation.

We evaluate our PS and SRPT approximations under a realistic job size distribution, a Bounded Pareto with a heavy-tailed property. We find that SRPT performs an order of magnitude better with respect to mean response time and mean queue length.

While SRPT might not seem like the best choice for large jobs, particularly under overload, it turns out that under our realistic workload big jobs do not perform worse under SRPT as compared with PS in expectation. We give intuition for this. Finally we pose some interesting open questions on the topic of starvation of large jobs.

27 pages

Return to: SCS Technical Report Collection
School of Computer Science homepage

This page maintained by