Computer Science Department
School of Computer Science, Carnegie Mellon University


Latency-Hiding Work Stealing

Stefan K. Muller, Umut A. Acar

July 2016
Updated April 2017


This is an updated version of
Computer Science Technical Report CMU-CS-16-112

A version of this work appears in the
Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '16).

Keywords: Latency, lantency hiding, parallel computing, work stealing

With the rise of multicore computers, parallel applications no longer consist solely of computational, batch workloads, but also include applications that may, for example, take input from a user, access secondary storage or the network, or perform remote procedure calls. Such operations can incur substantial latency, requiring the program to wait for a response. In the current state of the art, the theoretical models of parallelism and parallel scheduling algorithms do not account for latency.

In this work, we extend the dag (Directed Acyclic Graph) model for parallelism to account for latency and present a work-stealing algorithm that hides latency to improve performance. This algorithm allows user-level threads to suspend without blocking the underlying worker, usually a system thread. When a user-level thread suspends, the algorithm switches to another thread. Using extensions of existing techniques as well as new technical devices, we bound the running time of our scheduler on a parallel computation. We also briefly present a prototype implementation of the algorithm and some preliminary empirical findings.

35 pages

Return to: SCS Technical Report Collection
School of Computer Science

This page maintained by