Computer Science Department
School of Computer Science, Carnegie Mellon University
Practical Parallel Divide-and-Conquer Algorithms
Jonathan C. Hardwick
Specifically, I define the team parallel model, which has four main features: data-parallel operations within teams of processors, the subdivision of these teams to match the recursion of a divide-and-conquer algorithm, efficient single-processor code to exploit existing serial compiler technology, and an active load-balancing system to cope with irregular algorithms. I also describe Machiavelli, a toolkit for parallel divide-and-conquer algorithms that implements the team parallel model. Machiavelli consists of simple parallel extensions to C, and is based around a distributed vector datatype. A preprocessor generates both serial and parallel versions of the code, using MPI as its parallel communication mechanism to assure portability across machines. Load balancing is performed by shipping function calls between processors.
Using a range of algorithm kernels (including sorting, finding the convex hull of a set of points, computing a graph separator, and matrix multiplication), I demonstrate optimization techniques for the implementation of divide-and-conquer algorithms. An important feature of team parallelism is its ability to use efficient serial algorithms supplied by the user as the base case of recursion. I show that this allows parallel algorithms to achieve good speedups over efficient serial code, and to solve problems much larger than those which can be handled on one processor. As an extended example, a Delaunay triangulation algorithm implemented using the team-parallel model is three times as efficient as previous parallel algorithms.