Computer Science Department
School of Computer Science, Carnegie Mellon University
Deadlock Resolution in Pipelined Query Graphs
Vladislav Shkapenyuk, Ryan Williams,
Pipelining is a widely used technique that query execution engines employ to improve individual query execution times. In recently proposed settings, pipelining is used as an alternative to materialization to evaluate query plan graphs, where nodes in a query plan can have multiple parents. Such scenarios include shared table scans, runtime operator sharing, parallel sorting, and pipelined Multi-Query Optimization (MQO) plans. While pipelining in query graphs can increase performance, it can also lead to runtime deadlocks. The existing deadlock solutions focus on total deadlock prevention by statically deciding between pipelining and materialization, and apply only to MQO. The lack of runtime information leads to highly conservative decisions. Formally, this conservatism makes it NP-hard to find an optimal pipelining strategy that materializes a minimum cost set of nodes in a query graph.
We propose a novel dynamic scheme for detecting and resolving deadlocks in pipelined query graphs. Instead of statically determining what nodes to materialize, we pipeline every pipelinable operator in the query plan, resolving deadlock when it arises by materialization. At runtime, more information about producerconsumer behavior is available and as a result, the problem of detecting and resolving deadlock becomes polynomial time solvable, where each particular deadlock resolution is of minimum cost. Our techniques are applicable to any system that uses pipelined query graphs. An evaluation on TPC-H and Wisconsin benchmarks demonstrates cases where our approach eliminates all unnecessary materializations, and minimizes the overall query execution cost.