CMU-CS-21-101
Computer Science Department
School of Computer Science, Carnegie Mellon University




An Evaluation of Compilation-Based PL/PGSQL Execution

Tanuj Nayak

M.S. Thesis

February 2021

CMU-CS-21-101.pdf


Keywords: User Defined Functions, Compilation, Inlining

User Defined Functions (UDFs) are an important analytical feature in modern Database Management Systems (DBMSs) due to their server-side execution properties. These properties allow complex analytical queries to execute without serializing intermediate data over a network. However, query engines often incur significant overheads when executing UDFs because UDFs are non-declarative, unlike SQL queries. This contrast causes frequent context switching between UDF and SQL execution. As a given UDF invokes more SQL queries, these overheads become more noticeable. In this thesis, we investigate the extent to which compilation allows us to overcome such overheads. Compilation for executing SQL queries has become popular in database research in the past decade, especially in the context of main-memory DBMSs. It has been shown to deliver significant improvements to query execution performance. We compare the technique of compiling UDFs with query inlining, another recent UDF execution technique. To make this comparison, we implemented a UDF compilation framework in NoisePage, a main-memory compilation-based DBMS. In this framework, we compile UDFs into functions in a domain-specific language (DSL) and evaluate them against query inlining. We find that this framework has greater support across UDF language features than inlining frameworks and allows for more efficient functions. We also observe that our framework compiles functions into DSL primitives that are far more fine-grained and lightweight than most SQL operators. As a result, the SQL operators produced by the inlining approach incur a much larger performance overhead. On iteration-heavy benchmarks, the database system achieved performance gains from 2x to 120x with compilation relative to inlining.

66 pages

Thesis Committee:
Andy Pavlo (Chair)
Todd C. Mowry

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

