COMPUTER SCIENCE TECHNICAL REPORT ABSTRACTS

CMU-CS-97-108
Computer Science Department
School of Computer Science, Carnegie Mellon University

CMU-CS-97-108

Design and Implementation of Code Optimizations for a Type-Directed Compiler for Standard ML

David Tarditi

December 1997

Ph.D. Thesis

Keywords: Standard ML, optimization, compilation, functional programming, garbage collection, automatic storage management

The trends in software development are towards larger programs, more complex programs, and more use of programs as "component software." These trends mean that the features of modern programming languages are becoming more important than ever before. Programming languages need to have features such as strong typing, a module system, polymorphism, automatic storage management, and higher-order functions. In short, modern programming languages are becoming more important than ever before.

Even though modern programming lanuages are becoming more important than ever before, programmers have traditionally faced a dilemma: programs written in these languages traditionally have had lower performance than programs written in more conventional, but error-prone languages.

In this thesis, I study this problem in the context of one particular modern programming language, Standard ML. Standard ML contains all the language features mentioned previously and more. I use an empirical approach to understand where Standard ML programs spend their time and how to improve the performance of Standard ML programs through better optimization.

The thesis contains two main results. First, I find that a "pay-as-you-go" compilation strategy, where programmers pay for advanced language features only when they use them, is a practical strategy for compiling Standard ML. In fact, this strategy produces better code overall than a strategy that makes advanced language features run fast at the expense of slowing down programs that do not use those language features. Second, I find that compilers for Standard ML should focus on generating good code for the frequently-executed parts of programs. Specifically, just as compilers for conventional languages such as C focus on generating good code for loops, compilers for lanuages such as Standard ML should focus on generating good code for recursive functions.

These results suggest that compilation of modern programming languages such as Standard ML should have a great deal in common with compilation of more conventional languages such as C. First, Standard ML programs that do not use higher-order functions or polymorphism should run just as fast as comparable C programs. Second, Standard ML compilers should apply the same sets of optimizations to recursive functions that more conventional compilers apply to loops.

These results also suggest that programmers should be able to avoid the dilemma mentioned earlier: they should be able to write their programs in modern languages such as Standard ML, confident that they can rewrite parts of the programs in a subset of Standard ML if necessary for efficiency.

288 pages

*Department of Computer Science, University of Toronto, Toronto, Ontario, Canada, M5S 3G4.

Return to: SCS Technical Report Collection
School of Computer Science homepage

This page maintained by reports@cs.cmu.edu