Human-Computer Interaction Institute
School of Computer Science, Carnegie Mellon University


Asking and Answering Questions about the
Causes of Software Behavior

Amy J. Ko

May 2008

Ph.D. Thesis


Also appears as Computer Science Report

Keywords: Debugging, program understanding, Whyline, natural programming, end-user software engineering, reverse engineering, productivity, defect, fault, Alice, Java, Eclipse, program slicing, execution trace, instrumentation, Crystal

Program understanding accounts for the bulk of software development work. Unfortunately, little is known about why it is so difficult. To investigate this problem, multiple developer populations were observed debugging. These studies revealed that developers start by asking questions about program behavior, but must answer them by speculating about the code responsible. For example, a developer wondering, "Why didn't this button do anything after I pressed it?" must conceive of a potential explanation such as "Maybe because its event handler wasn't called" and then use breakpoint debuggers, print statements, and other low-level tools that instrument and analyze code to verify their explanation. The studies showed that not only is this process poorly supported by current tools, but also that developers form valid explanations on only 10-20% of their attempts.

A new kind of program understanding tool called a Whyline was developed, which allows a developer to ask "why did" and "why didn't" questions directly about a program's output. In response, the tool determines which parts of the system and its execution are related to the output, also identifying any false assumptions the developer might have about what occurred during the execution of the program. This interaction helps developers avoid speculating about the cause of output, instead allowing the Whyline to find code related to the output in question. Three prototypes were developed, supporting Alice (a programming language designed for building interactive 3D worlds), the Java programming language, and a word processing application for end users. In controlled experiments, all three prototypes significantly reduced time to complete debugging tasks (20-800% faster) and significantly increased success rates (by 20-200%).

345 pages
