CMU-ISR-21-101 Institute for Software Research School of Computer Science, Carnegie Mellon University
Improving Patch Quality by Enhancing Key Mauricio Soto February 2021
Ph.D. Thesis
Repairing errors in software systems has historically been a resource-consuming task that relies heavily on manual developer effort. Automatic program repair approaches have enabled the repair of software with minimal human interaction, mitigating the burden on developers, reducing the costs of manual debugging, and increasing software quality. However, a fundamental problem with current automatic program repair approaches is that they can generate low-quality patches that overfit to one program specification, as described by the guiding test suite, and do not generalize to the intended specification. This dissertation rigorously explores this phenomenon on real-world Java programs and describes a set of mechanisms to enhance key components of the automatic program repair process to generate higher quality patches. These mechanisms include an analysis of test suite behavior and the key characteristics of test suites for automatic program repair. We analyze the effectiveness of three well-known repair techniques, GenProg, PAR, and TrpAutoRepair, on defects made by the projects' developers during their regular development process, and we analyze the impact that modifying characteristics such as size, coverage, provenance, and number of failing test cases has on the quality of the produced patches. A second mechanism toward increasing patch quality describes a set of research questions aimed at analyzing developer code changes to inform the mutation operator selection distribution. We create a probabilistic model that describes how often human developers choose each of the different mutation operators available to automated repair techniques, and we later use this probabilistic model to create an APR approach informed by this distribution to generate higher quality patches.
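As a rough illustration of the second mechanism, the sketch below estimates an operator distribution from observed edits and then samples operators in proportion to it. It is only a minimal sketch under stated assumptions: the function names, the operator labels, and the observation counts are hypothetical and are not taken from the dissertation.

```python
import random
from collections import Counter


def operator_distribution(observed_edits):
    """Turn a list of observed operator names into relative frequencies."""
    counts = Counter(observed_edits)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observed edits to build a distribution from")
    return {op: n / total for op, n in counts.items()}


def choose_operator(distribution, rng):
    """Sample one operator with probability proportional to its frequency."""
    operators = list(distribution)
    weights = [distribution[op] for op in operators]
    return rng.choices(operators, weights=weights, k=1)[0]


# Hypothetical, made-up observations of developer edits (not real data).
observed = ["replace"] * 5 + ["append"] * 3 + ["delete"] * 2
dist = operator_distribution(observed)

# A seeded generator keeps the sampling reproducible.
rng = random.Random(0)
picks = Counter(choose_operator(dist, rng) for _ in range(1000))
```

A repair loop built this way would apply the more frequently observed operators more often than a uniform choice would, which is the general idea the paragraph above describes.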
Finally, the third mechanism describes a repair technique based on patch diversity as a means to increase the quality of the best-performing patch in a patch population, and an evaluation of patch consolidation as a mechanism to increase patch quality. Some of the main findings in this dissertation are:
134 pages
James D. Herbsleb, Director, Institute for Software Research