CMU-CS-13-112 Computer Science Department School of Computer Science, Carnegie Mellon University
Scaling Concolic Execution of Binary Pongsin Poosankam August 2013 Ph.D. Thesis
Concolic execution is a technique for program analysis that makes the values of certain inputs symbolic, symbolically executes a program's code, and computes a symbolic logical formula to represent a desired behavior of the program under analysis. The computed formula is then solved by a decision procedure to determine whether the desired behavior is feasible and, if so, provide an example program input that satisfies the formula. Concolic execution and similar techniques have widely been applied to a variety of security-related applications including automatic test input generation, vulnerability discovery, exploit generation, signature generation, protocol reverse engineering, and detecting deviations between software implementations. Although there has been a great success in applying it to various security-related applications, a basic implementation of concolic execution only works well on small programs and scaling it to real-world binary programs is difficult. One reason is that programs often contain certain code constructs that are difficult to reason about directly such as loops and encoding functions. Another reason is that the number of symbolic formulas grows drastically in proportion to the size of the program being analyzed. These observations led us to develop three scaling techniques for concolic execution. The first scaling technique, loop-extended concolic execution, focuses on improving the efficiency of concolic execution when analyzing program portions that involve loops. The second technique, decomposition and re-stitching of concolic execution, addresses the issue that arose from the presence of encoding functions, which are difficult to reason about automatically. The third technique uses the state model of the program under analysis to guide concolic execution. Our techniques work on program binaries and do not require the presence of source code or debugging information in the binaries. We apply our scaled concolic execution to a variety of security-related applications. For each of our scaling techniques, we demonstrate that they significantly improve the performance and usability of automatic test input generation and vulnerability discovery, which are previously known applications of concolic execution. We also study unexplored applications of concolic execution in security-related problems such as malware genealogy and protocol model inference.
146 pages | |
Return to:
SCS Technical Report Collection This page maintained by reports@cs.cmu.edu |