OP4: Rethinking Symbolic Execution

Jonas Wagner

Extended static checking and symbolic execution are related techniques. They differ principally in their handling of conditions inside the program. Symbolic execution follows a single path through the program, collecting the constraints encountered into a path condition. On the other hand, static checkers translate constraints into guards. This allows them to reason about many paths simultaneously. Guards can be different for each variable manipulated by the program, reflecting the fact that some variables depend on more constraints than others. Guards evolve as control flow splits and merges again. 

The two techniques can benefit from each other in at least three ways: 

(1) Being scalable but unsound, static checkers are good at finding potential bugs in large applications. Contrary to symbolic execution, though, they cannot provide example inputs that cause the bug to manifest itself (and prove that it is not a false positive). Thus, I propose to combine the scalability of static checkers with execution synthesis based on symbolic execution. For example, ESD1 is an existing tool with the capability of generating an execution (i. e. program input and a thread schedule) that causes a bug to happen. It takes as input a core dump containing limited information about the location of the bug and the program state at this point. Static checkers can potentially provide ESD with much more detailed information, making this a perfect symbiosis. 

(2) Calysto pioneered the concept of exploiting the program structure to improve scalability. These ideas might be applied to symbolic execution as well. In particular, I like the lazy evaluation of functions in Calysto. In symbolic execution, could we jump over a function instead of executing it? The return value of the function (and the other values that the function modifies) would be replaced by a placeholder as in Calysto. Symbolic execution would continue after the function call. When it encounters new conditions, it might be possible to determine their feasibility without knowledge of the exact function result; only when this knowledge is needed, the function needs to be executed. 

(3) The third way in which ideas from static checking could be applied to symbolic execution is related to path constraints. As stated in the introduction, symbolic execution always considers a single path through the program, whereas static checking can reason about multiple paths simultaneously. If symbolic execution would support guard expressions in the spirit of static checking, it would not need to fork at every encountered condition. Instead, it could choose to either fork or model the condition as a guard. Consider an ifstatement whose body only affects a single variable. It might be modeled by adding a guard to just this variable. On the other hand, an if-statement where the two branches lead to very different program flow can be more efficiently handled by forking the program state and updating the path conditions. 

I hope that these ideas will help to eventually make symbolic execution practical. Using these ideas and other optimizations, symbolic execution can become faster and more scalable without sacrificing its strengths, namely soundness and the capability to generate testcases. On the other hand, as seen in point (1) above, static checking can also benefit from symbolic execution.

1 C. Zamfir and G. Candea, Execution synthesis: a technique for automated software debugging, EuroSys 2010