Etienne Kneuss
For symbolic execution, the key property is reachability: we are looking at
path constraints to see if those reaching error conditions are satisfiable. For
this reason we will use as an abstract domain formulas that express the path
constraints. Given the target application of Astrée, the formulas need to be in
a theory that supports numerical operations, arrays, tupples and is decidable.
In order to faithfully represent computations and as is common in symbolic
execution engines, we will use bit-vectors. AI also requires us to define a
partial order on that abstract domain (typically a complete lattice). The
partial order on formulas is given by
As for the transfer functions, they are naturally defined given the abstract
domain and the semantic of each statement. For a language like C, the transfer
function for each statement is given by the corresponding formula supported by
the logic. We assume that the language is already transformed into the
equivalent SSA form. For instance the statement P will yield the postcondition (P ∧ x. Array assignments
can be handled conservatively as in SE. Arithmetic operations are translated
to the corresponding operation on bit-vectors. This has the advantage of
accounting precisely for overflows. The statement _{14} = y_{13})x obtained from the SSA transformation can be
naturally translated to _{18} := Φ(x_{12}, x_{14})P ∧ (x. Branching statements are handled by simply taking
the conjunction of the branch condition with the pre-condition. Another
interesting statement to consider is _{18} = x_{12} ∨ x_{18} = x_{14})assert(C). During the actual
fix-point computation, we treat them as assume(C) statements, and thus
we generate P ∧ C as post-condition. When the fix-point is
computed, we look at all the assert statements and invoke a solver for P
∧ ¬ C. Solving this formula and obtaining a class of inputs
satisfying it will indicate that the assertion may fail using such inputs. It
remains for us to exclude false positives by testing or applying additional
reasoning.
Note that this lattice is of infinite height, and consequently it does not
satisfy the finite ascending chains condition which is necessary to ensure the
termination of the fix-point algorithm in the presence of loops. We must thus
rely on a widening operator to enforce (and more generally speed up)
termination. Strictly speaking, symbolic execution would not terminate in the
presence of such loops as well, so implementing a strict symbolic execution
engine using AI would require us to let it in this ill state. If we decide to
diverge from it for the sake of termination, we can implement a widening
operator One of the big advantages of using AI to develop such a system is that it will allow us to trade precision for (fast) termination in the presence of loops. In many applications, loops are frequent and termination of the analysis is a desired property. Also, when AI terminates and none of the verification conditions generated from the assertions could be violated, we have a guarantee that no concrete execution can violate these assertions (ie. there are no false negatives). Whether it terminates and how fast it does it is directly dependent on the widening function used, which leaves a lot of room for fine-tuning. However, one of the main disadvantages of AI is that it can generate a lot of false positives. |