OP3: One Infinite Loop

Daniel Lupei

Having loops in a program that is to be symbolically executed increases the size of the execution tree exponentially. For instance, executing n iterations over the body of a loop that can generate k paths has the potential of producing up to kn paths. However, in some cases, these paths are very similar and their influence on the state of the application can be deduced using theorem proving support.
In the context of an application analysis that wants to prove that some property P holds for every execution path, we will first try to prove that P is an invariant of the loop. We will then symbolically execute only as many iterations as are necessary to affect the state of the application in a way that allows us to explore all the feasible paths starting with the first instruction after the loop. In other words, assuming that the code after the loop has nb branches whose conditions depend on data produced inside the loop, we will have to consider at most 2nb paths of the loop, in order to exercise all outcomes of those branches.
Depending on the type of analysis, proving that P is an invariant of the loop could be more or less complicated and is practical to do for properties like making sure that no out-of-bounds access happens. Once this stage is complete, one would have to identify the set of memory objects updated inside the loop WrSet(Loop) and intersect it with the set of memory objects referenced by the branch conditions following the loop, RdSet(Branches). Of course, WrSet(Loop) will be a symbolic set whose size and contents will probably depend on the driving factor of the loop, n. As we explore the paths after the loop, for every branch b and for every object o in WrSet(Loop)∩RdSet(b), we would have to determine the minimum number of iterations required such that the condition of the branch becomes true or false.
Let's consider the following example for an analysis that is trying to establish that no out of bounds access happens:

          A = Array[n];
          sum = 0;
          for i = 1 to n
              if A[i] >= 0 then sum += A[i];
              else sum += -A[i];
          if sum  < 0 then out_of_bounds access;
          else ...;

The first stage in our approach would be to prove that no out-of-bounds accesses take place inside the loop which is straightforward to do since AccSet(Loop) = {A[i]| i=1..n}.
In the second stage, since we have a single branch that depends on the WrSet(Loop), we will have to consider at most 2 paths from the total of 2n possible paths. We will then have to rely on a theorem prover to show that the sum of positive number can never be negative, and as a result there is no n such that the “out-of-bounds access” takes place.

While this approach will end up performing a comprehensive analysis across all possible behaviors of the program, it is limited by the power of the theorem prover to reason about the operations performed by the application across sets of objects. In the absence of such support, one could incrementally try iterating for a bounded number of times over the loop in the hope that this will be sufficient for exploring all the branches following it. In case of success the analysis can be declared complete, otherwise we cannot know for sure whether the unexplored paths are infeasible for any n or our bound was too small.