Grammar-based vs. Automated Whitebox Fuzz Testing

Vitaly Chipounov

The first difference between the automated fuzz testing approach and the grammar-based one is that the former associates symbolic values with bytes, while the latter pairs them with tokens. In other words, in the first approach, the constraints operate on bytes, while in the second approach the constraints follow a context-free grammar. Conversely, a concrete value for the first method is a byte, whereas it is some elementary syntactic element of the language described by a grammar in the second method (like a keyword, an identifier, or a parenthesis).

The second difference is the way in which constraint solving is done, i.e., how the new set of inputs is computed for each run of the program. Both methods execute the program with concrete inputs, while collecting the constraints along the executed path. In the automated fuzzing approach, after the program is run, the symbolic engine negates each of these constraints one by one and invokes a constraint solver that generates a new set of program inputs satisfying the new constraints. The grammar-based approach uses a solver that returns the intersection between the context-free grammar and the regular language of the path constraints collected during the run. This intersection forms a language from which the engine will derive a new input string.

The main advantage of automated fuzz testing compared to the grammar-based approach is that it does not require any manual intervention. It is enough to specify a program to be tested and wait for the test cases. The grammar-based approach requires the user to define the grammar describing the inputs the program under test is supposed to accept. This has to be done for each new program with a different input format. Moreover, it requires to identify the routine that parses the input, to make it handle symbolic tokens. However, grammar-based approaches can be much more effective with programs that have to parse their inputs. Indeed, a simple fuzzing approach will almost never go deep in the program, because most of the generated inputs will be rejected by the parser. Thus, a grammar-based approach is well suited to prune unnecessary code paths and to cope with the state explosion problem for this specific type of programs.

Despite its advantages, the drawback of automated fuzz testing is that it relies on a SAT solver. A SAT solver aims to find a satisfying assignment to a boolean expression. This problem is NP-complete, meaning that it is unlikely that an efficient polynomial time algorithm exists for it. In the worst case, the solver could need an exponential amount of time depending on the number of symbolic variables and constraints. This is in contrast to the solver of the grammar-based approach, which computes the intersection of two languages, which is easier to do with a guaranteed efficiency bound.