|
|
|
|
|
by timtadh
3525 days ago
|
|
It is actually called Delta Debugging and was pioneered by Andreas Zeller (https://www.st.cs.uni-saarland.de/dd/). His first paper on it ("Yesterday, my program worked. Today, it does not. Why?") used essentially the technique above to isolate the failure inducing change. Surprisingly, techniques like this work very well and doing better is sometimes very difficult especially for large programs. For instance, in theory, using "Concolic Execution" which combines symbolic execution, constraint solving, and concrete execution one can do better. However, such systems have their own limitations with respect to program complexity. See also the tremendous effectiveness of AFL (American Fuzzy Lop) http://lcamtuf.coredump.cx/afl/ . AFL uses a straight forward technique to mutate code. It doesn't use any fancy program analysis and has been able to find lots of bugs in real programs. |
|
Disclaimer: I've never used AFL myself but I've read articles about it with great interest (especially the crazyweird experiment when it created valid JPG images out of thin air, that was brilliant).
So during fuzzing, it finds lots "interesting" program-inputs (that cause crashes, bugs and weird behaviour). But it also has a different mode, where it tries to minimize these program-inputs to the essential parts that cause the behaviour. Since the bugs found are generated from (semi) randomly mutated inputs, the "interesting" inputs often also contain extraneous data that just happens to be there, but isn't relevant to the particular "interesting" behaviour found.
From what I understand, it uses this minimization mode after fuzzing, between sets of fuzzing runs (for good seed inputs to start with) and perhaps also during fuzzing (not sure). I read about it, but I forgot how it works exactly. It's probably explained on lcamtuf's site or AFL docs. I'm assuming it uses a similar method as proposed by dflock above, iterating deleting random stuff as long as the "interesting" behaviour remains.
I also wouldn't be surprised if non-AFL style fuzzer toolsets have similar test-input minimization tools. They also generate random inputs usually, just not AFL's clever pruning technique of keeping track of previously-seen program states to guide the random search to prefer novel states. Which I believe is the most revolutionary idea that makes AFL perform so uniquely well.