I was going to mention AFL as well, but for a different reason (unless I misunderstood your post).

Disclaimer: I've never used AFL myself, but I've read articles about it with great interest (especially the crazy, weird experiment where it created valid JPG images out of thin air, that was brilliant).

So during fuzzing, it finds lots of "interesting" program-inputs (ones that cause crashes, bugs and weird behaviour). But it also has a different mode, where it tries to minimize these program-inputs down to the essential parts that cause the behaviour. Since the bugs are found from (semi) randomly mutated inputs, the "interesting" inputs often also contain extraneous data that just happens to be there, but isn't relevant to the particular "interesting" behaviour found.

From what I understand, it uses this minimization mode after fuzzing, between sets of fuzzing runs (to get good seed inputs to start with) and perhaps also during fuzzing (not sure). I read about it, but I forgot how it works exactly. It's probably explained on lcamtuf's site or in the AFL docs. I'm assuming it uses a method similar to the one proposed by dflock above: repeatedly deleting random stuff for as long as the "interesting" behaviour remains.
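
To make the delete-and-recheck idea concrete, here is a minimal Python sketch. It is only my guess at the general approach, not AFL's actual algorithm. The names `minimize` and `is_interesting` are made up for illustration; `is_interesting` stands in for "run the program on this input and check whether the behaviour still shows up".

```python
import random

def minimize(data: bytes, is_interesting, rounds: int = 2000, seed: int = 0) -> bytes:
    """Shrink `data` by random deletions, keeping only deletions that
    preserve the behaviour reported by `is_interesting` (hypothetical callback)."""
    if not is_interesting(data):
        raise ValueError("starting input must trigger the behaviour")
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    for _ in range(rounds):
        if not data:
            break
        # Pick a random chunk, at most a quarter of the current input.
        start = rng.randrange(len(data))
        length = rng.randint(1, max(1, len(data) // 4))
        candidate = data[:start] + data[start + length:]
        # Keep the smaller input only if the behaviour is still there.
        if is_interesting(candidate):
            data = candidate
    return data

if __name__ == "__main__":
    # Toy stand-in for "the program crashes": the input contains b"BUG".
    noisy = b"lots of junk BUG and more junk after it"
    print(minimize(noisy, lambda d: b"BUG" in d))
```

The result never gets larger and always still triggers the behaviour, but random deletion alone gives no guarantee that it is the smallest possible input.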

I also wouldn't be surprised if non-AFL style fuzzer toolsets have similar test-input minimization tools. They usually generate random inputs too, just without AFL's clever technique of keeping track of previously-seen program states to guide the random search towards novel states. I believe that is the most revolutionary idea, and the one that makes AFL perform so uniquely well.
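
A rough Python sketch of that "prefer novel states" idea, as I understand it. Everything here is a toy assumption: `target` is a hypothetical program that reports which branches it reached as a set of labels, and `mutate` and `fuzz` are illustrative names. Real AFL gets its coverage information from instrumentation of the compiled program, not from anything like this.

```python
import random

def target(data: bytes) -> set:
    """Hypothetical program under test: returns labels of the branches reached."""
    reached = {"start"}
    if data[:1] == b"F":
        reached.add("F")
        if data[1:2] == b"U":
            reached.add("FU")
            if data[2:3] == b"Z":
                reached.add("FUZ")
    return reached

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte with a random value (input must be non-empty)."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(seed_input: bytes, iterations: int = 20000, seed: int = 0):
    rng = random.Random(seed)
    corpus = [seed_input]             # inputs worth mutating further
    covered = set(target(seed_input)) # every branch label seen so far
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        new = target(candidate) - covered
        if new:
            # The candidate reached something never seen before: keep it,
            # so later mutations can build on it.
            covered |= new
            corpus.append(candidate)
    return corpus, covered

if __name__ == "__main__":
    corpus, covered = fuzz(b"AAAA")
    print(len(corpus), sorted(covered))
```

The point of keeping `corpus` is that an input which got one step deeper becomes a starting point for the next step, instead of every nested check having to be guessed at once by blind random generation.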