Hacker News new | ask | show | jobs
by moyix 3593 days ago
The paper [1] on AFLFast is, IMO, a great example of where academia shines: carefully looking at how and why something works, developing some theory and a working model, and then using that to get a substantial improvement on the state of the art (and doing a nice evaluation to show that it really works).

[1] https://www.comp.nus.edu.sg/~mboehme/paper/CCS16.pdf

1 comments

The abstract sounds like it. They said with no program analysis, though. I thought program analysis was good enough that it could probably auto-generate tests for every branch in a program, possibly in less time or with more assurance. W as I wrong or is this a parallel subfield?
The question of whether randomized testing or program analysis gives you more coverage of a program is a really interesting one. Böhme actually has an earlier paper that addresses this question: https://www.comp.nus.edu.sg/~mboehme/paper/FSE14.pdf

> We study the relative efficiencies of the random and systematic approaches to automated software testing. Using a simple but realistic set of assumptions, we propose a general model for software testing and define sampling strategies for random (R) and systematic (S₀) testing, where each sampling is associated with a sampling cost: 1 and c units of time, respectively. The two most important goals of software testing are: (i) achieving in minimal time a given degree of confidence x in a program's correctness and (ii) discovering a maximal number of errors within a given time bound n̂. For both (i) and (ii), we show that there exists a bound on c beyond which R performs better than S₀ on the average. Moreover for (i), this bound depends asymptotically only on x. We also show that the efficiency of R can be fitted to the exponential curve. Using these results we design a hybrid strategy H that starts with R and switches to S₀ when S₀ is expected to discover more errors per unit time. In our experiments we find that H performs similarly or better than the most efficient of both and that S₀ may need to be significantly faster than our bounds suggest to retain efficiency over R.

Also, for (say) C programs of moderate size, I believe current program analysis techniques do not achieve anything near 100% branch coverage. In the KLEE paper they manage to get ~90% on average for the GNU coreutils using concolic execution – which is really impressive! But the coreutils are tiny programs. For larger programs the cost of symbolic execution is high and random testing can often get you more bugs faster.

Forgot to say earlier, very insightful comment. I'm not sure if I trust the 2014 paper's claim at first glance. I'm going to have to think on it. Interestingly, their hybrid strategy in the abstract is one of the two approaches I was going to ask you about after your comment. I always think about combining very different methods whenever they're in competition with different strengths. Including program analysis and probabilistic testing.

KLEE result was interesting. I'm wondering about combining program analysis with semi-random testing that generates its values within the ranges, etc that come from program analysis. Might also spread the effort along the control flow graphs and such. Alternatively, as in dynamic analysis, do the kinds of instrumentation in software, annotations or compiler-level, that catch situations that are probably risky combined with input from random testing likely to set it off. Might speed up process but I imagine someone is already doing one or both of these.