|
|
|
|
|
by moyix
3593 days ago
|
|
The question of whether randomized testing or program analysis gives you more coverage of a program is a really interesting one. Böhme actually has an earlier paper that addresses this question: https://www.comp.nus.edu.sg/~mboehme/paper/FSE14.pdf > We study the relative efficiencies of the random and systematic approaches to automated software testing. Using a simple but realistic set of assumptions, we propose a general model for software testing and define sampling strategies for random (R) and systematic (S₀) testing, where each sampling is associated with a sampling cost: 1 and c units of time, respectively. The two most important goals of software testing are: (i) achieving in minimal time a given degree of confidence x in a program's correctness and (ii) discovering a maximal number of errors within a given time bound n̂. For both (i) and (ii), we show that there exists a bound on c beyond which R performs better than S₀ on the average. Moreover for (i), this bound depends asymptotically only on x. We also show that the efficiency of R can be fitted to the exponential curve. Using these results we design a hybrid strategy H that starts with R and switches to S₀ when S₀ is expected to discover more errors per unit time. In our experiments we find that H performs similarly or better than the most efficient of both and that S₀ may need to be significantly faster than our bounds suggest to retain efficiency over R. Also, for (say) C programs of moderate size, I believe current program analysis techniques do not achieve anything near 100% branch coverage. In the KLEE paper they manage to get ~90% on average for the GNU coreutils using concolic execution – which is really impressive! But the coreutils are tiny programs. For larger programs the cost of symbolic execution is high and random testing can often get you more bugs faster. |
|
KLEE result was interesting. I'm wondering about combining program analysis with semi-random testing that generates its values within the ranges, etc that come from program analysis. Might also spread the effort along the control flow graphs and such. Alternatively, as in dynamic analysis, do the kinds of instrumentation in software, annotations or compiler-level, that catch situations that are probably risky combined with input from random testing likely to set it off. Might speed up process but I imagine someone is already doing one or both of these.