Hacker News new | ask | show | jobs
by mrkeen 16 days ago
> To make things even worse, the community that has most thoroughly embraced them are compiler authors, who many programmers think of as being an impossibly skilled elite

The article's approach seems super ad-hoc, leaving you to have to think hard, do all the work, and make all the mistakes.

If you were to go down the other path, you might try dividing and conquering the problem. An arbitrary Pair<A,B> is trivially constructed from an arbitrary A and an arbitrary B. So if you can generate a string, and a number, you could generate a User full of number and string fields. If your generate function accepts a number describing how complex a string to make, then you can also choose how complicated to make your User. That's all shrinking needs to be. Repeatedly trying smaller Ns while the problem still happens (the problem being one of your unit tests - not an additional "interestingness" test you need to write.)

You'll probably way more likely to hit boundary cases by using the structure of the input and making interesting variations that way, rather than hoping you can permute the right bytes from the CLI.

1 comments

For more than 20 years I've been doing automatic test input reduction as part of testing Common Lisp compilers. The reduction is on randomly generated inputs, but they are structured in such a way that reduction always gives a valid program that should (in the absence of compiler errors) not signal an error.

It's a tremendously economical way to test compilers. For a modest and finite investment in testing infrastructure I get an unlimited number of tests. Over the years I've run many billions of test inputs on various Common Lisp implementations, although I'm mostly focusing on sbcl these days. When a bug is found the input quickly reduces to a something small that usually immediately tells the developers where the problem is (usually but not always something introduced recently.)

I also have a testing harness that cobbles together usually erroneous Lisp code and sees if the compiler blows up (the sbcl compiler as designed must never throw an error condition even on erroneous input.) This exploits a corpus of public Common Lisp code, combining and mutating the code in various ways.