| HN Mirror



	by mqus 1035 days ago
	In the regular test suite (think CI) you want predictable results. Running it again and again on the same code should give the same results, so you can see exactly which code change broke things. Maybe it's simpler to explain the other way around: for every new path your fuzzer (or other randomized test) covers, it also skips a path it covered in a previous run, and you probably want to add the failing paths it found to your regular test suite. Don't get me wrong, we should have more randomization, but it's not good everywhere, which might explain why we don't have as much of it.

4 comments

evil-olive 1035 days ago

it's rather easy to have both randomness and reproducibility, though:

generate a random seed, log it, then create an RNG using that "random, but recorded" seed. make sure all randomness used in the test flows from that explicitly-seeded RNG.

then, have an escape hatch where if a seed is provided as an environment variable, it will use that instead of generating one.

if you have a failure occur, you can always re-run with the same seed as a way to reproduce the failure (assuming it was indeed caused by that random seed and not some other factor)

depending on how fast the tests are, it may also be possible to run them multiple times with different seeds. for example, your on-every-commit CI run might run once with a hardcoded seed of 42. or it might run once with a hardcoded seed and once with a random seed.

and meanwhile, you might have a nightly test run that runs that same test suite 100 or 1000 times, with a different random seed each time.

jacquesm 1035 days ago

Any half decent fuzzing setup will log what it did prior so you can replay it to the point of failure. This gets a lot harder when you do multiple such runs in parallel.

rwmj 1035 days ago

AFL++ logs the specific input that causes the crash. In theory at least, replaying that input ought to trigger the crash reproducibly. (This is sometimes not the case if the program has lots of threads, is event-driven, or is otherwise stochastic.)

olluk 1035 days ago

That's all true, but at some point the combinations of paths explode. It is not possible to write tests for all the combinations, but it is possible to cover them eventually with some probability. Fuzzing covers more execution-path combinations over time.

kragen 1035 days ago

hypothesis kind of solves this problem by adding each (minimized) failing input to a file and always running it thereafter

this is a little tricky to integrate into ci
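
a rough stdlib-only sketch of the general idea (save each failing input to a file, replay the saved inputs before trying new random ones). this is not hypothesis's actual api or storage format, and it does no minimization; `load_corpus`, `record_failure`, `run_property` and the JSON-lines file layout are all hypothetical names chosen for illustration.

```python
import json
import random
from pathlib import Path


def load_corpus(path):
    """Read previously saved failing inputs, one JSON value per line."""
    path = Path(path)
    if not path.exists():
        return []
    return [json.loads(line) for line in path.read_text().splitlines() if line.strip()]


def record_failure(path, case):
    """Append a failing input to the corpus file unless it is already there."""
    if case in load_corpus(path):
        return
    with Path(path).open("a") as f:
        f.write(json.dumps(case) + "\n")


def run_property(check, generate, corpus_path, rng, tries=100):
    """Replay saved failures first, then try fresh random cases.
    Returns the first failing case, or None if everything passed."""
    for case in load_corpus(corpus_path):
        if not check(case):
            return case
    for _ in range(tries):
        case = generate(rng)
        if not check(case):
            record_failure(corpus_path, case)
            return case
    return None
```

the ci wrinkle shows up here too: the corpus file only helps if it survives between runs, so it either has to be committed to the repo or kept in some cache that the ci job restores.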
