| HN Mirror



	by simonhughes22 1966 days ago
	It would be great if (in the repo) you could briefly explain what fuzzing is and why you'd need it. I assume it's some sort of obfuscation tool?

3 comments

Twirrim 1965 days ago

There's a good intro here: https://www.microsoft.com/en-us/research/blog/a-brief-introd... and afl++'s main documentation is here https://aflplus.plus/ which talks a bit about it.

The goal is to find bugs in code by throwing random data at it, in as intelligent a fashion as possible. You can do that a few ways:

* Give structured data to mutate a bit.

* Just throw random data at it. You could do this with any binary that accepts data either via stdin or from a file.

* Instrument the code, throw random data at it and see what paths of code get triggered and feed that back into the data generator. Drawback is you need to be able to compile all the code involved, so it gets fully instrumented.
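A minimal sketch of the first approach in the list above (mutating structured data a bit), assuming a toy length-prefixed parser with a planted bug. `parse_record`, `flip_random_bit`, and `fuzz` are hypothetical names made up for illustration; none of this is AFL or AFL++ code:

```python
import random


def parse_record(data: bytes) -> bytes:
    """Toy parser: the first byte is the payload length, the rest is the payload."""
    if not data:
        raise ValueError("empty input")  # expected, clean rejection
    length = data[0]
    # Planted bug: nothing checks that `length` payload bytes are really there.
    return bytes(data[1 + i] for i in range(length))


def flip_random_bit(data: bytes, rng: random.Random) -> bytes:
    """Return a copy of `data` with one randomly chosen bit flipped."""
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


def fuzz(seed_input: bytes, rounds: int = 1000, rng_seed: int = 0) -> list:
    """Mutate a valid seed input and collect mutants that crash the parser."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(rounds):
        candidate = flip_random_bit(seed_input, rng)
        try:
            parse_record(candidate)
        except ValueError:
            pass  # documented rejection, not a bug
        except Exception:
            crashes.append(candidate)  # anything else counts as a finding
    return crashes


if __name__ == "__main__":
    found = fuzz(b"\x03abc")
    print(len(found), "crashing inputs, e.g.", found[:1])
```

Starting from a valid input keeps most mutants close to the expected format, so they get past the early checks and into the code that actually has the bug.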

AFL/AFL++ sits in the third camp. You compile your code using it, and it then uses information it gets back to figure out ways to trigger code paths, by applying intelligent mutations. It's possible to, e.g. have code that parses a PNG image file, start AFL++ off with no initial data, and it will fairly quickly start producing valid PNG images.

It's a very effective approach for finding bugs. On the AFL++ site there's a small trophy cabinet, and AFL has a larger one (older project) https://lcamtuf.coredump.cx/afl/.


dathinab 1966 days ago

> it's some sort of obfuscation tool?

I didn't expect someone on HN not to know this but then there are not only programmers here I guess ;=)

It's a tool to find bugs. To strongly oversimplify: It throws random inputs at a program until it crashes.

So you could say it's a tool to complement a test suite.


wcarss 1965 days ago

https://xkcd.com/1053/

(edit: not to imply you were making fun, your answer was great! But in general: everyone learns things every day. Even every programmer has a day where they learn what fuzzing is)


dathinab 1965 days ago

You're right, my answer was kinda impolite. I apologize.


nano-erud 1965 days ago

This term is not used much. Most people know more about "random testing" or "monkey testing", which are much more common terms. I think fuzzing is used a lot to find security holes, and although it is fairly old it is still widely used, but programmers outside of systems programming do not see it everywhere. Not all programmers work in the same field, so it is not uncommon for someone not to know about this. In my case, for example, I associate the term fuzzing with matching.


kbenson 1965 days ago

Somewhat. I think it might mostly be that it provides a much greater return for those using languages where incorrectly handled values have a higher chance of causing much worse problems, like C and C++. I think if you write in those languages, or like me you haven't for almost 20 years but you're just still very interested in developments about them because they often seem to illuminate the weird quirks of computing and CPUs, then fuzzing is a much more common thing to have heard about.

Not that fuzzing isn't useful for higher level or managed languages, just that it's extra useful when you throw likely segfaults into the mix.


smt1 1965 days ago

Fuzzing is ROI efficient (especially for time invested) even if you don't intend to find a segfault, but just want to see how a program works or performs across different input states, either in or out of its usual domain (and you can direct the fuzzing in many ways: derandomizing it, constraining the search space, or using a virtualizer like QEMU). I like to think of it as "semantics engineering" with spare CPU cycles.

I use fuzzers with a Redex driver usually, which is unusually great at intelligently driving fuzzers: https://docs.racket-lang.org/redex/index.html


Thaxll 1966 days ago

Fuzzing is a technique where you send a lot of random or not-so-random data to the input of a program to see how it reacts: does it crash, does it handle the input properly, etc.

For example, say you want to test your JSON parser: what happens if you send "{", ""\\{", etc.?
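A rough sketch of that JSON example, assuming Python's standard `json` module stands in for "your JSON parser". `ALPHABET`, `random_document`, and `fuzz_json` are hypothetical names for illustration. The harness treats a clean `json.JSONDecodeError` as correct behaviour and records anything else as a finding:

```python
import json
import random

# Characters that tend to matter to a JSON parser: brackets, quotes,
# escapes, digits, and the letters of true/false/null.
ALPHABET = '{}[]",:\\ 0123456789truefalsn-.eE'


def random_document(rng: random.Random, max_len: int = 12) -> str:
    """Build a short random string out of JSON-ish characters."""
    return "".join(rng.choice(ALPHABET) for _ in range(rng.randint(0, max_len)))


def fuzz_json(rounds: int = 5000, seed: int = 0):
    """Feed random strings to json.loads; return (accepted count, findings)."""
    rng = random.Random(seed)
    accepted = 0
    unexpected = []
    for _ in range(rounds):
        text = random_document(rng)
        try:
            json.loads(text)
            accepted += 1
        except json.JSONDecodeError:
            pass  # rejecting malformed input cleanly is the desired outcome
        except Exception as exc:
            unexpected.append((text, exc))  # any other exception is suspicious
    return accepted, unexpected


if __name__ == "__main__":
    ok, findings = fuzz_json()
    print("accepted:", ok, "unexpected failures:", len(findings))
```

Swapping `json.loads` for your own parser is the interesting part; a mature parser is expected to come out of this with no findings.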


dwheeler 1965 days ago

Fuzzers can find defects, including vulnerabilities, that might be missed by other tools. AFL used a newer technique, called being "coverage guided", that turned out to be a remarkable improvement. As a coverage guided tool it monitors how many times various code branches are taken, and if the count is different than what it has seen before, the input is considered "more interesting". AFL++ inherits this capability.
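To make the "coverage guided" idea concrete, here is a toy sketch under loud assumptions: the target is a made-up function with hand-placed coverage probes (real AFL/AFL++ inserts instrumentation at compile time), and the sketch only tracks which probes were hit, not how many times each branch was taken. `MAGIC`, `target`, `mutate`, and `fuzz` are hypothetical names, not AFL APIs:

```python
import random

MAGIC = b"FUZZ"


def target(data: bytes, coverage: set) -> None:
    """Toy program under test: 'crashes' only when the input starts with MAGIC."""
    for depth, expected in enumerate(MAGIC):
        if len(data) <= depth or data[depth] != expected:
            return
        coverage.add(depth)  # probe: this nesting level was reached
    raise RuntimeError("planted bug reached")


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random insert, replace, or delete to the input."""
    buf = bytearray(data)
    op = rng.choice(("insert", "replace", "delete"))
    if op == "insert" or not buf:
        buf.insert(rng.randint(0, len(buf)), rng.randrange(256))
    elif op == "replace":
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(max_iters: int = 200_000, seed: int = 0):
    """Keep any mutant that reaches a new probe and mutate from those."""
    rng = random.Random(seed)
    corpus = [b""]  # start with no meaningful input at all
    seen = set()
    for _ in range(max_iters):
        child = mutate(rng.choice(corpus), rng)
        coverage = set()
        try:
            target(child, coverage)
        except RuntimeError:
            return child, corpus  # crashing input found
        if not coverage <= seen:  # new probe hit: input is "more interesting"
            seen |= coverage
            corpus.append(child)
    return None, corpus


if __name__ == "__main__":
    crash, corpus = fuzz()
    print("crashing input:", crash, "corpus:", corpus)
```

The feedback loop is what matters: each kept input locks in one more correct byte, so the search proceeds one byte at a time instead of having to guess all four bytes at once (about 256^4 blind attempts on average).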

An impressive demo (from AFL) is that it was able to figure out the required format for a JPEG file given only one text file (which is not a JPEG file): https://web.archive.org/web/20201210022938/https://lcamtuf.b...

If you're fuzzing open source software, you might consider applying to OSS-Fuzz https://github.com/google/oss-fuzz which provides a lot of free compute power to run fuzzers (so that vulnerabilities can be found & fixed).


not2b 1965 days ago

The technique has been used for at least two decades in hardware verification, though the terminology is different. If you search the literature, you'll find terms like "constrained functional verification", "coverage directed test generation", "functional coverage directed test generation", and the like. The technique is the same, random testing, with mutation to try to hit more and more coverage points.


pfdietz 1965 days ago

It goes back at least that far in software, with the original fuzzing work from U. Wisc and McKeeman's "Differential Testing for Software". Those are blackbox techniques; AFL's advance was using a general grey box approach.
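For anyone unfamiliar with differential testing, the general idea is to feed the same random inputs to two implementations and flag any disagreement. A small sketch, assuming sorting as a stand-in problem (this is an illustration of the general idea, not an example taken from McKeeman's paper; `insertion_sort` and `differential_test` are hypothetical names):

```python
import random


def insertion_sort(values):
    """Independent implementation to compare against the built-in sorted()."""
    out = []
    for v in values:
        i = len(out)
        while i > 0 and out[i - 1] > v:
            i -= 1
        out.insert(i, v)
    return out


def differential_test(impl_a, impl_b, rounds: int = 1000, seed: int = 0):
    """Return the first random input on which the two implementations disagree."""
    rng = random.Random(seed)
    for _ in range(rounds):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        if impl_a(list(data)) != impl_b(list(data)):
            return data
    return None  # no disagreement found


if __name__ == "__main__":
    print(differential_test(insertion_sort, sorted))
    # A deliberately wrong implementation that drops duplicates:
    print(differential_test(lambda xs: sorted(set(xs)), sorted))
```

This is blackbox in the sense used above: nothing about the internals of either implementation is observed, only their outputs.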


not2b 1965 days ago

The hardware approach isn't blackbox: it explicitly uses the reachable state space and constraint solving to reach more coverage points, and to do this the exact circuit representation is needed.
