by gorgonzolachz 1587 days ago
I'm cautiously excited/optimistic for this.

That said, I wonder if this feature will end up seeing all that much usage? Once your inputs are sanitized/bounded in a Golang application, it's pretty hard to get the language to do things it wasn't meant to do - and if the fuzzing system is built like the unit testing system, the number of useful fuzzing cases you can run won't be that large.

I've always thought that fuzzing primarily benefits E2E/integration testing, and that with modern languages' type systems and lack of pointer arithmetic, fuzz testing is useful mainly for niche cases (embedded programming and cryptography come to mind). The examples in the article (integer overflows, truncated input, invalid Unicode) may be issues, but thanks to Go's type system they won't break the integrity of the program or cause catastrophic failure (assuming there are no panic() calls in the code).


The type system can't catch nil dereferences or out-of-bounds accesses, which (in my experience) are the most common causes of runtime panics in Go. I assure you, there are many ways to make a Go program crash. :) If your program is handling untrusted input -- and almost every useful program does -- then you really should be fuzzing your input handlers.
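To make that concrete, here is a minimal sketch of fuzzing an input handler with the Go 1.18 fuzzer. The function parseField and its wire format (one length byte followed by the payload) are made up for illustration; they are not from the article or any real library.

```go
package main

import (
	"errors"
	"testing"
)

// parseField is a hypothetical input handler: the first byte is a length n,
// followed by n bytes of payload. Without the two checks below, an empty or
// truncated input would index out of range and panic at run time, which the
// type system cannot rule out.
func parseField(data []byte) ([]byte, error) {
	if len(data) == 0 {
		return nil, errors.New("empty input")
	}
	n := int(data[0])
	if n > len(data)-1 {
		return nil, errors.New("truncated input")
	}
	return data[1 : 1+n], nil
}

// FuzzParseField lets the fuzzer search for inputs that panic or that break
// the length invariant.
func FuzzParseField(f *testing.F) {
	f.Add([]byte{2, 'h', 'i'}) // seed corpus entry
	f.Fuzz(func(t *testing.T, data []byte) {
		payload, err := parseField(data)
		if err != nil {
			return // rejecting bad input with an error is fine
		}
		if len(payload) != int(data[0]) {
			t.Errorf("payload length %d, want %d", len(payload), data[0])
		}
	})
}
```

Placed in a file ending in _test.go, this should run with `go test -fuzz=FuzzParseField`. If you delete either bounds check, I'd expect the fuzzer to report the panic quickly.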

I have been a big proponent of generative testing / fuzzing, where rather than hard-coding inputs you simply always generate them. You can see it in Haskell’s QuickCheck or Clojure’s test.check, but there are probably many others.

It is absolutely useful outside of just a few parsers; a type system is not able to catch all bugs. Just a recent example was a scheduling system I was working on, where the input fuzzing was able to put the system in states that I did not anticipate.

Practically speaking, with fuzzing, you’re changing tests from manually crafting certain inputs and validating the results, to defining “laws” about how your system should behave.

Here’s a good resource on the types of tests you can develop using this technique: https://fsharpforfunandprofit.com/posts/property-based-testi...
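As a rough sketch of what a "law" looks like in Go's own fuzzer (my example, not taken from the linked post): instead of listing inputs and expected outputs, you state that decoding an encoded value gives back the original. The helper name roundTrip is made up; encoding/hex is just a convenient stdlib codec to demonstrate with.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"testing"
)

// roundTrip encodes data as hex and decodes it again.
func roundTrip(data []byte) ([]byte, error) {
	return hex.DecodeString(hex.EncodeToString(data))
}

// FuzzHexRoundTrip states the law "decode(encode(x)) == x" and lets the
// fuzzer generate the values of x.
func FuzzHexRoundTrip(f *testing.F) {
	f.Add([]byte("hello"))
	f.Fuzz(func(t *testing.T, data []byte) {
		got, err := roundTrip(data)
		if err != nil {
			t.Fatalf("decode failed: %v", err)
		}
		if !bytes.Equal(got, data) {
			t.Errorf("round trip changed the data: got %x, want %x", got, data)
		}
	})
}
```

The same shape works for any encoder/decoder or serializer/parser pair you own.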

> rather than hard-coding inputs you simply always generate them

Why not both?

In any case, a rudimentary implementation has actually been in Go's standard library for a long time: https://pkg.go.dev/testing/quick@master

Though, note:

> The testing/quick package is frozen and is not accepting new features.
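For reference, a minimal sketch of what using testing/quick looks like. The function addThenSubtract is a made-up example; quick.Check generates random arguments for the law and reports the first set that makes it return false.

```go
package main

import (
	"testing"
	"testing/quick"
)

// addThenSubtract is a hypothetical function with an obvious inverse.
// Go's signed integer arithmetic wraps on overflow, so the law below
// holds for every pair of int64 values.
func addThenSubtract(x, y int64) int64 {
	return x + y - y
}

// TestAddInverse checks the law with randomly generated x and y.
func TestAddInverse(t *testing.T) {
	law := func(x, y int64) bool {
		return addThenSubtract(x, y) == x
	}
	if err := quick.Check(law, nil); err != nil {
		t.Error(err)
	}
}
```

Passing nil as the second argument uses the package's default configuration.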

> Why not both?

That's absolutely fine and often good enough. Hard-coded tests are typically very simple and easier to reason about, and they make more sense in a whole range of situations (e.g. regression tests).

However, I would take a single input-fuzzing test over one that uses hard-coded inputs.

> Why not both?

Go's fuzz tester takes this approach. When a failing input is found, it's added to the source code directory with the intent of you checking it in.

testing/quick is not coverage-driven and so is not good at finding truly "interesting" input cases. It's OK for simple invariants (e.g. anything arithmetic with an inverse) but I would not trust it to tell me anything interesting about a parser. At this stage I might even rely on the fuzzer to test simple invariants because the tooling is nicer.
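A small sketch of the "both" approach with the built-in fuzzer: hand-picked inputs go in as seeds, and generated inputs are checked against the same invariant. The function collapseSpaces is a made-up example, not something from the article.

```go
package main

import (
	"strings"
	"testing"
)

// collapseSpaces is a hypothetical function under test: it trims the input
// and replaces every run of whitespace with a single space.
func collapseSpaces(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

// FuzzCollapseSpaces checks that applying the function twice gives the same
// result as applying it once.
func FuzzCollapseSpaces(f *testing.F) {
	// Hand-picked inputs. These run on every plain `go test`, even when
	// fuzzing is not enabled.
	for _, seed := range []string{"", "a  b", "  leading", "tab\there"} {
		f.Add(seed)
	}
	f.Fuzz(func(t *testing.T, s string) {
		once := collapseSpaces(s)
		if twice := collapseSpaces(once); twice != once {
			t.Errorf("not idempotent: %q -> %q -> %q", s, once, twice)
		}
	})
}
```

Any failing input the fuzzer finds is written under testdata/fuzz/FuzzCollapseSpaces/ and is replayed by plain `go test` from then on, so it acts as a regression test once checked in.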

There is a limit to the errors a compiler can detect. Here is an example where fuzzing is required.

I'm the author of [qjson](https://github.com/qjson). It converts a human-readable JSON text (qjson) into JSON. The intended use is for configuration files or data input.

For instance, numeric values may be simple math expressions. I used go-fuzz and it detected right away that I forgot to deal with division by zero in these math expressions.

It is impossible for the compiler to detect such errors in the program. They depend entirely on the data, and when the data is complex, the risk of errors is high.
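To illustrate the kind of bug (this is a simplified stand-in, not qjson's actual code): integer division by zero compiles fine in Go and panics at run time, so the evaluator has to check the divisor itself. The names divide and errDivideByZero are made up, and the fuzz target below uses the Go 1.18 built-in fuzzer rather than go-fuzz.

```go
package main

import (
	"errors"
	"testing"
)

var errDivideByZero = errors.New("division by zero")

// divide is a hypothetical stand-in for the division step of an expression
// evaluator. Without the b == 0 check, a/b would panic at run time.
func divide(a, b int64) (int64, error) {
	if b == 0 {
		return 0, errDivideByZero
	}
	return a / b, nil
}

// FuzzDivide checks that divide never panics and that an error is returned
// only for a zero divisor.
func FuzzDivide(f *testing.F) {
	f.Add(int64(6), int64(3))
	f.Fuzz(func(t *testing.T, a, b int64) {
		q, err := divide(a, b)
		if err != nil {
			if b != 0 {
				t.Errorf("unexpected error for %d / %d: %v", a, b, err)
			}
			return
		}
		// Law: quotient and remainder reconstruct the dividend.
		if q*b+a%b != a {
			t.Errorf("%d / %d = %d does not reconstruct the dividend", a, b, q)
		}
	})
}
```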

> I forgot to deal with division by zero in these math expressions. It is impossible for the compiler to detect such errors in the program.

I'm sure there's a general class of problem where it really is impossible, but you haven't found it.

All you needed to prevent this is dependent types, which Go doesn't have. With dependent types the compiler sees a = b / c and it immediately can conclude that c's type is non-zero, since if it was zero that's a divide-by-zero error. Having refined the type to non-zero, an attempt earlier to load it with a value despite not being sure if that value is zero will fail. The buggy qjson won't compile.

I have the same question as well. I dare say it seems to be anti-Go to package such niche capabilities in the stdlib, but maybe their direction is changing?

> seems to be anti-go to package such niche capabilities in the stdlib

Go seems to lean heavily toward packaging things into the stdlib.

It has stuff like "net/rpc/jsonrpc" (a json-rpc 1.0 impl, even though you should use grpc or json-rpc 2.0, both of which are not in the stdlib). It has crap like net/smtp and archive/tar. It has image/jpeg for some unknown reason.

If you look at rust, basically half of go's stdlib exists as external crates (even if they're officially maintained, akin to the golang.org/x/ packages in go). Everything from json, to http, to image support in rust is outside the stdlib, and honestly that has ultimately let rust evolve better support for each of those things.

C, C++, (to a lesser extent) Java, D, etc.: basically every language in a similar space to Go has a much smaller and more compact stdlib.

The go stdlib clearly doesn't err on the side of being small.

Deno is another project that is trying to add more to a stdlib. In some ways this is really fantastic because in the JavaScript ecosystem there is so much fatigue in deciding what libraries to use. On the flip side, if you couple too much into the stdlib then you’re limiting the huge benefit of an open market.

> It has image/jpeg for some unknown reason.

I don't know why they added this and the other image packages specifically, but I used this the other day and it was nice to not have to bring in a separate package.

I mean, they realized their mistake for other image formats, hence: golang.org/x/image: https://pkg.go.dev/golang.org/x/image

Now it's really confusing to have to figure out that you import the stdlib image package for jpeg, but x/image for bmp.

For bonus points, check out "image/draw" vs "golang.org/x/image/draw".

Fuzz testing is probably the #1 software security innovation of the last 10 years. It's "niche" only because it's currently hard to do. For people that have set it up, it's essential. With Go 1.18, it's easy for anyone to set it up, and a lot of people are going to find a lot of dumb bugs. Pretty much everyone that has written fuzz tests for their software has found at least one crash. I found one in a program with 100% test coverage within minutes of writing the first fuzz test. Sometimes you miss dumb things, and the human-written "I'm going to trick my program into malfunctioning" tests can simply forget things. (In my case, I had a few branches with similar logic. The unit tests did run all the code in the branches, but only tested the boundary case for two of the three. The fuzz testing found the third case immediately.)

Crashes in Go vary in severity (memory isn't usually corrupted, code isn't usually modified, the other goroutines don't stop serving requests because you probably have "recover" somewhere up the chain), but at the very least, by identifying the input that can cause crashes, you gain the ability to turn invalid input into an easy-to-understand error message, saving users and operators time and frustration. Users can retry the request with valid input. Operators don't have to freak out about log spam. And, of course, sometimes the crashes are a big deal; maybe you missed the "recover", or maybe you're calling out to unsafe code that DOES corrupt memory.
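As a minimal sketch of the "recover somewhere up the chain" pattern, assuming a made-up wrapper called safely: it runs a handler and turns any panic into an ordinary error that can be reported to the caller. Note that recover only catches panics raised in the same goroutine as the deferred function.

```go
package main

import (
	"fmt"
)

// safely runs handle on input and converts any panic into an error, so one
// bad request produces an error message instead of taking the process down.
func safely(handle func(input []byte) error, input []byte) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("invalid input: %v", r)
		}
	}()
	return handle(input)
}
```

A crashing input found by the fuzzer is still better fixed with an explicit check in the handler, since that lets you return a specific, easy-to-understand message rather than a wrapped runtime error.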

To me, fuzz testing is the headline feature in the 1.18 release. And it's the release that introduces generics.

Seems completely in line to me:

- go is a broad stdlib language

- go has a bespoke toolchain with little to no hooks, meaning anything which needs to integrate directly into the build process or to instrument the runtime benefits extensively from being brought in-tree e.g. profilers, sanitisers, and (guided) fuzzers

- finally go’s “niche” is mainly network daemons and CLI utilities, so lots of interacting with network streams and file processing, which are the main use cases for fuzzing