Jazzer brings modern fuzz testing to the JVM

	Jazzer brings modern fuzz testing to the JVM (blog.code-intelligence.com)
	68 points by lrngjcb 1954 days ago

8 comments

fhenneke 1954 days ago

I'm one of the engineers behind Jazzer and happy to answer any questions about it.

We also have a blog post that covers the most interesting technical aspects of Jazzer: https://blog.code-intelligence.com/engineering-jazzer

layer8 1954 days ago

I couldn’t find any information on what specific kinds of errors are recognized (except JNI memory handling), or how (mechanism) one specifies to the tool what constitutes an error. Can you shed some light on that, or give a pointer to relevant documentation?

fhenneke 1953 days ago

By default, uncaught exceptions and memory issues in JNI libraries are reported as "crashes".

Additionally, Jazzer provides a hooking framework that can be used to implement domain-specific sanitizers for logic bugs. See https://blog.code-intelligence.com/engineering-jazzer#user-c... for an example. Part of the reason for open-sourcing Jazzer has been to get the discussion started on what kind of "sanitizers" are needed to unlock the full potential of Java fuzzing.
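
To make the default crash criterion concrete, here is a minimal sketch of a fuzz target. It assumes the `fuzzerTestOneInput(byte[])` entry point that Jazzer looks for; `ExampleFuzzTarget` and `parseRecord` are hypothetical names, and the bug is planted purely for illustration. Any exception that escapes the entry point would be reported as a finding.

```java
public class ExampleFuzzTarget {
    // Hypothetical code under test: one length byte followed by a payload.
    static byte[] parseRecord(byte[] data) {
        if (data.length == 0) {
            return new byte[0];
        }
        int length = data[0] & 0xFF;
        byte[] payload = new byte[length];
        // Planted bug: the declared length is never checked against the real
        // input size, so a short input makes arraycopy throw an
        // IndexOutOfBoundsException.
        System.arraycopy(data, 1, payload, 0, length);
        return payload;
    }

    // Entry point in the shape Jazzer expects; an uncaught exception thrown
    // from here is what gets reported as a "crash".
    public static void fuzzerTestOneInput(byte[] data) {
        parseRecord(data);
    }
}
```

An input such as `{2, 10, 20}` parses cleanly, while `{5, 1}` declares more payload than it carries and triggers the exception.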

layer8 1953 days ago

Thanks!

fhenneke 1953 days ago

If you want to fuzz a Java web app, our commercial platform CI Fuzz (of which Jazzer is one part) has built-in detectors for the typical vulnerabilities such as SQL injections: https://blog.code-intelligence.com/sql-fuzzing

saagarjha 1953 days ago

Psst…

> The trampoline first pushes an address pointing to the addr & 0xFFF-th entry in a "sled" of 0xFFF=4096 ASM ret instructions to the (native) stack and then performs a direct jump (also called a "tail call") to the sanitizer callback.

0xFFF=4095 ;)
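
The arithmetic behind this correction can be checked directly. The following is only a sketch of the mask arithmetic, not of Jazzer's trampoline; `SledIndex` and `sledIndex` are hypothetical names. It also shows that masking with 0xFFF yields indices 0 through 4095, which is 4096 distinct values.

```java
final class SledIndex {
    static final long MASK = 0xFFF; // 4095 in decimal: the low 12 bits set

    // Index selected by the low 12 bits of an address.
    static int sledIndex(long addr) {
        return (int) (addr & MASK);
    }
}
```

So the mask itself is 4095, while a table indexed by `addr & 0xFFF` would need 0x1000 = 4096 entries.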

fhenneke 1953 days ago

Good catch, thanks ;-) I will update the post.

kodablah 1954 days ago

A little while back I wrote something similar[0]. Basically I applied AFL principles to the JVM by similarly implementing bytecode instrumentation in the lightest way I could and having "passes" of sorts that manipulated inputs using stages like AFL does. The readme explains the implementation details (I don't really maintain it or use it anymore and I never even published it to Maven, so it has old invalid jitpack links, but the code is quite solid).

0 - https://github.com/cretz/javan-warty-pig

invokestatic 1954 days ago

Interesting. I had a project where I wanted to use libFuzzer with custom instruction instrumentation, but I never quite figured out how to pass the custom instrumentation data back to libFuzzer.

This project seems to do just that by calling __sanitizer_cov_trace_cmp4. In retrospect, this seems like the obvious solution, and it was quite brilliant of this project to do that!
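
As a rough illustration of that idea, the sketch below shows what reporting comparison operands could look like on the Java side. Everything here is hypothetical: `CompareTracing`, `traceCmpInt`, and `instrumentedEquals` are made-up names, and the list stands in for the native callback. A real integration would forward both operands to `__sanitizer_cov_trace_cmp4(uint32_t, uint32_t)` instead of storing them.

```java
import java.util.ArrayList;
import java.util.List;

final class CompareTracing {
    // Hypothetical stand-in for the native callback; a real integration would
    // pass both operands on to __sanitizer_cov_trace_cmp4 in libFuzzer.
    static final List<int[]> recordedCompares = new ArrayList<>();

    static void traceCmpInt(int arg1, int arg2) {
        recordedCompares.add(new int[] {arg1, arg2});
    }

    // What an instrumented `value == constant` could look like: report the
    // operands first, then perform the original comparison unchanged.
    static boolean instrumentedEquals(int value, int constant) {
        traceCmpInt(value, constant);
        return value == constant;
    }

    static boolean checkMagic(int value) {
        return instrumentedEquals(value, 0xCAFEBABE);
    }
}
```

Once the fuzzer sees both operands, it can try placing the expected constant into the input rather than guessing all 32 bits blindly.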

ekiwi 1954 days ago

If you are interested in fuzzing your Java code, you should also have a look at the JQF project, which integrates directly with JUnit tests: https://github.com/rohanpadhye/JQF

serjd 1953 days ago

We are aware of JQF, and the JUnit integration is the best part of it. We opted for the "Fuzzed Data Provider" approach to be more compatible with the approach used in C/C++, Go and Python...

ekiwi 1953 days ago

How do you deal with structured formats, like XML? In JQF you would just write an XML generator (see their examples). If you just use the "sequence of bytes" approach as AFL does, then a lot of your inputs might be immediately rejected by the parser.

fhenneke 1952 days ago

The FuzzedDataProvider (docs at https://codeintelligencetesting.github.io/jazzer-api/com/cod...) offers many of the functions you would need to write such a generator. If there is something missing that could be generally useful, we can always add it.
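
A minimal sketch of such a generator follows. To stay self-contained it does not use Jazzer's `FuzzedDataProvider`; `ByteCursor` is a hypothetical, simplified stand-in whose `consumeInt(min, max)` and `consumeBoolean()` only mirror the shape of the real methods. `XmlFromBytes`, the tag list, and the depth limit are likewise assumptions made for illustration.

```java
final class XmlFromBytes {
    // Hypothetical stand-in for a fuzzed data provider: it derives values from
    // the raw fuzzer bytes and falls back to defaults once they run out.
    static final class ByteCursor {
        private final byte[] data;
        private int pos = 0;

        ByteCursor(byte[] data) {
            this.data = data;
        }

        int consumeInt(int min, int max) {
            int raw = pos < data.length ? (data[pos++] & 0xFF) : 0;
            return min + raw % (max - min + 1);
        }

        boolean consumeBoolean() {
            return consumeInt(0, 1) == 1;
        }
    }

    private static final String[] TAGS = {"a", "b", "item"};

    // Builds a well-formed element; the depth bound guarantees termination.
    static String element(ByteCursor in, int depth) {
        String tag = TAGS[in.consumeInt(0, TAGS.length - 1)];
        StringBuilder sb = new StringBuilder("<").append(tag);
        if (in.consumeBoolean()) {
            sb.append(" id=\"").append(in.consumeInt(0, 99)).append('"');
        }
        sb.append('>');
        int children = depth > 0 ? in.consumeInt(0, 2) : 0;
        for (int i = 0; i < children; i++) {
            sb.append(element(in, depth - 1));
        }
        return sb.append("</").append(tag).append('>').toString();
    }
}
```

Because every byte sequence maps to a syntactically valid document, the parser's early rejection paths no longer absorb most of the inputs. The trade-off is that the generator will never produce malformed XML, so those paths need a separate byte-level target.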

serjd 1952 days ago

We use our internal grammar generator, which is similar to libprotobuf-mutator. For the OSS solution, we recommend using libprotobuf-mutator though. The reason to abstract this is that we don't want to write the grammars for a single programming language only.

asicsp 1954 days ago

I feel the current title "Jazzer brings modern fuzz testing to the JVM" should include "open source" as well, since the article title is "Fuzz Testing for JVM is now Open Source".

jgalt212 1954 days ago

Does anyone have any fun stories about fuzzers they ran that broke production systems that were inadvertently connected to the system under test?

khaledyakdan 1954 days ago

We've actually had a project where the customer had a testing environment for their web application. The fuzzer overwhelmed the system and we were asked to slow the fuzzer down so that the system could handle the load.

The_rationalist 1954 days ago

This talks about mutation testing; how does this compare to pitest? It would be nice to run Jazzer on core JVM projects such as GraalVM, Spring, Apache projects, etc.

serjd 1953 days ago

In pitest, mutations are seeded into your code and then your tests are run. The assumption is: if your unit tests don't fail after the code has been changed, that may indicate an issue with the test suite.

In fuzz testing, the mutations are seeded into the inputs. Depending on the fuzzing approach, those might be derived from random data, patterns, application behavior, etc. Jazzer is based on libFuzzer, meaning that its feedback loop is driven by the coverage metrics reported at runtime.

Integrating important JVM projects is work in progress ;-)
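
The input-mutation side of that distinction can be sketched as a toy coverage-guided loop. This is not libFuzzer's algorithm, only an illustration of the feedback principle; `ToyCoverageFuzzer`, the three-byte target, and the branch IDs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

final class ToyCoverageFuzzer {
    // Toy target: returns the branch IDs an input reaches, standing in for
    // the coverage that real instrumentation would report.
    static Set<Integer> run(byte[] in) {
        Set<Integer> covered = new HashSet<>();
        covered.add(0);
        if (in.length > 0 && in[0] == 'F') {
            covered.add(1);
            if (in.length > 1 && in[1] == 'U') {
                covered.add(2);
                if (in.length > 2 && in[2] == 'Z') {
                    covered.add(3);
                }
            }
        }
        return covered;
    }

    // The mutation is applied to the input, not to the code under test.
    static byte[] mutate(byte[] parent, Random rng) {
        byte[] child = parent.clone();
        child[rng.nextInt(child.length)] = (byte) ('A' + rng.nextInt(26));
        return child;
    }

    // Keeps an input only if it reaches a branch no earlier input reached.
    static Set<Integer> fuzz(int iterations, long seed) {
        Random rng = new Random(seed);
        List<byte[]> corpus = new ArrayList<>();
        corpus.add(new byte[] {'A', 'A', 'A'});
        Set<Integer> seen = new HashSet<>(run(corpus.get(0)));
        for (int i = 0; i < iterations; i++) {
            byte[] candidate = mutate(corpus.get(rng.nextInt(corpus.size())), rng);
            if (seen.addAll(run(candidate))) { // true only when coverage grew
                corpus.add(candidate);
            }
        }
        return seen;
    }
}
```

Without the feedback, hitting all three bytes at once would take on the order of 26^3 random guesses. With it, each newly covered branch is saved to the corpus, so the nested checks can be solved one byte at a time.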

The_rationalist 1953 days ago

Very interesting, thanks! It seems like a great idea to reuse libFuzzer; I just hope that the JNI overhead isn't too big.

OpenJDK 16 has https://openjdk.java.net/jeps/389, but it's not obvious whether it improves performance.

fhenneke 1952 days ago

Thanks for the link, I wasn't aware of this new feature!

Our coverage instrumentation does not rely on JNI calls, only the libFuzzer callbacks do, so the overhead shouldn't be too substantial. It's certainly not a proper benchmark, but one core on my laptop can fuzz the less trivial examples at around 10,000 exec/s. We are also working on some further performance improvements.
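
One way coverage tracking can avoid a JNI call per edge is to increment counters in memory that native code can read directly. The sketch below is an assumption about the general technique, not a description of Jazzer's actual implementation; `CoverageCounters`, `NUM_EDGES`, and `hit` are hypothetical names.

```java
import java.nio.ByteBuffer;

final class CoverageCounters {
    static final int NUM_EDGES = 1 << 12; // hypothetical size of the counter table

    // A direct buffer lives outside the Java heap, so native code could read
    // it through a pointer without a call crossing the JNI boundary per edge.
    static final ByteBuffer COUNTERS = ByteBuffer.allocateDirect(NUM_EDGES);

    // What instrumentation could insert at each edge: a plain memory update.
    static void hit(int edgeId) {
        byte old = COUNTERS.get(edgeId);
        if (old != -1) { // -1 is 0xFF; saturate instead of wrapping to zero
            COUNTERS.put(edgeId, (byte) (old + 1));
        }
    }

    // Reads a counter as an unsigned value in the range 0..255.
    static int count(int edgeId) {
        return COUNTERS.get(edgeId) & 0xFF;
    }
}
```

With this layout the hot path is a load, an add, and a store, and the native side only has to look at the buffer once per execution.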

bArray 1954 days ago

I've not personally ever tried fuzzing - is there some nice introduction to the concept?

serjd 1953 days ago

This may be a slightly biased opinion, but you could start with this blog post and see from there whether you want to go deeper into C/C++ fuzzing or web fuzzing:

https://blog.code-intelligence.com/the-magic-behind-feedback...

https://github.com/google/fuzzing
