Hacker News new | ask | show | jobs
by pag 3627 days ago
I worked on the cyber reasoning system (CRS) at Trail of Bits for our entry into the Cyber Grand Challenge [1]. Some slides describing the system are here [2].

Specifically, I implemented our fuzzer. I created a dynamic binary translator [3] that emulated the DECREE [4] operating system and x86 arhcitecture. It had the Radamsa [5] mutator built-in, along with a number of other simpler mutators.

I think our fuzzer out-performed our competitors, though I am biased ;-) The fuzzer was single-threaded, but could perform more than a million fuzz/mutate-execute (with coverage) iterations every two hours. Before I optimized it, it beat the pants off PIN [6]. We ran many such fuzzer processes concurrently. They would saturate the CPUs, and actually performed no I/O because I emulated all I/O in memory ;-) This was key to us achieving such high-throughput.

Our fuzzer wasn't super smart (though Radamsa is), but it benefited a lot from a feedback loop with our symbolic executors [7]. The symbolic executors would produce inputs that would then get fuzzed. These inputs could feed back into the symbolic executors, etc. That added more brains to our system.

All in all, we ran the CRS across something like 300 large EC2 nodes (across three availability zones). Per node, 8 or so fuzzers processes were running constantly for 24 hours. I'd ballpark that as 28.8 billion mutate+execute cycles.

In conclusion, the key for us was to make a production-quality, high-throughput fuzzer that did only one thing really well and really fast, then complement it with other more sophisticated tools like symbolic executors.

[1] https://blog.trailofbits.com/2015/07/15/how-we-fared-in-the-... [2] http://infiltratecon.com/archives/Slides_Artem_Dinaburg.pdf [3] https://en.wikipedia.org/wiki/Binary_translation [4] https://github.com/CyberGrandChallenge/libcgc [5] https://github.com/aoh/radamsa [6] https://software.intel.com/en-us/articles/pin-a-dynamic-bina... [7] https://en.wikipedia.org/wiki/Symbolic_execution

2 comments

It's impressive how many resources you threw at problem! We (Shellphish) had very similar results by using AFL [4] for fuzzing and angr [5] for symbolic execution (we published our approach at NDSS in February [1]) on around 300 cores. Of course, we procrastinated pretty heavily, ended up hacking our CRS together in three weeks, and it was absurdly rough around the edges and didn't get anywhere near your crash numbers during the qualifying event itself. As we discuss in the paper, our experiments with the impressive numbers were carried out afterwards, in less chaotic conditions.

It's interesting that the approaches taken by us [1], you [2], and ForAllSecure [3] for the CQE (at least on the exploitation side) were so similar. I've talked with two other teams that had an analogous setup (as well as two other teams, who did quite well, that took a very different route). I guess some great minds think alike!

As a side note, in the ToB blog post, you talk about wanting to join up with another team to be able to play in the final event. Did you guys end up finding a partner? It'd be interesting to face your CRS again next month :-)

[1] https://www.internetsociety.org/sites/default/files/blogs-me... [2] https://blog.trailofbits.com/2015/07/15/how-we-fared-in-the-... [3] https://blog.forallsecure.com/2016/02/09/unleashing-mayhem/ [4] http://lcamtuf.coredump.cx/afl/ [5] http://angr.io

We ran three versions of our system, because in the last few weeks/days before the quals we had a bunch of regressions. One of those regressions was caused by a single line change in my fuzzer :-p On the twitter account we used to track our progress, we used rapper names for each version. One version was two weeks old, one was up-to-date with the fixed fuzzer, and one had an experimental version of one of our symbolic executors that produced <var> tags. That didn't work.

We also under-utilized those nodes :-( Each node had at at least 4 idle cores wasting our money. Our resource allocation mechanism was naive.

I looked through some of the stuff released byt DARPA after the event and they released some of our PoVs as official PoVs. If you hex-decode them, you'll see something like "bad seed to Radamsa"!! That was a bug in how I would invoke Radamsa -- sometimes I'd pass it a seed that was way too big.

We tried to team up with every team but ForAllSecure. No one wanted to have our name on their ticket, or they were just fishing for details :-/ We've done a bit of work on the system since, getting it to work on Linux programs via a "port" of parts of libc to DECREE.

You didn't approach us, either; bummer! I doubt we'd have been down to team up outright, but maybe some of you guys could have done an internship in our lab if you really wanted to work on the CGC? I guess it's ancient history now, though.

Resource utilization is definitely tricky. But man, the amount of resources you had is just mind-boggling! I just realized it's even more cores than DARPA gave us for the final event! I'm impressed you guys managed to keep it all running smoothly (at least, it seemed that way from our lab, where there was complete chaos as our system fell over and crapped itself repeatedly for the first few hours).

Hey Zardus, I'm sure the info reached you. What team are you? We approached everyone either indirectly through broadcast emails or directly through me calling/emailing.

I'm not sure an internship would have been what we were looking for. If I had to guess, that's probably why we didn't pursue it further.

Interestingly, Radamsa is implemented in Scheme, then transpiled to C. Originally I played around with invoke Radamsa as a server (it's normal usage model), but this wasn't ideal because I wanted to use it at varying granularities, which would have meant multiple invocations per mutate/execute cycle. What I ended up doing was to take the compiled Scheme, get rid of all the syscalls, link it directly into the fuzzer program binary, and turn its `main` function into something I could call directly!
Badass tech and fuzzing results in your main comment. :) What Scheme or transpiler did you use, though?