Hacker News new | ask | show | jobs
by tialaramex 45 days ago
They've only linked a few tickets, so of course maybe when we see all 271 actual distinct things the insight won't apply but all those I examined ended up as some C++ code with a nasty bug in it.

Firefox is written in several languages, only about 25% of it is in C++ but every single one of these issues seems to touch the C++.

3 comments

A general limitation of this approach is that it is only as good as your validator, and there's nothing easier to validate than a test case that creates, say, an AddressSanitizer use-after-free. For subtler issues will we have to more specific validators or will the LLM become better at coming up with other dangerous conditions it will verify? We'll see.
> A general limitation of this approach is that it is only as good as your validator, and there's nothing easier to validate than a test case that creates, say, an AddressSanitizer use-after-free

Sure, but, surely AddressSanitizer would also detect the same problem in the C or Rust which together also make up about 25% of Firefox so... ?

It's possible Mythos is a lot better at finding vulnerabilities in C++ code than it is for other languages. After all, these models are also based on pre-existing security analysis.

From what I can tell, a lot of these bugs were hardly C++-specific, they just happened in C++ code. Even the most secure Rust can't magically catch things like TOCTOU issues.

> Even the most secure Rust can't magically catch things like TOCTOU issues

I suppose it depends what the word "magically" means. A ToCToU race is because you imagined things wouldn't change but they did and in Rust you actually do write fewer patterns with this mistake because of the Mutable xor Aliased rule. If we have at least one immutable reference to a Goose then Rust isn't OK with anybody mutating the Goose, your safe Rust can't do that and unsafe Rust mustn't do that. So the ToCToU race caused by "Oops I forgot somebody else might change the Goose" is less likely because you were made to wrestle with this problem during design - the safe Rust where you just forgot about this doesn't compile.

It's because they verified the bugs using AddressSanitizer so by construction it was only ever going to find C++ bugs.
But there is AddressSanitizer for Rust and for C too right? As I understand it AddressSanitizer consumes LLVM IR, so from its point of view some C, C++ or Rust is all the same, and presumably also if you are a famous Russian streamer and you hand wrote LLVM IR instead of using a real programming language that too?
Yes I was including C in "C++". I dunno how much C Firefox uses.

And I presume you can run AddressSanitizer with Rust but given Rust is memory safe by default, it's only going to find issues in `unsafe` code which is a tiny tiny fraction of most code. Google had a blog post a few months ago where they managed to put some actual numbers on this, because they almost shipped one Rust memory safety bug.

The lesson for other projects is very different if the reason these are all C++ bugs is just "We didn't ask Mythos for the bugs in Rust" versus if the difference is that asking Mythos for similar bugs in the Rust is futile because it won't find any.

Some of this is tempered if the pattern is that Mythos finds bugs mostly in dusty old C++ but the rates are much, much lower in newer C++, the reverse of Google's earlier finding for human researchers.

> The lesson for other projects is very different if the reason these are all C++ bugs is just "We didn't ask Mythos for the bugs in Rust" versus if the difference is that asking Mythos for similar bugs in the Rust is futile because it won't find any.

The answer is both of those. They didn't ask for bugs in the Rust code because it wouldn't have found any. They've explicitly set it up to only look for memory safety bugs. It's not going to find any in a memory safe language.

As long as the memory-safe subset of Rust is used exclusively.
> It's not going to find any in a memory safe language.

I mean, it's not supposed to find any in the unsafe language either, but that's why it was used.

Firefox not only uses unstable Rust features (via the exemption mechanism the same way Linux does it, trained professionals, closed course, do not attempt at home) it also presumably has some volume of its own explicitly unsafe Rust and so there's no reason this could not be checked, and what makes the difference here is whether it was or was not.