by olliej 810 days ago
> There are a lot of use-after-frees and out-of-bounds accesses, buffer overflow in there.

Yes, and they’re in the runtime itself, which rust cannot protect you from. Rust cannot enforce lifetimes for GC objects any better than C++ already does, and it can’t protect you against OoB when the reason for the OoB is that the runtime is wrong about the object size, etc.

Rust does not magically make it impossible to have errors, it makes them harder by default, but the code where these go wrong is already largely using c++ to provide the same level of memory safety rust can provide in this environment.

The easiest way to understand this is if you use `vec` you won’t get unsafe oob, but if there’s a bug in `vec` rust (or any language) cannot protect you. Eg if there’s a JVM bug that breaks arrays then the fact that Java is memory safe isn’t relevant.
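To make the `vec` point concrete, here is a minimal sketch of what "no unsafe OoB" means for callers of the standard vector and slice types. The helper names `read_checked` and `read_or_panic` are made up for illustration; the guarantee shown depends entirely on the vector's own length bookkeeping being correct.

```rust
/// Hypothetical helper: reads `index` without ever touching memory outside the slice.
fn read_checked(items: &[u32], index: usize) -> Option<u32> {
    // `get` performs the bounds check and returns `None` instead of reading past the end.
    items.get(index).copied()
}

/// Hypothetical helper: indexing with `[]` is bounds-checked as well.
/// An out-of-range index panics rather than reading stray memory.
fn read_or_panic(items: &[u32], index: usize) -> u32 {
    items[index]
}
```

Calling `read_checked(&[1, 2, 3], 3)` yields `None`, and `read_or_panic` with the same arguments panics. Neither outcome is pleasant, but neither is a memory safety error.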


Also worth pointing out that in the specific problem area, highly optimized runtimes for interpreted/JIT-compiled languages, the borrow checker doesn't really have much to offer. Rust's safe memory paradigm more or less requires an "owner" for every pointer, and by definition the arbitrary graph of pointers in the executed code isn't going to have that. Any such runtime is going to be built on a metric ton of unsafe.

I would argue that this is trying to fit a square peg in a round hole: The JS objects should not be considered "owners" of each other in the Rust ownership sense. The Isolate / heap uniquely owns all items within it: The arbitrary graph of "references" between objects is merely data for the garbage collection algorithm and keys that can be used to get items from the heap. The objects have no real way to actually access the item "pointed to" by the reference directly. All access must go through the Isolate (at least in a lifetime sense).

This would allow writing the system with either very little or no unsafe code (depending on the accessing method and heap structure chosen), and Rust's borrow checker would actually correctly tell you that you cannot hold an Array's buffer borrowed while calling some unknown function because the function can mutate anything in the heap and thus needs to take exclusive access to it.

EDIT: Also note that with V8's "sandboxed pointers" all access already goes through the Isolate indirectly: All compressed pointers use a "base" pointer as part of their decompression. The same could be written (with unsafe) in Rust in such a way that the "base" pointer is the heap pointer that carries all ownership of the heap, and now decompressing a pointer would be equal to taking a borrow of the whole heap.

This is not strictly true. If you took a magic Rust wand to the whole V8 codebase that converted it into exactly the equivalent Rust code, Rust would definitely not help you one bit. But the code would also be filled with unsafe blocks. Many commenters have been saying that this is because it must be, otherwise the engine couldn't be written. For the JIT side of things I definitely agree. But as for the runtime's static implementations (those written in C++/Rust) it is entirely possible to write the runtime in safe Rust which definitely would catch eg. the simple example error they use in the blog post.

One option would be to use an existing gc crate: It checks at runtime that Rust's borrowing rules are upheld, so it would abort the program instead of doing the OoB access. This is of course not nice but it is memory safe. Obviously this would also make the engine less performant as now we're doing extra work on every read and write into a GC object.
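A rough sketch of that runtime check, using `Rc` and `RefCell` from the standard library as a stand-in. Whether a particular gc crate behaves the same way is an assumption here, and `JsArray`, `try_push` and `demo_conflict` are made-up names. The sketch uses `try_borrow_mut` so the conflict is reported as a value; the plain `borrow_mut` would panic at the same point.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a GC-managed JS array.
struct JsArray {
    elements: RefCell<Vec<f64>>,
}

/// Tries to append to the array. Returns `false` if the runtime borrow
/// check rejects the mutation because someone still holds a borrow.
fn try_push(array: &Rc<JsArray>, value: f64) -> bool {
    match array.elements.try_borrow_mut() {
        Ok(mut elements) => {
            elements.push(value);
            true
        }
        Err(_) => false,
    }
}

/// Returns (result while a buffer borrow is held, result after it is released).
fn demo_conflict() -> (bool, bool) {
    let array = Rc::new(JsArray { elements: RefCell::new(vec![1.0, 2.0]) });
    let alias = Rc::clone(&array);

    let while_borrowed = {
        // Simulates holding the array's buffer across a call that mutates it.
        let _buffer = array.elements.borrow();
        try_push(&alias, 3.0)
    };
    let after_release = try_push(&alias, 3.0);
    (while_borrowed, after_release)
}
```

The extra bookkeeping on every borrow is the performance cost mentioned above.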

Another option is to let go of our idea about GC objects holding mutable pointer access to one another. V8 already uses offsets into a 4 GB region to find the item; these are "compressed pointers" because V8 always knows it just needs to add the base back to get the actual pointer, and because C++ doesn't care about multiple objects holding pointers to one another. A Rusty alternative to this would be that the "compressed pointers" are considered only 32-bit offsets from some base pointer that is held by the Isolate: Now Rust would not allow these to be actual pointers or become references. Instead you'd need to implement some API at the Isolate that gives you a reference to an item based on that offset, and the reference's lifetime is determined by your Isolate's lifetime.

Now, any call to JavaScript is always (in a theoretical sense) capable of mutating anything within the Isolate's heap. This means that calling a JS function would require an exclusive `&mut Isolate` reference: This now means that Rust understands that it cannot hold a reference to the Array's buffer (which it got a reference to from the Isolate heap) during a call to a JS function.
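As a minimal sketch of the idea in the last two paragraphs: handles are plain 32-bit numbers, references come only from the Isolate, and anything that can run JS takes `&mut Isolate`. All names here (`ArrayHandle`, `Isolate`, `call_js_truncate`, `read_around_call`) are hypothetical, and the "JS call" is reduced to a function that empties the array.

```rust
/// Hypothetical 32-bit handle: an offset into the isolate's heap, not a pointer.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ArrayHandle(u32);

/// Hypothetical isolate that uniquely owns every array in its heap.
struct Isolate {
    arrays: Vec<Vec<f64>>,
}

impl Isolate {
    fn new() -> Self {
        Isolate { arrays: Vec::new() }
    }

    fn alloc_array(&mut self, elements: Vec<f64>) -> ArrayHandle {
        self.arrays.push(elements);
        ArrayHandle((self.arrays.len() - 1) as u32)
    }

    /// Shared borrow of an array's buffer; its lifetime is tied to `&self`.
    fn array(&self, handle: ArrayHandle) -> Option<&[f64]> {
        self.arrays.get(handle.0 as usize).map(|v| v.as_slice())
    }

    /// Stand-in for calling into JS: it may mutate anything in the heap,
    /// so it takes `&mut self`. Here the "script" just empties the array.
    fn call_js_truncate(&mut self, handle: ArrayHandle) {
        if let Some(v) = self.arrays.get_mut(handle.0 as usize) {
            v.clear();
        }
    }
}

/// Reads element `index`, calls into "JS", then reads the element again.
fn read_around_call(
    isolate: &mut Isolate,
    handle: ArrayHandle,
    index: usize,
) -> (Option<f64>, Option<f64>) {
    let before = isolate.array(handle).and_then(|buf| buf.get(index).copied());
    // Keeping the slice from `isolate.array(..)` alive across the next line
    // would not compile: the call needs `&mut Isolate` while the slice
    // still borrows the isolate.
    isolate.call_js_truncate(handle);
    let after = isolate.array(handle).and_then(|buf| buf.get(index).copied());
    (before, after)
}
```

The code is forced to re-fetch the buffer after the call, at which point it sees that the element is gone instead of reading through a stale reference.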

This sort of API would internally need some unsafe because it is a pointer offset we're doing here. If you don't like that, you can also go a step further and build your heap as a collection of vectors and then have your "pointers" be a combination of a type key and an index into that type's vector. With this sort of heap structure there is no unsafe usage needed, as getting a reference is just borrowing from a Vec within the heap.
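A small sketch of the "collection of vectors" layout, assuming a toy object model with only strings and plain objects. The names `Value`, `ObjectData` and `Heap` are illustrative and not taken from any existing engine. The enum variant plays the role of the type key and the `u32` is the index into that type's vector.

```rust
/// Hypothetical JS value: the variant is the type key, the `u32` an index
/// into that type's vector in the heap.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Value {
    Undefined,
    Number(f64),
    String(u32),
    Object(u32),
}

#[derive(Default)]
struct ObjectData {
    /// References to other heap items are stored as plain `Value`s.
    properties: Vec<(String, Value)>,
}

/// The heap owns every item; no item owns another.
#[derive(Default)]
struct Heap {
    strings: Vec<String>,
    objects: Vec<ObjectData>,
}

impl Heap {
    fn alloc_string(&mut self, s: &str) -> Value {
        self.strings.push(s.to_owned());
        Value::String((self.strings.len() - 1) as u32)
    }

    fn alloc_object(&mut self) -> Value {
        self.objects.push(ObjectData::default());
        Value::Object((self.objects.len() - 1) as u32)
    }

    /// Returns `false` if `object` is not a valid object handle.
    fn set_property(&mut self, object: Value, key: &str, value: Value) -> bool {
        let Value::Object(index) = object else { return false };
        match self.objects.get_mut(index as usize) {
            Some(data) => {
                data.properties.push((key.to_owned(), value));
                true
            }
            None => false,
        }
    }

    /// Looks up the most recently set property with this key.
    fn get_property(&self, object: Value, key: &str) -> Value {
        let Value::Object(index) = object else { return Value::Undefined };
        self.objects
            .get(index as usize)
            .and_then(|data| data.properties.iter().rev().find(|(k, _)| k.as_str() == key))
            .map(|(_, v)| *v)
            .unwrap_or(Value::Undefined)
    }

    fn string(&self, value: Value) -> Option<&str> {
        match value {
            Value::String(index) => self.strings.get(index as usize).map(String::as_str),
            _ => None,
        }
    }
}
```

An object can refer to itself (`heap.set_property(obj, "self", obj)`) without any ownership cycle, because the reference is only an index stored as data.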

These "Rusty" ways of building the heap offer some interesting philosophical thoughts to ponder: The V8 way is kind of a faithful structuring of the ECMAScript concept of "object ownership" in C++: Items refer to each other directly, and can even do it recursively. Ownership of the memory is just kind of... there. It's obvious, right? This object refers to that, so it owns it. Except maybe if they're recursive and not accessible elsewhere... I mean, just don't think about it! (Unless you're building the GC algorithm.) The safe Rust heap structures make the memory ownership quite concrete: The Isolate owns all of its GC objects / everything in its heap. GC objects and references between them are NOT actual memory ownership relations! This JS object referring to that JS object means nothing to the memory ownership. A JS object exists as long as the Isolate thinks it exists, even if no other object refers to it.

So, the C++ / unsafe Rust way to write the Heap tries to unify JS ownership logic and the host language ownership logic into one. When these two don't agree, bugs can happen (like in the simple example provided: JS Array semantics written naively in C++ cause issues because C++ has different expectations). The safe Rust way instead throws the JS ownership logic out of the host language entirely and forces the engine to implement that runtime ownership logic on its own.

Source: I am writing a JS engine in (safe) Rust using the "heap is a collection of vectors of Ts" method.

Sure, but a bug in Rust's vec is unlikely at this point & thus as long as you're in safe Rust you have no possibility of a memory error, which isn't the case for C++ vectors.

It can't protect you from lifetime issues with GC objects, but it can for almost everything else you're doing. They indicate 50% vulns are JIT and 50% are memory safety issues in the runtime, where GC is only part of it. If the bulk of the runtime issues are around GC lifetime confusion, I agree that Rust maybe wouldn't help. It might help to make sure you don't misuse the GC machinery which might be a significant mitigation, but given the bugs I've seen in the field around integrating with the GC, I doubt Rust would help with that class of bugs.

>Sure, but a bug in Rust's vec is unlikely at this point & thus as long as you're in safe Rust you have no possibility of a memory error

It has nothing to do with the built-in data structures because it doesn't even exist in the same space as them. The flaws themselves are in an algorithm's reasoning; it's not an issue that exists because somewhere in the codebase there's an out-of-bounds access on a vector. The issues are caused by said flawed reasoning generating bad machine code with erroneous pointer arithmetic. Note that it's the reasoning itself generating bad pointer arithmetic, not pointer arithmetic that exists explicitly in the codebase.

It's the kind of problem you need proof systems to solve. A substructural type system (or a near-approximation like Rust's ownership semantics) is simply not robust enough for the problem domain. You need full-blown dependent types for this kind of thing, something that can guarantee logical safety.

ATS can handle the job, but Rust can't.

Have you read the article? It's about the mitigation system (the heap sandbox) they have in place to limit the impact of the bugs the JIT will inevitably generate, not about improving the security of the generated code.

Correct, Rust won't help with the JIT part. But it would help with the sandbox escape which is the 2nd exploit that has to be paired with a JIT exploit now. As they noted, these sandbox escapes are primarily dealing with trivial memory safety issues that Rust would just make impossible to begin with, thereby significantly raising the efficacy of the sandbox mechanism.

The problem is closer to that of an attacker that can write unsafe{} blocks than it is to attackers finding a bug in Vec.

The problem with the JIT? Sure. I'm talking about the sandbox escapes which is what you now need to pair JIT exploits with to get an RCE in the renderer. Rust would help eliminate those sandbox escapes more effectively than trying to continue to harden the C++ codebase.

You’re missing the point. You’re right there are unlikely to be bugs in vec, but there are also unlikely to be bugs in std::vector or WTF::Vector both of which error out on OoB (chrome/v8 uses hardened libc++).

I was using `vec` as an example of runtime code that is fundamentally implemented in unsafe code. The errors that are being discussed are errors in the runtime - eg the unsafe{} blocks of rust. It’s very difficult to write code in v8/blink (or JSC/webkit) that interacts with the relevant JS runtime in ways that make the code unsafe - just as you cannot normally interact with `vec` in a way that causes a memory safety error - however the runtime’s implementation of the safe interface still has to eventually perform unsafe operations. The bugs that you see in V8, JSC, etc are almost invariably in code that would necessarily be in an unsafe region in rust, and so would not be preventable by rust.

Another example: `Arc`, `Rc`, and `Box` etc all allocate memory, and all your rust code can be built on those, and be safe (assuming no bugs in the refcounting, no compiler lifetime errors, etc), but the allocator beneath them still has to do everything correctly and the operations it performs are largely unsafe. There’s nothing rust can do to prevent a logic error from returning overlapping pointers. You can create lots of abstractions to make it harder to screw up, but you are the runtime at this point, so the code that requires the safety rules is also the thing specifying those rules. Eg the erroneous state/logic that leads to an incorrect allocation may be the same state/logic you are testing against to ensure you aren’t making an erroneous allocation. You can see how that impacts the safety profile of the code.
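A toy illustration of the overlapping-allocation point, assuming a hypothetical bump allocator (`Bump`) that hands out byte ranges of an arena instead of raw pointers. `alloc_buggy` is deliberately wrong. It contains no unsafe and compiles cleanly, yet a real allocator that turned these ranges into pointers would hand out overlapping memory.

```rust
use std::ops::Range;

/// Hypothetical bump allocator handing out byte ranges of a fixed-size arena.
struct Bump {
    next: usize,
    capacity: usize,
}

impl Bump {
    fn new(capacity: usize) -> Self {
        Bump { next: 0, capacity }
    }

    /// Correct version: the next allocation starts where this one ends.
    fn alloc(&mut self, size: usize) -> Option<Range<usize>> {
        let end = self.next.checked_add(size)?;
        if end > self.capacity {
            return None;
        }
        let range = self.next..end;
        self.next = end;
        Some(range)
    }

    /// Deliberately buggy version: advances one byte too few.
    /// The type system has nothing to object to.
    fn alloc_buggy(&mut self, size: usize) -> Option<Range<usize>> {
        let end = self.next.checked_add(size)?;
        if end > self.capacity {
            return None;
        }
        let range = self.next..end;
        self.next = end.saturating_sub(1); // logic error: should be `end`
        Some(range)
    }
}

/// True if the two half-open ranges share at least one byte.
fn overlaps(a: &Range<usize>, b: &Range<usize>) -> bool {
    a.start < b.end && b.start < a.end
}
```

Two 16-byte requests from `alloc_buggy` come back as `0..16` and `15..31`, which share byte 15.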

When JSC or V8 have a use after free vulnerability it’s almost always a runtime error because the overwhelming majority of allocations made by both engines are via their own GCs, and so definitionally should be sound. But if there’s a bug in the runtime (a missing barrier, or a scanning error in JSC), then objects can be erroneously collected and that’s how a UaF happens. There’s nothing rust or any safe language can do to make those errors impossible or unexploitable. All the runtime can do is structure the code to make errors as hard as possible, in rust that means minimizing the amount of time in unsafe{}, and add mitigations such that any error that does happen is hard to exploit.

When V8 and JSC have buffer overflows it’s because the metadata for an object says “there is this much memory available” but that is incorrect. Again rust cannot protect against this: you’re in the position of a `vec` with incorrect bounds information.
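A sketch of the "incorrect bounds information" situation, assuming a hypothetical `JsBuffer` whose recorded length is kept separately from its real storage. The code only compares the two checks and never performs the bad access; in a real runtime the access behind `runtime_check_passes` would be a raw pointer read.

```rust
/// Hypothetical object whose length metadata lives apart from its storage.
struct JsBuffer {
    storage: Vec<u8>,
    /// The length the runtime believes the buffer has.
    recorded_len: usize,
}

impl JsBuffer {
    /// The runtime's own bounds check, which trusts the metadata.
    fn runtime_check_passes(&self, index: usize) -> bool {
        index < self.recorded_len
    }

    /// What is actually backed by allocated memory.
    fn actually_in_bounds(&self, index: usize) -> bool {
        index < self.storage.len()
    }
}
```

With `storage` holding 8 bytes and `recorded_len` set to 16, index 12 passes the runtime's check while lying outside the real allocation. The check is only as good as the metadata it reads.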

And that goes on for all the other bug classes. The vast majority of the security benefits rust offers for a language and vm runtime are available - and used - in c++. The bugs are in the code that would necessarily be unsafe{}.

Now in blink/webkit the moment you get beyond the relevant JS runtime you run straight into the standard C++ nightmare that rust, swift, JS, C#,… prevent so that’s another thing altogether.

You've said a lot and a lot of it is accurate, but none of it really applies to the sandbox which is what this blog post is about. Here's the most relevant part of the article:

> This code makes the (reasonable) assumption that the number of properties stored directly in a JSObject must be less than the total number of properties of that object. However, assuming these numbers are simply stored as integers somewhere in the JSObject, an attacker could corrupt one of them to break this invariant. Subsequently, the access into the (out-of-sandbox) std::vector would go out of bounds. Adding an explicit bounds check, for example with an SBXCHECK, would fix this.

> *Encouragingly, nearly all "sandbox violations" discovered so far are like this*

Emphasis mine.
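For illustration, a rough Rust rendering of the kind of check the quoted passage describes. This is a sketch and not V8's code: `JsObjectFields`, `SandboxError` and `property_names` are invented names, and the two integer fields are assumed to be readable from attacker-writable memory.

```rust
/// Hypothetical view of a JSObject's integer fields, which an attacker
/// inside the sandbox is assumed to be able to overwrite.
struct JsObjectFields {
    in_object_properties: u32,
    total_properties: u32,
}

#[derive(Debug, PartialEq)]
enum SandboxError {
    CorruptedObject,
}

/// Builds the (out-of-sandbox) list of property names.
fn property_names(fields: &JsObjectFields) -> Result<Vec<String>, SandboxError> {
    let total = fields.total_properties as usize;
    let in_object = fields.in_object_properties as usize;

    // Check the invariant explicitly instead of assuming it holds.
    if in_object > total {
        return Err(SandboxError::CorruptedObject);
    }

    let mut names = vec![String::new(); total];
    for i in 0..in_object {
        // Indexing is bounds-checked too; a broken invariant would panic
        // here rather than write outside `names`.
        names[i] = format!("in_object_{i}");
    }
    Ok(names)
}
```

A corrupted object with 9 in-object properties and 3 total properties is rejected with `SandboxError::CorruptedObject` before any write happens.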

This sandbox is about injecting an indirection layer to prevent those JIT issues (which Rust doesn't help with) from being used to escape the isolate's memory. What that means is that a JIT issue has to be combined with a sandbox escape to get the same exploit that the JIT issue alone gave you without the sandbox.

Thus, sandbox escapes are a real concern & a critical part of the security model. Those happen because of a memory safety issue in the C++ code, not because of the JIT. That's 100% by design because if the JIT bypassed the sandbox the sandbox wouldn't do anything.

A Rust sandbox written with `#![forbid(unsafe_code)]` wouldn't have these issues. It might still have some, because even safe Rust isn't 100% guaranteed to be sound due to compiler bugs, but now you have to pair a JIT issue with a memory safety issue in compiled safe Rust, which is much, much harder. It's going to be at least an order of magnitude more reliable than C++ even with a hardened libc++. That being said, I don't believe the sandbox alone would be enough. I believe `JSObject::GetPropertyNames` is in the runtime & that again isn't directly invoked by JIT nor is it code that requires unsafe.