by egnehots 810 days ago
They say that Rust is not enough and dismiss it quickly:

> V8 vulnerabilities are rarely "classic" memory corruption bugs (use-after-frees, out-of-bounds accesses, etc.) but instead subtle logic issues which can in turn be exploited to corrupt memory. As such, existing memory safety solutions are, for the most part, not applicable to V8. In particular, neither switching to a memory safe language, such as Rust, nor using current or future hardware memory safety features, such as memory tagging, can help with the security challenges faced by V8 today.

But looking at the awesome list they provided:

https://docs.google.com/spreadsheets/d/1lkNJ0uQwbeC1ZTRrxdtu...

There are a lot of use-after-frees and out-of-bounds accesses, buffer overflow in there...

3 comments

> There are a lot of use-after-frees and out-of-bounds accesses, buffer overflow in there.

Yes, and they’re in the runtime itself, which Rust cannot protect you from. Rust cannot enforce lifetimes for GC objects any more than C++ already does, and it can’t protect you against OoB when the reason for the OoB is that the runtime is wrong about the object size, etc.

Rust does not magically make it impossible to have errors, it makes it harder by default, but the cases where these go wrong are already largely using c++ to provide the same level of memory safety rust can in the environment.

The easiest way to understand this is if you use `vec` you won’t get unsafe oob, but if there’s a bug in `vec` rust (or any language) cannot protect you. Eg if there’s a JVM bug that breaks arrays then the fact that Java is memory safe isn’t relevant.
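
A minimal sketch of that `vec` point, using only the standard library. `RuntimeArray` and its `claimed_len` field are hypothetical names invented for illustration: they stand in for a runtime-owned container whose length metadata is tracked separately from its storage.

```rust
/// Safe lookup: the bounds check lives inside the standard library.
fn read_checked(values: &[u32], index: usize) -> Option<u32> {
    // `get` returns `None` instead of reading past the end.
    values.get(index).copied()
}

/// Hypothetical runtime-owned array whose length metadata is kept
/// separately from the storage and can therefore be wrong.
struct RuntimeArray {
    storage: Vec<u32>,
    claimed_len: usize,
}

impl RuntimeArray {
    fn read(&self, index: usize) -> Result<u32, &'static str> {
        // The logical check trusts the metadata...
        if index >= self.claimed_len {
            return Err("index outside claimed length");
        }
        // ...and the second check is the one an unchecked, metadata-trusting
        // implementation would skip.
        self.storage
            .get(index)
            .copied()
            .ok_or("length metadata disagrees with storage")
    }
}
```

The caller of `read_checked` cannot cause an out-of-bounds read. Whether `RuntimeArray::read` is safe depends entirely on how the container itself is implemented, which is the distinction being drawn here.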

Also worth pointing out that in the specific problem area, highly optimized runtimes for interpreted/JIT-compiled languages, the borrow checker doesn't really have much to offer. Rust's safe memory paradigm more or less requires an "owner" for every pointer, and by definition the arbitrary graph of pointers in the executed code isn't going to have that. Any such runtime is going to be built on a metric ton of unsafe.

I would argue that this is trying to fit a square peg in a round hole: The JS objects should not be considered "owners" for each other in the Rust ownership sense. The Isolate / heap uniquely owns all items within it: The arbitrary graph of "references" between objects are merely data for the garbage collection algorithm and keys that can be used to get items from the heap. They have no real way to actually access the item "pointed to" by the reference directly. All access must go through the Isolate (at least in a lifetime sense).

This would allow writing the system with either very little or no unsafe code (depending on the accessing method and heap structure chosen), and Rust's borrow checker would actually correctly tell you that you cannot hold an Array's buffer borrowed while calling some unknown function because the function can mutate anything in the heap and thus needs to take exclusive access to it.

EDIT: Also note that V8's "sandboxed pointers" access all already goes through the Isolate indirectly: All compressed pointers as part of their decompression use a "base" pointer. The same could be written (with unsafe) in Rust in such a way that the "base" pointer is the heap pointer that carries all ownership of the heap, and now decompressing pointers would be equal to taking a borrow of the whole heap.
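
A rough sketch of that idea, written in safe Rust rather than with real pointer arithmetic. `Compressed`, `Isolate`, `cage`, `slot` and `slot_mut` are all hypothetical names; this is not V8's layout, only an illustration of "decompressing borrows the whole heap".

```rust
/// Hypothetical stand-in for a compressed pointer: a 32-bit offset
/// that is meaningless without the isolate that owns the heap.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Compressed(u32);

/// The isolate plays the role of the "base" pointer and owns every slot.
struct Isolate {
    cage: Vec<u64>,
}

impl Isolate {
    fn new(slots: usize) -> Self {
        Isolate { cage: vec![0; slots] }
    }

    /// "Decompression": the returned reference borrows the isolate,
    /// so it cannot outlive it or coexist with a mutable borrow.
    fn slot(&self, ptr: Compressed) -> Option<&u64> {
        self.cage.get(ptr.0 as usize)
    }

    /// Mutable access requires exclusive access to the whole isolate.
    fn slot_mut(&mut self, ptr: Compressed) -> Option<&mut u64> {
        self.cage.get_mut(ptr.0 as usize)
    }
}
```

A version closer to what the comment describes would replace the `Vec` lookup with an unsafe offset from a base address, while keeping the same signatures so the borrow of the isolate is still what the caller sees.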

This is not strictly true. If you took a magic Rust wand to the whole V8 codebase that converted it into exactly the equivalent Rust code, Rust would definitely not help you one bit. But the code would also be filled with unsafe blocks. Many commenters have been saying that this is because it must be, otherwise the engine couldn't be written. For the JIT side of things I definitely agree. But as for the runtime's static implementations (those written in C++/Rust) it is entirely possible to write the runtime in safe Rust which definitely would catch eg. the simple example error they use in the blog post.

One option would be to use an existing gc crate: It checks at runtime that Rust's borrowing rules are upheld, so it would abort the program instead of doing the OoB access. This is of course not nice but it is memory safe. Obviously this would also make the engine less performant as now we're doing extra work on every read and write into a GC object.
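
To illustrate the runtime-checked flavour without depending on any particular gc crate, here is a sketch that uses the standard library's `Rc<RefCell<T>>` as a stand-in. `GcArray` and `shrink_during_read` are hypothetical names. It uses `try_borrow_mut` so the conflict can be observed as an error; plain `borrow_mut` would panic instead.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical GC-style handle: shared ownership, borrow rules
/// enforced at runtime instead of at compile time.
type GcArray = Rc<RefCell<Vec<f64>>>;

/// Simulates a callback that tries to shrink the array while its
/// buffer is still being read.
fn shrink_during_read(array: &GcArray) -> Result<(), String> {
    // Shared borrow of the buffer, still alive below.
    let buffer = array.borrow();

    // The "callback" asks for exclusive access and is refused.
    let result = match array.try_borrow_mut() {
        Ok(mut items) => {
            items.clear();
            Ok(())
        }
        Err(err) => Err(format!("mutation rejected: {}", err)),
    };

    // The earlier borrow is still valid because the mutation never happened.
    let _len = buffer.len();
    result
}
```

The extra bookkeeping on every `borrow` and `borrow_mut` is the performance cost mentioned above.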

Another option is to let go of our idea about GC objects holding mutable pointer access to one another. V8 already uses offsets from a 4 GB offset to find the item; these are "compressed pointers" because they always know they need to just upshift a bit to get the actual pointer, and because C++ doesn't care about multiple objects holding pointers to one another. A Rusty alternative to this would be that the "compressed pointers" are considered only 32-bit offsets from some base pointer that is held by the Isolate: Now Rust would not allow these to be actual pointers or become references. Instead you'd need to implement some API at the Isolate that gives you a reference to an item based on that offset, and the reference's lifetime is determined by your Isolate's lifetime.

Now, any call to JavaScript is always (in a theoretical sense) capable of mutating anything within the Isolate's heap. This means that calling a JS function would require an exclusive `&mut Isolate` reference: This now means that Rust understands that it cannot hold a reference to the Array's buffer (which it got a reference to from the Isolate heap) during a call to a JS function.

This sort of API would internally need some unsafe because it is a pointer offset we're doing here. If you don't like that, you can also go a step further and build your heap as a collection of vectors and then have your "pointers" be a combination of a type key and an index into that type's vector. With this sort of heap structure there is no unsafe usage needed, as getting a reference is just borrowing from a Vec within the heap.
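
A minimal sketch of the "collection of vectors" heap, assuming made-up names throughout (`Value`, `Heap`, `Isolate`, `call_js`, `first_after_callback`). The enum variant is the type key and the payload is the index into that type's vector. `call_js` is a stand-in for running arbitrary JavaScript, so it takes `&mut self`.

```rust
/// A "pointer" is a type key (the variant) plus an index into that type's vector.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Value {
    Number(f64),
    Array(usize),
    String(usize),
}

#[derive(Default)]
struct Heap {
    arrays: Vec<Vec<Value>>,
    strings: Vec<String>,
}

struct Isolate {
    heap: Heap,
}

impl Isolate {
    fn new() -> Self {
        Isolate { heap: Heap::default() }
    }

    fn alloc_array(&mut self, items: Vec<Value>) -> Value {
        self.heap.arrays.push(items);
        Value::Array(self.heap.arrays.len() - 1)
    }

    fn alloc_string(&mut self, text: &str) -> Value {
        self.heap.strings.push(text.to_string());
        Value::String(self.heap.strings.len() - 1)
    }

    /// Getting a reference is just borrowing from a Vec inside the heap.
    fn array(&self, value: Value) -> Option<&[Value]> {
        match value {
            Value::Array(i) => self.heap.arrays.get(i).map(|a| a.as_slice()),
            _ => None,
        }
    }

    fn string(&self, value: Value) -> Option<&str> {
        match value {
            Value::String(i) => self.heap.strings.get(i).map(|s| s.as_str()),
            _ => None,
        }
    }

    /// Stand-in for calling arbitrary JS: it may mutate anything in the heap.
    fn call_js(&mut self, target: Value) {
        if let Value::Array(i) = target {
            if let Some(items) = self.heap.arrays.get_mut(i) {
                items.clear(); // e.g. a callback that sets `array.length = 0`
            }
        }
    }
}

fn first_after_callback(iso: &mut Isolate, arr: Value) -> Option<Value> {
    // The borrow checker rejects this ordering, because `buf` borrows the
    // heap while `call_js` needs exclusive access to it:
    //     let buf = iso.array(arr)?;
    //     iso.call_js(arr);
    //     buf.first().copied()
    iso.call_js(arr);
    // Re-fetch after the call; the array may have changed.
    iso.array(arr)?.first().copied()
}
```

Nothing here stops a handle from pointing at a slot the collector has since reused; that logic still has to be right, but getting it wrong yields a wrong value or `None` rather than an out-of-bounds read.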

These "Rusty" ways of building the heap offer some interesting philosophical thoughts to ponder: The V8 way is kind of a faithful structuring of the ECMAScript concept of "object ownership" in C++: Items refer to each other directly, and can even do it recursively. Ownership of the memory is just kind of... there. It's obvious, right? This object refers to that, so it owns it. Except maybe if they're recursive and not accessible elsewhere... I mean, just don't think about it! (Unless you're building the GC algorithm.) The safe Rust heap structures make the memory ownership quite concrete: The Isolate owns all of its GC objects / everything in its heap. GC objects and references between them are NOT actual memory ownership relations! This JS object referring to that JS object means nothing to the memory ownership. A JS object exists as long as the Isolate thinks it exists, even if no other object refers to it.

So, the C++ / unsafe Rust way to write the Heap tries to unify JS ownership logic and the host language ownership logic into one. When these two don't agree, bugs can happen (like in the simple example provided: JS Array semantics written naively in C++ cause issues because C++ has different expectations). The safe Rust way instead throws the JS ownership logic out of the host language entirely and forces the engine to implement that runtime ownership logic on its own.

Source: I am writing a JS engine in (safe) Rust using the "heap is a collection of vectors of Ts" method.

Sure, but a bug in Rust's vec is unlikely at this point & thus as long as you're in safe Rust you have no possibility of a memory error, which isn't the case for C++ vectors.

It can't protect you from lifetime issues with GC objects, but it can for almost everything else you're doing. They indicate 50% vulns are JIT and 50% are memory safety issues in the runtime, where GC is only part of it. If the bulk of the runtime issues are around GC lifetime confusion, I agree that Rust maybe wouldn't help. It might help to make sure you don't misuse the GC machinery which might be a significant mitigation, but given the bugs I've seen in the field around integrating with the GC, I doubt Rust would help with that class of bugs.

>Sure, but a bug in Rust's vec is unlikely at this point & thus as long as you're in safe Rust you have no possibility of a memory error

It has nothing to do with the built-in data structures because it doesn't even exist in the same space as them. The flaws themselves are in an algorithm's reasoning; it's not an issue that exists because somewhere in the codebase there's an out-of-bounds access on a vector. The issues are caused by said flawed reasoning generating bad machine code with erroneous pointer arithmetic. Note that it's the reasoning itself generating bad pointer arithmetic, not pointer arithmetic that exists explicitly in the codebase.

It's the kind of problem you need proof systems to solve. A substructural type system (or a near-approximation like Rust's ownership semantics) is simply not robust enough for the problem domain, you need full blown dependent types for this kind of thing, something that can guarantee logical safety.

ATS can handle the job, but Rust can't.

Have you read the article? It's about the mitigation system (the heap sandbox) they have in place to limit the damage that bugs in the JIT can inevitably cause, not about improving the security of the generated code.

Correct, Rust won't help with the JIT part. But it would help with the sandbox escape which is the 2nd exploit that has to be paired with a JIT exploit now. As they noted, these sandbox escapes are primarily dealing with trivial memory safety issues that Rust would just make impossible to begin with, thereby significantly raising the efficacy of the sandbox mechanism.

The problem is closer to that of an attacker that can write unsafe{} blocks than it is to attackers finding a bug in Vec.

The problem with the JIT? Sure. I'm talking about the sandbox escapes which is what you now need to pair JIT exploits with to get an RCE in the renderer. Rust would help eliminate those sandbox escapes more effectively than trying to continue to harden the C++ codebase.

You’re missing the point. You’re right there are unlikely to be bugs in vec, but there are also unlikely to be bugs in std::vector or WTF::Vector both of which error out on OoB (chrome/v8 uses hardened libc++).

I was using `vec` as an example of runtime code that is fundamentally implemented in unsafe code. The errors that are being discussed are errors in the runtime - eg the unsafe{} blocks of rust. It’s very difficult to write code in v8/blink (or JSC/webkit) that interacts with the relevant JS runtime in ways that make the code unsafe - just as you cannot normally interact with `vec` in a way that causes a memory safety error - however the runtime’s implementation of the safe interface still has to eventually perform unsafe operations. The bugs that you see in V8, JSC, etc are almost invariably in code that would necessarily be in an unsafe region in rust, and so would not be preventable in rust.

Another example: `Arc`, `Rc`, and `Box` etc all allocate memory, and all your rust code can be built on those, and be safe (assuming no bugs in the refcounting, no compiler lifetime errors, etc), but the allocator beneath them still has to do everything correctly and the operations it performs are largely unsafe. There’s nothing rust can do to prevent a logic error from returning overlapping pointers. You can create lots of abstractions to make it harder to screw up, but you are the runtime at this point so the code that is requiring safety rules is also the thing specifying those rules. Eg if the erroneous state/logic that leads to an incorrect allocation is the same state/logic you are testing against to ensure you aren’t making an erroneous allocation. You can see how that impacts the safety profile of the code.
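
A toy model of that allocator point, kept in safe Rust by handing out index ranges instead of real pointers. `BumpAllocator`, `alloc_buggy` and `overlaps` are hypothetical names, and the bug is deliberate. The compiler accepts both versions, because "forgot to advance the cursor" is a logic error, not a type or lifetime error.

```rust
use std::ops::Range;

/// Toy bump allocator over an imaginary arena of `capacity` bytes.
struct BumpAllocator {
    next: usize,
    capacity: usize,
}

impl BumpAllocator {
    /// Correct version: each allocation starts where the previous one ended.
    fn alloc(&mut self, size: usize) -> Option<Range<usize>> {
        let end = self.next.checked_add(size)?;
        if end > self.capacity {
            return None;
        }
        let range = self.next..end;
        self.next = end;
        Some(range)
    }

    /// Deliberately buggy version: it never advances `next`, so every
    /// allocation aliases the previous one.
    fn alloc_buggy(&mut self, size: usize) -> Option<Range<usize>> {
        let end = self.next.checked_add(size)?;
        if end > self.capacity {
            return None;
        }
        Some(self.next..end)
    }
}

/// True if two half-open ranges share at least one index.
fn overlaps(a: &Range<usize>, b: &Range<usize>) -> bool {
    a.start < b.end && b.start < a.end
}
```

In a real allocator those ranges would be turned into raw pointers inside an unsafe block, and the overlap would become two live objects sharing memory.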

When JSC or V8 have a use after free vulnerability it’s almost always a runtime error because the overwhelming majority of allocations made by both engines are via their own GCs, and so definitionally should be sound. But if there’s a bug in the runtime (a missing barrier, or a scanning error in JSC), then objects can be erroneously collected and that’s how a UaF happens. There’s nothing rust or any safe language can do to make those errors impossible or unexploitable. All the runtime can do is structure the code to make errors as hard as possible, in rust that means minimizing the amount of time in unsafe{}, and add mitigations such that any error that does happen is hard to exploit.

When V8 and JSC have buffer overflows it’s because the metadata for an object says “there is this much memory available” but that is incorrect. Again rust cannot protect against this: you’re in the position of a `vec` with incorrect bounds information.

And that goes on for all the types of bug class. The vast majority of the security benefits rust offers for a language and vm runtime are available - and used - in c++. The bugs are in the code that would necessarily be unsafe{}.

Now in blink/webkit the moment you get beyond the relevant JS runtime you run straight into the standard C++ nightmare that rust, swift, JS, C#,… prevent so that’s another thing altogether.

You've said a lot and a lot of it is accurate, but none of it really applies to the sandbox which is what this blog post is about. Here's the most relevant part of the article:

> This code makes the (reasonable) assumption that the number of properties stored directly in a JSObject must be less than the total number of properties of that object. However, assuming these numbers are simply stored as integers somewhere in the JSObject, an attacker could corrupt one of them to break this invariant. Subsequently, the access into the (out-of-sandbox) std::vector would go out of bounds. Adding an explicit bounds check, for example with an SBXCHECK, would fix this.

> *Encouragingly, nearly all "sandbox violations" discovered so far are like this*

Emphasis mine.

This sandbox is about injecting an indirection layer to prevent those JIT issues (which Rust doesn't help with) from being used to escape the isolate's memory. What that means is that a JIT issue has to be combined with a sandbox escape to achieve the same exploit as the JIT issue alone without the sandbox.

Thus, sandbox escapes are a real concern & a critical part of the security model. Those happen because of a memory safety issue in the C++ code, not because of the JIT. That's 100% by design because if the JIT bypassed the sandbox the sandbox wouldn't do anything.

A Rust sandbox written with `#![forbid(unsafe_code)]` wouldn't have these issues. It might still have some, because even safe Rust isn't 100% guaranteed to be sound due to compiler bugs, but now you're having to pair a JIT issue with a compiled Rust memory safety issue, which is much, much harder. It's going to be at least an order of magnitude more reliable than C++ even with a hardened libc++. That being said, I don't believe the sandbox alone would be enough. I believe `JSObject::GetPropertyNames` is in the runtime & that again isn't directly invoked by JIT nor is it code that requires unsafe.
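
A sketch of what the quoted scenario could look like in safe Rust. It is modelled loosely on the description in the quote, not on V8's actual code: `JsObject`, its two count fields, `property_names` and the placeholder `prop{i}` names are all assumptions made for illustration.

```rust
#![forbid(unsafe_code)]

/// Hypothetical object whose counts live in attacker-writable memory.
struct JsObject {
    total_properties: usize,
    in_object_properties: usize,
}

/// Fills a vector sized by the total count with the in-object names.
/// The code assumes `in_object_properties <= total_properties`.
fn property_names(obj: &JsObject) -> Result<Vec<String>, String> {
    let mut names = vec![String::new(); obj.total_properties];
    for i in 0..obj.in_object_properties {
        // `get_mut` plays the role of the explicit bounds check: a
        // corrupted count surfaces as an error, not as a stray write.
        let slot = names.get_mut(i).ok_or_else(|| {
            format!("in-object index {} exceeds total {}", i, obj.total_properties)
        })?;
        *slot = format!("prop{}", i);
    }
    Ok(names)
}
```

Writing `names[i]` instead would panic on the corrupted object. That is a crash rather than memory corruption, which is the trade being argued for here.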

Type confusion is also a very common attack against JS runtimes and V8 specifically. Of course, it's not trivial to build a high-performance JS runtime without playing around with pointer types pretty liberally, so I can understand saying "Rust won't fix this" in regards to those attacks.

But those attacks would basically not be possible against a runtime built on top of Java or C#.

Yes because the attack would be against the .net or Java VM.

The JVM - especially in the era of applets - had an illustrious history of VM bugs. We don’t know how bad they would have been because in the era of extremely complex exploits applets essentially do not exist. Neither .net nor the jvm are exposed to the degree of attacks the js engines are, and there’s no strong reason to believe they don’t have similar bugs today.

I'm not singing the praises of the JVM here, it's just a simple fact that if you implement your runtime in a higher level language you're exposed to a smaller number of potential vulnerabilities. Unchecked array dereferences turn into bounds-checked array dereferences; unchecked typecasts turn into checked typecasts. Null pointer dereferences turn into null reference exceptions. Etc.
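
The same three substitutions can be shown in Rust, which this thread already uses as its reference point. This is only an illustration of "checked instead of unchecked"; the function names are made up, and it says nothing about how the JVM or CLR implement their checks.

```rust
use std::any::Any;

/// Checked typecast: a wrong guess about the type yields `None`
/// instead of reinterpreting the bytes.
fn as_number(value: &dyn Any) -> Option<f64> {
    value.downcast_ref::<f64>().copied()
}

/// Bounds-checked array access.
fn element(values: &[f64], index: usize) -> Option<f64> {
    values.get(index).copied()
}

/// "Null" is an explicit `None` that must be handled before use.
fn length_or_zero(text: Option<&str>) -> usize {
    text.map(|t| t.len()).unwrap_or(0)
}
```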

Of course once you start jitting native code, all of that is off the table. Unless you jit to java/.net bytecode, I guess.

No, you're missing the point. The whole point is you're implementing the runtime that defines the safety semantics. Your proposal is essentially "implement your JS engine GC on top of the JVM by just using the JVM's GC", i.e. don't implement the GC yourself. The unsafe code is now the JVM GC, and you've just moved the problem from "implement the JS engine's GC" to "implement the JVM's GC", and the same problems continue to exist.

I am really struggling to understand where this gap in understanding is occurring. It does not matter what environment or language you implement a JS engine (or whatever) in. The attacker is going to attack the unsafe portion of the runtime. If you build your JS engine on top of the JVM, then the attacker is not going to attack your JS engine's runtime, they attack the JVM's.

The JVM, .NET, etc runtimes are not doing anything different to what the JS engine runtimes are doing, and aren't magically free of the same bugs. If anything they're probably doing less to protect from or prevent attacks, because they have a much much smaller attack surface (because they aren't generally exposed to everything on the internet). The reason attackers have to target the JS engine runtime is that the JS sandbox does not allow the general system access that "correct" and completely uncompromised .NET or JVM code has. Attacks on the JVM and .NET generally mean "convince the VM to load correct code that does something that a specific app/service is not meant to do but the VM generally allows applications to do", whereas a JS VM does not allow an attacker to do anything outside of the JS sandbox, so they must compromise the runtime.

It may be easier to understand if we try to present this in a different way:

JSC can be compiled as an interpreter for any cpu architecture because there is a fall back C backend for the interpreter code generator, so you can compile JSC to WASM. Then you could make a version of webkit that executed all JS through the WASM build of JSC running under the native JSC runtime. You've now built your JS engine on top of a safe runtime (WASM), but it should hopefully be obvious that an attacker is simply going to continue targeting the native JSC runtime.

People have previously shipped JS runtimes on top of .NET and the JVM. It's not a question of 'who writes the GC', it's more fundamental.

If you JIT your JavaScript down into raw native code that bangs rocks together to dereference pointers, you need to make sure your generated code handles pointers correctly. You need to make sure to get all your bounds checks right, etc.

Sure, the JVM could somehow have a 30-year-old bug in its array bounds checks. But if you're JITing javascript to an IR that doesn't have raw pointers and instead uses strongly-typed object references and bounds-checked arrays, you have automatically closed off a whole category of defects. At the point where you're saying "sure, but what if the JVM messes up array bounds checks?" you might as well be asking whether v8 can really afford to rely on read-only pages and guard pages for its security sandbox. What if the kernel is broken?

I mentioned type confusion attacks in particular because they're a class of attack that generally doesn't work against java or .net applications because values can't change type arbitrarily during execution. Local variables and parameters have known types, object type casts are checked, array elements are typechecked before being stored, etc. Obviously you pay a cost for this, and if you have threads the ABA problem rears its head, but JS is single-threaded by design.

Between hosting JS on the JVM or in WASM, WASM is probably a safer choice since it's such a constrained sandbox. But the JS runtime you're running inside of the WASM sandbox is still built in C, banging rocks together to dereference pointers. Hopefully you're running a modern security-hardened JS runtime inside that sandbox, and you haven't turned off all the security mitigations thanks to wasm's lack of page protections.

> People have previously shipped JS runtimes on top of .NET and the JVM. It's not a question of 'who writes the GC', it's more fundamental.

Yes, it is.

That's literally the whole point.

The bugs in this post are bugs in the runtime - the implementation of the Gc, the implementation of the object metadata.

If you build your JS engine on top of a safe/managed environment the attacker is not interested in attacking logic bugs in your JS engine, they're targeting the runtime. All you have done is move the problem from "the attacker exploited bugs in the JS runtime, how do we prevent those?" to "the attacker exploited bugs in the Java (or whatever) runtime, how do we prevent those?". The problem is that at some point any safe language (java, rust, or even - as here - javascript) has a runtime that has to be implemented in an unsafe environment, and that is what is being attacked.

The JVM and .NET are not magical, they have the same bugs - albeit with significantly less hardening and mitigations - as JS engines.

What you are saying is that the JS engine should be written in Java (or whatever) so it's safe. But now how do you fix the JVM? Maybe rewrite that in C#/.NET? But then you have to fix the .NET VM? Maybe Rust? Of course then we need to ensure that's safe, so we should run that all under wasm. Of course that means you're back at the JS engine you started with.

V8 is a runtime for JS exactly like the JVM is a runtime for Java and CLR is for C#. Which means that whatever sandboxing V8 needs, the JVM and the CLR would need it as well. I don't know what makes you think that the JVM and the CLR have already solved the problem, but not V8.

Would be interesting if they took hardening ideas from kernels that try to solve this (e.g. https://security.apple.com/blog/towards-the-next-generation-...).

The attacks would not be possible against a runtime written in JavaScript as well, by that reasoning.

That's called self-hosting, and it's widely used in JS runtimes to implement various built-ins instead of writing them in C++. It provides superior safety and the ability to inline builtins into their callers.

Haha, I wish I had come up with that response :)

You didn't read the article, then. They clearly explain how even if Rust were used for the entirety of v8, there would still be memory corruption, because the memory corruption is happening in code that is JIT compiled.

I think they did because all the vulnerabilities in the hardening they talk about are because of C++ memory safety & would be fixed by Rust (i.e. their hardening technique doesn't target JIT exploits themselves).

The whole article is about exploits that leverage the compiler itself, with details.

No, it mentions that as an introduction, and then talks about the system for mitigating them, which also has bugs which they admit are of the simple kind that a memory-safe language would prevent.

No, this very much does help protect against JIT exploits.

JIT code contains code that accesses the data structures they are sandboxing. By sandboxing those objects, the JIT code is limited in what it can do.

This might help you understand: An example the article gives is if an optimization pass has a bug that forgets a check. Then it may emit JIT code that will access a data structure that it should not. But, thanks to this sandboxing, that object cannot be outside the sandbox, nor refer to anything outside the sandbox, so a JIT exploit is limited in what it can achieve.

My point was that the sandbox escape that is now required to exploit a JIT issue has nothing to do with JIT or things Rust won't help with. Indeed, the vast majority of sandbox escapes they've found are straight-up basic memory safety issues that Rust would protect against much better than trying to harden C++. Again, there's a real switching cost and 2nd system syndrome to consider, so I'm not saying "switch V8 to Rust" but ignoring that conversation wholesale is disingenuous, especially when it's a bait and switch (i.e. Rust doesn't help with JIT issues, here's this sandbox idea that does, except we wrote the sandbox in C++ & memory safety exploits in the sandbox/runtime are easily found & paired with the JIT exploit).