Hacker News new | ask | show | jobs
by cmeacham98 811 days ago
I think you're missing the point - abusing a vector in this way is not a safe interface and is instead a good way to introduce several memory safety bugs (for example, UAFs) that the normal rust memory model explicitly prevents.
3 comments

> abusing a vector in this way is not a safe interface and is instead a good way to introduce several memory safety bugs (for example, UAFs)

No, as long as you don't use the unsafe keyword, an out-of-bounds vector access won't lead to use-after-free or other memory safety bugs.

Out-of-bounds access is not required for the pseudo-UAF we're talking about here. Deleting a node in the middle of a linked list will leave a "hole" in the backing Vec. You cannot shift the next elements down to fill the hole because that will invalidate all handles to them. If the backing Vec holds the nodes directly, as TFA's implementation does, then there is no way to mark the hole as a hole. So any bug where a different node's handle accidentally ends up accessing this hole instead will lead to that code observing a "freed" node.

One workaround is to make the backing Vec hold Option of Node instead so that deleting a node can set the hole to None, in which case the bug I described above has the opportunity to unwrap() and panic instead of silent UAF. Though you'll also need additional tracking for those holes so that you can fill them with new nodes later, at which point you'd be better off using a proper freelist / slab instead of a Vec anyway (as TFA also mentions).

Well, we're not talking about "pseudo-UAF", we're talking about actual-UAF and actual-memory-safety.

You use scare quotes around "freed" for a reason: the data has not actually been freed.

The bug you're talking about is a logic error. It could be a bad bug, depending on circumstances, but there's no memory safety issue here.

>You use scare quotes around "freed" for a reason: the data has not actually been freed.

Who said it hasn't? I would assume such a node to have been given to `std::ptr::drop_in_place`. Not doing that would be a leak until the list as a whole was dropped.

I don't mean a literal UAF, but a "use array index after free" because you're using indexes (which only have bounds checking) as heap pointers.

Rust's borrow checker doesn't account for when you re-implement parts of memory management as array indexes.

Safe rust is a turing complete language. That implies that it is possible to build an emulator for any other programming language in safe rust. That emulated language and programs running on it can have memory-safety issues, even if the emulator itself cannot be corrupted.

When writing such an emulator the natural way to set up the memory is to use an array, and the natural way to implement pointers is indices into that array. In other words, this pattern is part-ways there to creating an emulator with internal safety issues.

However guaranteeing that any corruption will be contained to the array is certainly a lot better than nothing.

Rust's safety mechanisms don't exist just to prevent bugs from escalating into security issues, they exist to prevent the whole class of bugs related to reference handling from being present in the first place. That's supposed to mean programs that work more consistently. Handwaving techniques that lead to programs panicking as "Well at least it doesn't become a security concern" is missing the forest for the trees.
This is generally true, but afaik there are still some (fairly complex) ways to write memory unsafe code in safe rust https://faultlore.com/blah/everyone-poops/
How is reusing a freed index not a UAF? If I roll my own allocator I can still get UAFs even though the memory accessed is not yet free'd.
Because that's not what "UAF" means. Also not what "freed" means.

To have a UAF, there has to be memory that is actually freed, and you have to attempt to access that memory. No memory is freed here (in the OP's implementation). Even if it was, at worst you'd get a panic for trying to access past the end of the Vec.

None of that is a UAF or a memory safety issue. It's just a logic bug.

Nope but "handle confusion" safety perhpas?
No, that's incorrect. While yes, it's true that you can point two different nodes at the same array index, or mismanage your array indices in a variety of ways, that is a logic error. It will indeed make your program behave incorrectly (and depending on what it's doing, that may have security implications), but there is no memory safety issue. No one is using anything after freeing it; if you try to access an index that is past the end of the Vec, it will panic. Panicking, while undesirable, is memory-safe.

You may think that's a difference without distinction, but "memory safety" and "use after free" have specific definitions, and this ain't them.

You later clarified that by "memory safety bugs" you don't actually mean "memory safety bugs," but rather "use array index after free." But that isn't a memory safety bug. (It might be a denial of service bug or a logic bug, but because of bounds checks, it isn't a memory safety bug.) So no, I'm afraid I haven't missed the point at all.

Could you please read the link I shared? There's all sorts of nuance in the README. And there is absolutely no pretending in my comment or in the link I shared that using indices instead of pointers has zero downsides.

It is a memory safety bug - by using this "indexes as pointers" methodology it is possible to write code where two different owners simultaneously believe they are the sole owner of an object.

Writing that using normal pointers is impossible in safe rust (barring a compiler bug, which do exist but are rare).

You don't get two owners that way. You get something slightly less powerful than owners, which still has all its preconditions satisfied: you can check that it is in bounds, and if so you will find an element of the expected type. You cannot corrupt the underlying allocator's data structures, you cannot resize the array when there are outstanding pointers to its elements, and you cannot violate the type system.

Rust's goal was never to exclude all shared mutability. (Otherwise why support things like locks?) Rather, it excludes only the kinds of shared mutability that open the door to undefined behavior. The point of all these sorts of "workarounds" is that, because there is no longer a single unrestricted kind of shared mutability, you now get to pick which version of restricted mutability you want: reference counting vs bounds checking vs ghostcell's higher-rank lifetimes vs whatever else.

> two different owners simultaneously believe they are the sole owner of an object.

Not in the sense of ownership that matters in Rust. The Vec owns the data. A node with an index in it does not own that data. It merely refers to it.

The key here is when you answer the question, "what happens when I screw it up the indices?" And the answer is logic errors or a panic. Neither of those are memory safety issues.

Can you share a program where this results in UB without using `unsafe`?

Please also consider what I was responding to:

> you’re just laundering the unsafe pointer arithmetic behind array indexing

The definition of memory safety is not "code that does not result in UB".
So just to be clear here, the progression is:

"memory safety bugs" -> "for example, UAFs" -> "I don't mean a literal UAF" -> "use array index after free" -> 'memory safety is not "code that does not result in UB"'

I mean, you can define "memory safety" to be whatever you want it to be, but the definition everyone else uses (including Rust) is absolutely connected with undefined behavior. More than that, the entire context of this thread assumes that definition. Rust certainly does. And if you are going to use a different definition than everyone else, at least have the courtesy to provide it.

If people used your definition, then it would be wrong to, for example, say that "Java is a memory safe programming language." But that is, as far as I know, widely regarded to be a true statement.

This sort of disagreement is profoundly irritating, because I made it exceptionally clear what I meant from the get-go. All you had to do was respond and say, "oh, it sounds like we are just using different definitions of the term 'memory safety.' if we use your definition, I agree with what you said."

It is the definition of memory safety that Rust uses.

It would be easier to discuss whatever non-UB failure modes you have in mind, in the context of Rust, if you used a different term.

Maybe not, but it's also not whatever unconventional definition you've come up with here.