by scottlamb 458 days ago
> Rust doesn't help here; you necessarily must do all stores in potentially-mirrored memory as volatile (and possibly loads too), else you can have arbitrary spooky-action-at-a-distance issues, as, regardless of &[T] vs &mut [T] or whatever language-level aliasing features, if the compiler can see that two addresses are different (which they "definitely" are if the compiler, for one reason or another, knows that they're exactly 4096 bytes apart) it can arbitrarily reorder them, messing your ring buffer up.

Hmm, as I think about it, I see your point: if LLVM inlines enough, its optimizer could "know" that memory hasn't changed when it really has, even if that memory is never put into the same &mut [T] as the other side of the mirror (and two improperly aliased &mut [T] are never constructed).

But as an alternative to doing all the stores in a special way (and the loads too; I don't see how a volatile store to one side of the mirror is even sufficient to tell the compiler that the other side has changed), it'd be far more practical if the caller could use a (not mirrored) &mut [T]. Couldn't you have a std::ops::IndexMut wrapper that returns a guard that has a DerefMut into &mut [T] and on Drop creates a barrier for these kinds of optimizations via `std::arch::asm!("")`? [1] Then LLVM has to assume all memory changed at that barrier.
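
A minimal sketch of that guard idea, under some loud assumptions: `Ring`, `SliceGuard`, `slice_mut`, and `compiler_barrier` are hypothetical names, and a plain `Vec<u8>` stands in for the real mirrored mapping so the example compiles without any mmap setup. Because `IndexMut::index_mut` has to return a plain `&mut Self::Output`, the sketch uses an ordinary method to hand out the guard. Whether an empty `asm!` is actually sufficient for a real mirrored mapping is exactly the open question here; this only shows the shape.

```rust
use std::arch::asm;
use std::ops::{Deref, DerefMut};

/// Hypothetical ring type. A plain Vec stands in for the mirrored mapping.
pub struct Ring {
    buf: Vec<u8>,
}

/// Hypothetical guard: derefs to a non-mirrored slice, issues a barrier on drop.
pub struct SliceGuard<'a> {
    slice: &'a mut [u8],
}

impl Ring {
    pub fn new(len: usize) -> Self {
        Ring { buf: vec![0; len] }
    }

    /// Hands out a guard over `start..start + len` (no wraparound handling here).
    pub fn slice_mut(&mut self, start: usize, len: usize) -> SliceGuard<'_> {
        SliceGuard { slice: &mut self.buf[start..start + len] }
    }

    pub fn get(&self, i: usize) -> u8 {
        self.buf[i]
    }
}

impl Deref for SliceGuard<'_> {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        &*self.slice
    }
}

impl DerefMut for SliceGuard<'_> {
    fn deref_mut(&mut self) -> &mut [u8] {
        &mut *self.slice
    }
}

impl Drop for SliceGuard<'_> {
    fn drop(&mut self) {
        compiler_barrier();
    }
}

/// An empty asm! block without the `nomem` option is treated by the compiler
/// as possibly reading and writing memory, so it is intended to keep memory
/// accesses from being reordered or cached across it. It emits no instructions.
#[inline(always)]
fn compiler_barrier() {
    unsafe { asm!("", options(nostack, preserves_flags)) }
}
```

The caller writes through the guard like any other slice (`ring.slice_mut(2, 3).copy_from_slice(&[1, 2, 3])`), and the barrier runs when the guard goes out of scope.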

Regarding the more specific crate issues: I found these crates a while ago and hadn't looked extensively into their implementations. Thanks for pointing these out; I will have to look more closely if/when I ever decide to actually use this approach. I was leaning toward no anyway because of the other factors I mentioned. As an alternative, I was thinking of having a ring buffer plus a little extra bit at the end that is explicitly copied from the start as needed. The maximum length of one message I need a contiguous view of is far less than the total buffer size, so only a fraction of the buffer would need to be copied.
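
The copy-the-tail alternative could look roughly like the following. This is a sketch under assumptions, not a tested design: `TailRing`, `write`, and `contiguous` are hypothetical names, the buffer holds bytes, and the extra region is sized to the maximum message length.

```rust
/// Hypothetical ring with `max_msg` spare bytes after the `cap` ring bytes.
pub struct TailRing {
    buf: Vec<u8>, // cap + max_msg bytes
    cap: usize,
    max_msg: usize,
}

impl TailRing {
    pub fn new(cap: usize, max_msg: usize) -> Self {
        assert!(max_msg <= cap);
        TailRing { buf: vec![0; cap + max_msg], cap, max_msg }
    }

    /// Writes `data` starting at ring position `pos`, wrapping modulo `cap`.
    pub fn write(&mut self, pos: usize, data: &[u8]) {
        for (i, &b) in data.iter().enumerate() {
            self.buf[(pos + i) % self.cap] = b;
        }
    }

    /// Returns a contiguous view of `len` bytes starting at `pos`.
    /// If the message wraps, the wrapped part is first copied from the
    /// start of the buffer into the spare region past `cap`.
    pub fn contiguous(&mut self, pos: usize, len: usize) -> &[u8] {
        assert!(pos < self.cap && len <= self.max_msg);
        let end = pos + len;
        if end > self.cap {
            let spill = end - self.cap;
            self.buf.copy_within(0..spill, self.cap);
        }
        &self.buf[pos..end]
    }
}
```

Only the `spill` bytes that wrap are copied, and never more than `max_msg`, which is the point of keeping the extra region small relative to the buffer.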

> vmcircbuf just exposes the mutable mirrored reference, resulting in [1] in release builds.

Yuck, noted, clearly wrong to give the whole thing as a `&mut [T]`.

> slice_deque has many open issues about unsoundness.

I see at least a couple of those, which seem to be "just" the usual unsafe-done-wrong sorts of things (double frees) rather than anything inherent to the mirrored buffer.

[1] https://stackoverflow.com/questions/72823056/how-to-build-a-...

1 comments

Yeah, an asm marked as memory-clobbering is the proper thing; not the first time I've forgotten that volatile doesn't imply anything at all about other memory. (In fact, with "((volatile uint8_t*)x)[0] = 0xaa;" added to my godbolt link in a sibling thread, the optimization still happens.) Don't know how exactly it interacts with aliasing rules; maybe you'd have to explicitly pass the mutable reference to the asm as an input, since otherwise it'd be illegal for the asm to change it and so the compiler could still assume it hasn't changed? Or, I guess, not having any references live during the asm call is the proper thing.

Probably indeed possible to do it with proper guards (the pre-pooping your pants issue is probably not a problem if you also have the asm guard in drop?).

> I see at least couple of those, which seem to be "just" the usual unsafe-done-wrong sorts of things (double frees) rather than anything inherent to the mirrored buffer.

Yeah, possible. I was just bringing it up as evidence that ring buffers which don't take extreme care end up incorrectly implemented.

> Don't know how exactly it interacts with aliasing rules; maybe you'd have to explicitly pass the mutable reference to the asm as an input, otherwise it'd be illegal for the asm to change it and so the compiler can still assume it isn't? or, I guess, not have any references live during the asm call is the proper thing.

I don't know either, but really it's the opposite half of the buffer you want to tell it may have changed, so I imagine it doesn't matter even if you still have the `&mut [T]` live.

Maybe the extra guard I described isn't necessary either; the DerefMut could directly return `&mut [T]` but set a `barrier_before_next_access` on the ring, or you could just always have the barrier, whatever performs best I guess.