Hacker News new | ask | show | jobs
by Arnavion 391 days ago
>I’ve also dealt with pain implementing similar interfaces in Rust, and it really feels like you end up jumping through a ton of hoops (and in some of my cases, hurting performance) all to satisfy the abstract machine, at no benefit to programmer or application.

I've implemented what TFA calls the "double cursor" design for buffers at $dayjob, ie an underlying (ref-counted) [MaybeUninit<u8>] with two indices to track the filled, initialized and unfilled regions, plus API to split the buffer into two non-overlapping handles, etc. It certainly required wrangling with UnsafeCell in non-trivial ways to make miri happy, but it doesn't have any less performance than the equivalent C code that just dealt with uint8_t* would've had.

1 comments

What is the reason for explicitly tracking the initialized-but-unfilled portion? AIUI there's no harm in treating initialized bytes as uninitialized since you're working with u8, so what do you actually gain by knowing the initialized portion? I mean, at an API level you gain the ability to return a &mut [u8] for the initialized portion, but presumably anyone actually trying to write to this buffer either wants to copy in from a &[u8] or write to a &mut [MaybeUninit<u8>].
It's for the sake of providing a &mut [u8] to the initialized region for callers that cannot work with &mut [MaybeUninit<u8>] for whatever reason, eg if they wanted to use this with a std::io::Read impl. In particlar, Read impls are discouraged but not prohibited from reading the content of the [u8] given to them, so calling them with a [MaybeUninit<u8>] transmuted into a [u8] is not generally safe.
Ah right, std::io::Read, I didn't think about that. And so it needs to explicitly track the initialized portion because otherwise calling .ensure_init() every time you want to do a read becomes expensive, that makes sense. And presumably this is also why the Buffer trait from this article cannot ever replace BorrowedBuf, because Buffer is not compatible with the Read trait without explicitly initializing the buffer every time (though I suppose it could still be added on top of BorrowedBuf if you add a new method init_len() that returns the length of the already-known-to-be-initialized prefix of parts_mut()).

EDIT: adding init_len() isn't good enough, we'd need an ensure_init() method too that returns a &mut [u8], and with those we could impl Buffer for BorrowedCursor, but if Read::read_buf() took this modified Buffer trait the default impl would be very inefficient when using a &mut [MaybeUninit<u8>] and that becomes a performance footgun.