by dochtman 2437 days ago
Not sure I understand the part about raw pointers. As far as I understand, Rust references will surely turn into pointers at the LLVM IR level?
4 comments

They do. The issue may be that references don't support pointer arithmetic – that is, given `x: &u32` you cannot get `x[1]`. The normal approach is to use a reference to a slice, `x: &[u32]`, and the default bounds-checked indexing. Rust does let you do unchecked indexing on slices (with `unsafe`), but it may be more efficient to avoid indices entirely in favor of pointer arithmetic. LLVM can often optimize index calculations into pointer arithmetic, but not always.
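
For anyone following along, here is a minimal sketch of the three options mentioned above: checked indexing, unchecked indexing, and raw pointer arithmetic. The function names are made up for illustration, and whether any of these is actually faster depends on what LLVM manages to prove in your code.

```rust
/// Bounds-checked indexing: each `xs[i]` carries a check unless LLVM proves it redundant.
pub fn sum_indexed(xs: &[u32]) -> u32 {
    let mut total = 0u32;
    for i in 0..xs.len() {
        total = total.wrapping_add(xs[i]);
    }
    total
}

/// Unchecked indexing: no bounds check, so the index must be proven in range by hand.
pub fn sum_unchecked(xs: &[u32]) -> u32 {
    let mut total = 0u32;
    for i in 0..xs.len() {
        // SAFETY: `i` always lies in `0..xs.len()`.
        total = total.wrapping_add(unsafe { *xs.get_unchecked(i) });
    }
    total
}

/// Pointer arithmetic: walk a raw pointer from the first element to one past the last.
pub fn sum_pointer(xs: &[u32]) -> u32 {
    let mut total = 0u32;
    let mut p = xs.as_ptr();
    // SAFETY: computing the one-past-the-end pointer of a slice is allowed.
    let end = unsafe { p.add(xs.len()) };
    while p != end {
        // SAFETY: `p` points at a live element as long as `p != end`.
        unsafe {
            total = total.wrapping_add(*p);
            p = p.add(1);
        }
    }
    total
}
```

All three return the same value for the same input; only the amount of checking differs.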

Edit: Although, on rereading the above post, I see diamondlovesyou did mention indices, so... not really sure what's going on.

True. I should have explained that better. If you're accessing an array, you'll have to do the pointer offsetting yourself (otherwise you're using a slice, with all the bounds checking that entails). Thus, you can't use a reference type, because reference types don't have `<*const T>::add` like pointers do (also, `&T as usize` is invalid; you have to go through a pointer type first). I suppose I assumed you'd be accessing an array/slice; otherwise, references are fine.
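
A small sketch of both points, with hypothetical function names: a reference has no `add`, so you convert to a raw pointer before offsetting, and the cast to `usize` also has to go through a pointer type.

```rust
/// Reads element 1 of a 4-element array via raw-pointer offsetting.
pub fn second_element(xs: &[u32; 4]) -> u32 {
    // A reference has no `add` method; get a raw pointer to the array first.
    let base: *const u32 = xs.as_ptr();
    // SAFETY: index 1 is in bounds for a 4-element array.
    unsafe { *base.add(1) }
}

/// Returns the address a reference points at.
pub fn address_of(x: &u32) -> usize {
    // `x as usize` does not compile; cast to a pointer type, then to `usize`.
    x as *const u32 as usize
}
```

The pointer in `second_element` is taken from the whole array rather than from a reference to a single element, so the offset stays inside the memory the pointer was derived from.
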
Rust references should in general optimize better because they give stronger aliasing guarantees.
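
To illustrate what the aliasing guarantee buys in principle, here is a hedged sketch (names are hypothetical). With `&mut u32` and `&u32` the two arguments cannot overlap, so the compiler is allowed to load `*src` once; with raw pointers it has to assume they might alias. Whether a given compiler version actually exploits this is a separate question.

```rust
/// References: `dst` and `src` are guaranteed not to alias.
pub fn add_twice(dst: &mut u32, src: &u32) {
    *dst = dst.wrapping_add(*src);
    *dst = dst.wrapping_add(*src);
}

/// Raw pointers: `dst` and `src` may point at the same value.
///
/// # Safety
/// Both pointers must be valid for reads, and `dst` must be valid for writes.
pub unsafe fn add_twice_raw(dst: *mut u32, src: *const u32) {
    // SAFETY: the caller guarantees both pointers are valid.
    unsafe {
        *dst = (*dst).wrapping_add(*src);
        // If `src` aliases `dst`, this second read sees the updated value.
        *dst = (*dst).wrapping_add(*src);
    }
}
```

Calling `add_twice_raw(p, p)` on a value of 3 gives 12 rather than 9, which is exactly the case the reference version rules out at compile time.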

Even for slices, using get_unchecked(1..) to get a smaller subslice without bounds checking might be better than pointer arithmetic as long as the slice lengths get optimized away (i.e. they are never used and never passed to non-inlined functions).
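
A minimal sketch of that idea, assuming a kernel that consumes a fixed number of elements (the function name and the count of four are just for illustration). The length is checked once up front and never read again inside the loop.

```rust
/// Sums the first four elements by repeatedly shrinking the slice from the front.
pub fn sum_first_four(xs: &[u32]) -> u32 {
    // One check up front instead of one per access.
    assert!(xs.len() >= 4, "need at least four elements");
    let mut rest = xs;
    let mut total = 0u32;
    for _ in 0..4 {
        // SAFETY: the assert guarantees four elements, so `rest` holds at least
        // one element on every iteration; index 0 and the range `1..` are in bounds.
        unsafe {
            total = total.wrapping_add(*rest.get_unchecked(0));
            rest = rest.get_unchecked(1..);
        }
    }
    total
}
```

For example, `sum_first_four(&[1, 2, 3, 4, 5])` returns 10.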

> Rust references should in general optimize better because they give stronger aliasing guarantees.

AFAIK this does not work at the moment due to a codegen bug in LLVM (which can also affect code using `restrict` in C in some cases). This bug will get fixed one day, but most likely another bug will be revealed at that point… This part of LLVM was never really used in C as much as in Rust, so they keep finding bugs in it. Hopefully it will be fine in the long run, but I'm not holding my breath.

You can just pass a reference/mut ref to the first element of the slice. This is actually how the generated kernels in my Rav1e PR do it, just for the aliasing reason you mentioned.
You can use iterators to avoid unnecessary bounds checking on every element in a slice and you can still get an index to the value. Something like this:

```
for (i, val) in my_slice.iter().enumerate() {
    let x = *val + 9999;
}
```

Not really in this case; Rav1e runs over blocks of image planes (like say a 16x16 block of a specific 720p color channel), so there is no contiguous slice to iterate over.
Maybe you can use chunks_exact() and chunks_exact_mut(). The _exact versions allow lifting the bounds checks out of the loop and gave me some great performance boosts in image processing code.
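
A small sketch of what that can look like, assuming tightly packed 4-byte RGBA pixels (the function and the pixel layout are my own example, not from rav1e). Every chunk has exactly four elements, so the constant indices below are easy for the compiler to prove in bounds.

```rust
/// Adds `amount` to the R, G, and B channels of each pixel, leaving alpha alone.
pub fn brighten_rgba(pixels: &mut [u8], amount: u8) {
    // Trailing bytes that do not form a full pixel are skipped by `chunks_exact_mut`.
    for px in pixels.chunks_exact_mut(4) {
        px[0] = px[0].saturating_add(amount);
        px[1] = px[1].saturating_add(amount);
        px[2] = px[2].saturating_add(amount);
        // px[3] is alpha and stays unchanged.
    }
}
```
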
These blocks are non-contiguous. It operates over a part of a row, where the start of the next row is the start of the current row + some stride.
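
For readers who haven't seen strided access before, a rough sketch of the layout described here. This is not the rav1e code; the function name and parameters are hypothetical. The whole block extent is validated once, and the inner loop then uses pointer offsets without further checks.

```rust
/// Sums a `width` x `height` block whose rows start `stride` elements apart.
pub fn sum_block(plane: &[u8], start: usize, stride: usize, width: usize, height: usize) -> u64 {
    if width == 0 || height == 0 {
        return 0;
    }
    // One past the last element of the last row, computed with overflow checks.
    let end = (height - 1)
        .checked_mul(stride)
        .and_then(|v| v.checked_add(start))
        .and_then(|v| v.checked_add(width))
        .expect("block extent overflows usize");
    assert!(end <= plane.len(), "block does not fit in the plane");

    let mut total = 0u64;
    for y in 0..height {
        for x in 0..width {
            // SAFETY: start + y * stride + x < end <= plane.len().
            total += u64::from(unsafe { *plane.as_ptr().add(start + y * stride + x) });
        }
    }
    total
}
```

On a 4x4 plane holding the values 0 through 15 with a stride of 4, `sum_block(&plane, 5, 4, 2, 2)` reads indices 5, 6, 9, and 10 and returns 30.
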

I mean maybe? But I probably wouldn't anyway for this case, as a slice reference is actually a "fat" pointer (i.e., twice the size of a normal pointer) and the length of the slice won't be used (the block size is known per kernel); LLVM might delete the length part anyway.
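
The "fat" pointer claim is easy to check; a tiny sketch (the function name is hypothetical):

```rust
/// Returns the sizes in bytes of a thin reference and a slice reference.
pub fn reference_sizes() -> (usize, usize) {
    (
        std::mem::size_of::<&u32>(),   // thin: just the address
        std::mem::size_of::<&[u32]>(), // fat: address plus length
    )
}
```

The second value is twice the first, since a slice reference carries a pointer and a `usize` length.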

These are automatically generated kernels, so readability isn't the primary concern here.

Once const-generics are released, I feel like you should be able to create your own "fixed-size slice" type, which uses the constant parameter for "bounds checking". I imagine that would be a lot more optimiser-friendly...
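
Something like the following is what I'd picture, as a sketch only: `FixedSlice` is a made-up type, and it assumes a compiler with const generics available. The length is checked once at construction, and later indexing is checked against the compile-time constant `N`, which the optimiser can often fold away.

```rust
use std::convert::TryFrom;

/// Hypothetical "fixed-size slice": a borrowed array whose length is a const parameter.
pub struct FixedSlice<'a, T, const N: usize>(&'a [T; N]);

impl<'a, T, const N: usize> FixedSlice<'a, T, N> {
    /// Borrows the first `N` elements of `data`; returns `None` if it is too short.
    pub fn new(data: &'a [T]) -> Option<Self> {
        let head = data.get(..N)?;
        // Converting an exactly-`N`-long slice to `&[T; N]` cannot fail here.
        let array: &'a [T; N] = <&[T; N]>::try_from(head).ok()?;
        Some(FixedSlice(array))
    }

    /// Indexes into the array; the bound is the constant `N`, not a runtime length.
    pub fn get(&self, i: usize) -> &T {
        &self.0[i]
    }
}
```

Unlike `&[T]`, the wrapped `&[T; N]` is a thin pointer, so there is no runtime length to carry around.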