Hacker News
by adrian17 2 hours ago
> Also, as you're using full double/f64-precision all the time, you're leaving a fair bit of performance on the table

There's another issue that popped up on my quick naive profiling run: std::shared_ptr<Material> in the HitRecord/HittableLightSample is assigned/copied and destroyed a lot, and somehow these refcount operations show up as half of all samples on my profile (presumably because even if there's no hit and the pointer stays nullptr, the smart pointer still must check if there's anything to deallocate).

Yeah, passing std::shared_ptr by value in a multi-threaded setup can have a lot of overhead: they get copied and destroyed a lot, and each atomic ref count modification effectively causes a write back to cache and can cause contention between threads.

Should pass them by const refs really to avoid this.
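To make the difference concrete, here is a minimal sketch. The `Material` struct and both function names are hypothetical stand-ins, not taken from the ray tracer being discussed. It only shows that a by-value parameter bumps the reference count while a const reference leaves it untouched.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the ray tracer's Material type.
struct Material { float albedo = 0.5f; };

// By value: the parameter is a copy, so the refcount is incremented on entry
// and decremented again when the function returns.
long use_count_by_value(std::shared_ptr<Material> m) { return m.use_count(); }

// By const reference: no copy is made, so the refcount is not touched.
long use_count_by_const_ref(const std::shared_ptr<Material>& m) { return m.use_count(); }
```

With a single owner, the by-value call observes a count of 2 and the const-ref call observes 1. In a hot loop that pair of increments and decrements per call is the traffic that shows up in the profile.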

Or for a better alternative, just use plain old indices rather than shared pointers.

The scene is only going to be loaded / unloaded all at once, you can just load the data into contiguous arrays and index from them. No need to use shared_ptr since lifetimes aren't that complex.
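A rough sketch of what the index approach could look like, assuming the scene owns all materials in one vector. `Scene`, `add_material`, `kNoMaterial`, and the `HitRecord` layout here are made up for illustration and are not the original project's types.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for the ray tracer's Material type.
struct Material { float albedo; };

// Sentinel meaning "no hit / no material", replacing a null shared_ptr.
constexpr std::uint32_t kNoMaterial = std::numeric_limits<std::uint32_t>::max();

// The scene owns every material in one contiguous array for its whole lifetime.
struct Scene {
    std::vector<Material> materials;

    // Stores a material and returns the index that refers to it.
    std::uint32_t add_material(Material m) {
        materials.push_back(m);
        return static_cast<std::uint32_t>(materials.size() - 1);
    }

    const Material& material(std::uint32_t id) const { return materials[id]; }
};

// The hit record carries a plain integer: copying or destroying it involves
// no reference counting at all.
struct HitRecord {
    float t = 0.0f;
    std::uint32_t material_id = kNoMaterial;
};
```

The indices stay valid even if the vector reallocates while the scene is loading, which is one reason to prefer them over pointers into the array.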

Or just raw pointers, indeed.

std::shared_ptrs can also (because they're implicitly for sharing) alias, so the compiler has to assume the worst and emit loads in other cases, and there's no way (unless a newer C++ version has introduced it and I haven't noticed?) to use '__restrict__' with shared ptrs.
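For comparison, this is roughly what the raw-pointer version with `__restrict__` looks like. Note that `__restrict__` is a GCC/Clang extension rather than standard C++ (MSVC spells it `__restrict`), and `scale_add` is just an illustrative function, not something from the project.

```cpp
#include <cassert>
#include <cstddef>

// __restrict__ (compiler extension) promises that dst and src do not overlap,
// so the compiler does not have to reload src[i] after each store to dst[i].
void scale_add(float* __restrict__ dst, const float* __restrict__ src,
               float k, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] += k * src[i];
    }
}
```

The promise is on the caller: passing overlapping buffers to a function declared like this is undefined behaviour.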