Hacker News new | ask | show | jobs
by tomvbussel 3681 days ago
Yup, I meant paging geometry. The problem with just doing a two level BVH is that some models can be very large, while others may be much smaller, which makes it hard to reserve memory for the geometry cache (even if it just fits one model per thread). Which means that you can store fewer rays in memory, and I'm not really a fan of constantly writing rays to disk. I was thinking of just cutting off subtrees of the BVH and storing those on disk, which makes it possible to assign just a few megabytes of memory to each thread for caching geometry. Not sure how efficient that would be though, probably a bit faster than rebuilding part of the BVH every time (especially when using high quality BVHs with spatial splits). Shouldn't be too bad with large ray batches, sorting and ray streams.

They do mention some tests with "out-of-core rendering of scenes with massive amounts of geometry" in the paper, but there isn't much info on it. AFAIK their patents mention streaming in geometry too.

I know that buying more RAM is probably easier, but it would be interesting to render very large scenes on commodity hardware. I guess paging in geometry is just too much of a hassle, you constantly have to stream in geometry to memory; to compute surface derivates, to do subsurface scattering, to sample arbitrary geometry lights, etc.

1 comments

Why would you need per-thread geometry caches? That doesn't really make sense (memory duplication)... What you'd do is just map the geometry + bvh into a serialisable format (like vraymesh) such that they can be fread() in one go pretty quickly. You could put heuristics in to not page out very small meshes...

But unless you batch and re-order rays (and I'm not convinced doing this is worth it if you're doing more than 3/4 bounces, as the incoherence just makes things a nightmare), there's no point really doing this (unless you're happy with very slow performance - even mmapping stuff is slow).

The per-thread caches are not strictly necessary, but you don't want threads to block because they can't stream in geometry to memory. You want some upper bound on the amount of memory each thread needs for geometry (each thread could try to ray intersect a different mesh). Dividing geometry into fixed sized chunks gives you this upper bound (max N chunks per thread).

Ray re-ordering is indeed a nightmare, but sorting very large ray batches (millions of rays) into coherent ray streams is less of a hassle, and it should enable coherent geometry access (which amortizes the cost of those reads, not sure to what extent).

But that doesn't really scale - unless the per-thread stuff is small (and then you'd have to page on a sub-mesh level - i.e. split up the mesh spatially and page it as you traverse through it for internal scattering or something), if you double the number of threads, the memory used for threads by doubling the number of threads will obviously double as well.

For stuff like hair where you need loads of scattering events or for SSS doing that (paging subsections of shapes) is just going to thrash any geometry cache you have unless they're quite big.

And I'm still not convinced by ray sorting: it means your light integration is extremely segmented into artificial progressions as you expand down the ray tree, and makes anything regarding manifold exploration (MLT or MNEE) close to impossible to do well.