HN Mirror


by Derek_MK 2606 days ago

A sort of ELI5 on this for those that may not be familiar with any of this:

Ray tracing algorithms require checking to see if a ray will intersect with objects in a scene. This takes a LOT of time, and if you want this to be interactive ray tracing (e.g. in a game, where you need a frame drawn in a very short time), you have to find ways to reduce the number of objects you're comparing the ray against.

A lot of the time this includes culling algorithms, which give you very quick ways to say that a number of objects have no way of being intersected by a ray, so you can ignore them. There are also things like bounding volume hierarchies (BVHs), which say that if a ray doesn't intersect a given massive invisible cube that contains objects, then it can't possibly intersect anything contained in that cube.
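To make the BVH idea concrete, here's a minimal sketch in Python. It assumes axis-aligned boxes and a made-up dict layout for the nodes (`min`, `max`, `objects`, `children` are names I picked for illustration, not from any particular renderer). The box test is the standard "slab" test.

```python
import math

def ray_hits_aabb(origin, direction, box_min, box_max):
    """Slab test: True if the ray origin + t*direction (t >= 0) touches the box."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray runs parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def candidates(node, origin, direction):
    """Return objects whose enclosing boxes the ray hits, skipping whole subtrees on a miss."""
    if not ray_hits_aabb(origin, direction, node["min"], node["max"]):
        return []  # nothing inside this box can be hit
    found = list(node.get("objects", []))
    for child in node.get("children", []):
        found += candidates(child, origin, direction)
    return found

# Hypothetical two-leaf hierarchy.
scene = {
    "min": (0, 0, 0), "max": (10, 10, 10),
    "children": [
        {"min": (0, 0, 0), "max": (4, 4, 4), "objects": ["a"]},
        {"min": (6, 6, 6), "max": (10, 10, 10), "objects": ["b"]},
    ],
}
```

Only the objects returned by `candidates` still need the expensive per-triangle test; everything in a missed box is skipped with a single check.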

This is a bit of a different solution. If I'm understanding it right, it involves simplifying the representation of the objects themselves, so that something that was once a lot of triangles to be checked individually can now be checked as one object. This means that a) you don't have to check for as many intersections, and b) it uses a lot less memory.

2 comments

Jasper_ 2606 days ago

There's also the huge issue of computing the material shading once you've intersected the triangles. Rasterization has the convenient property that all pixels in the draw are coherent with the same material. That isn't true with raytracing, especially with diffuse materials, where the rays scatter everywhere. So your texture lookups are much more random compared to rasterization. This is why Disney's Hyperion added "ray sorting", to bring back some memory coherency to the problem. The current approaches to real-time raytracing, like NVIDIA's RTX, do not have ray sorting.
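As a rough illustration of why sorting helps, here's a simplified stand-in in Python. It is not Hyperion's actual scheme; it just groups hit records by a hypothetical `material` field so each material's textures get touched in one run instead of being revisited at random.

```python
from itertools import groupby
from operator import itemgetter

def shade_in_batches(hits, shade_batch):
    """Sort hit records by material, then shade each material's hits together."""
    ordered = sorted(hits, key=itemgetter("material"))  # stable sort keeps hit order per material
    for material, group in groupby(ordered, key=itemgetter("material")):
        shade_batch(material, list(group))

# Usage: incoherent hits arrive in arbitrary material order.
hits = [
    {"id": 0, "material": "wood"},
    {"id": 1, "material": "metal"},
    {"id": 2, "material": "wood"},
]
batches = []
shade_in_batches(hits, lambda material, group: batches.append((material, [h["id"] for h in group])))
```

After the call, `batches` holds one entry per material rather than one per hit, which is the coherency that rasterization gets for free.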

The benchmarks on scenes I have seen have shown that triangle intersection is only 5-10% of the time spent, with shading and texture bandwidth being the rest. So I'm skeptical that accelerated geometry tests will solve the raytracing problem.


berkut 2606 days ago

Not quite the full picture...

Yes, rays bouncing off diffuse/glossy surfaces (incoherent rays) do mean texture accesses are much more random, but at the same time it means you can approximate them very easily, by loading much lower mipmap levels, or even switching to a "constant" overall colour after a certain number of bounces.
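A toy version of that approximation might look like the Python below. Every number here (two mip levels per bounce, constant colour from the fourth bounce on, ten mip levels) is a made-up assumption for illustration, not something any particular renderer uses.

```python
def texture_lookup_plan(bounce, base_level=0, max_level=10,
                        levels_per_bounce=2, constant_after=4):
    """Pick how to texture a hit at the given bounce depth.

    Returns ("constant", None) once the ray is deep enough that an average
    colour will do, otherwise ("mip", level) with a coarser level per bounce.
    """
    if bounce >= constant_after:
        return ("constant", None)
    level = min(max_level, base_level + bounce * levels_per_bounce)
    return ("mip", level)
```

Camera rays (`bounce=0`) still get the full-resolution level, while deeper bounces read far fewer texels, which takes some of the sting out of the random access pattern.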

Whilst Hyperion does utilise ray sorting due to this, it was also due to Disney's insistence on using PTex for their texture mapping (instead of using UVs), which itself requires highly coherent texturing points to be performant.

Other CG/VFX companies are using pathtracing without ray sorting with Arnold, and many others are using Renderman RIS, which does a limited amount of ray batch sorting. That pretty much tails off after 2 or 3 bounces, leaving you with ray/shading batches of 1, so Renderman's implementation of ray / shading point sorting is much more limited.

Whilst it is true that in some cases the ray / primitive / BVH intersection share of overall render time is as low as 5-10%, it's more normally in the 15-40% range, but this obviously depends on scene complexity and shading complexity. And this is for high-end VFX with highly complex materials. For interactive rendering, where you could bake down a lot of materials to single maps (or even primvar mesh vertex attributes like Manuka does), the shading overhead would be a lot less.
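To put numbers on what those percentages mean, here's a quick Amdahl's-law calculation in Python. It assumes the best case, where intersection becomes free, and that nothing else in the frame changes; the function name is just mine.

```python
def max_overall_speedup(fraction, local_speedup=float("inf")):
    """Amdahl's law: overall speedup when `fraction` of runtime gets `local_speedup` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

for fraction in (0.05, 0.10, 0.15, 0.40):
    print(f"{fraction:.0%} of time in intersection -> at most {max_overall_speedup(fraction):.2f}x overall")
```

So at 5-10% the ceiling is roughly 1.05x to 1.11x, while at 15-40% it is roughly 1.18x to 1.67x. That is a real gain at the upper end, but still bounded by how much of the time goes to shading.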


hughes 2606 days ago

It still sounds a lot like BVHs. Maybe I need to read it more carefully, but I can't tell the difference.


corysama 2606 days ago

AFAICT, they use whatever BVH you want above the individual mesh level. Then, within the mesh, they use a kd-tree with 2 parallel, overlapping half-spaces at each node. The triangles are stored as a bunch of triangle strips in a linear array. Each node of the kd-tree refers to a span of the triangle strips. The spans hierarchically get smaller the further down the tree you go, but technically you can run through a linear array of triangles for any node. They stop subdividing at 4 triangles because SIMD.
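Here's how I picture the storage, as a small Python sketch. The node layout (a `span` of `(start, count)` into the triangle array) is my guess at the idea, not the paper's actual format, and the tree here is hand-built rather than produced by a real kd-tree builder.

```python
def strip_triangles(strip):
    """Expand a triangle strip (a list of vertex indices) into individual triangles."""
    tris = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        # Every other triangle is flipped to keep a consistent winding order.
        tris.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return tris

def triangles_in_span(triangles, span):
    """Return the contiguous run of triangles a node refers to."""
    start, count = span
    return triangles[start:start + count]

strip = list(range(10))             # 10 indices encode 8 triangles
triangles = strip_triangles(strip)

# Hypothetical nodes: each child's span sits inside its parent's span,
# and subdivision stops at 4 triangles.
root = {"span": (0, 8), "children": [{"span": (0, 4)}, {"span": (4, 4)}]}

strip_indices = len(strip)          # n + 2 indices for n triangles
list_indices = 3 * len(triangles)   # 3n indices for an independent triangle list
```

Any node, leaf or not, can be tested by walking its span linearly. The index counts also hint at the memory win: 10 indices as a strip versus 24 as separate triangles for the same 8 triangles.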

Did I miss anything?

Looks like the primary speedup is due to memory footprint reduction, especially because the largest model is big enough in the "standard BVH" that it hits swap during rendering. Not sure what their "standard BVH" representation is that's 6X larger than triangle strips.
