by rrss 2237 days ago
This was really fun to read, thanks fsanglard.

> This is correlated with the warning nvcc issued. Because the raytracer uses recursion, it uses a lot of stacks. So much actually that the SM cannot keep more than a few alive.

Stack frame size / "local memory" size doesn't directly limit occupancy. There's a list of the limiters here: https://docs.nvidia.com/gameworks/content/developertools/des.... I'm not sure why the achieved occupancy went up after removing the recursion, but my guess is that the compiler was able to reduce register usage.
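
To make the register guess concrete, here's a rough back-of-the-envelope model of theoretical occupancy. It's a simplified sketch, not NVIDIA's actual calculator: the per-SM limits in `SMLimits` are illustrative assumptions (real values depend on the architecture), the names `SMLimits` and `occupancy` are made up for this example, and it ignores register and shared memory allocation granularity. Note that stack frame / local memory size isn't an input at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SMLimits:
    # Assumed, illustrative per-SM limits; check your GPU's real values.
    max_warps: int = 64
    max_blocks: int = 32
    registers: int = 65536
    shared_mem_bytes: int = 65536
    warp_size: int = 32


def occupancy(threads_per_block, regs_per_thread, smem_per_block, sm=SMLimits()):
    """Return (theoretical occupancy, name of the limiting resource).

    Simplified: ignores allocation granularity, assumes regs_per_thread >= 1.
    """
    # Round the block size up to whole warps.
    warps_per_block = -(-threads_per_block // sm.warp_size)
    regs_per_block = regs_per_thread * warps_per_block * sm.warp_size

    # How many blocks fit on one SM under each limiter.
    blocks_by = {
        "warps": sm.max_warps // warps_per_block,
        "blocks": sm.max_blocks,
        "registers": sm.registers // regs_per_block,
        "shared_memory": (sm.shared_mem_bytes // smem_per_block
                          if smem_per_block else sm.max_blocks),
    }

    # The tightest limiter decides how many blocks are resident.
    limiter = min(blocks_by, key=blocks_by.get)
    active_warps = blocks_by[limiter] * warps_per_block
    return active_warps / sm.max_warps, limiter


if __name__ == "__main__":
    print(occupancy(256, 32, 0))  # 32 registers per thread
    print(occupancy(256, 64, 0))  # 64 registers per thread
```

Under these assumed limits, going from 64 to 32 registers per thread with 256-thread blocks doubles the number of resident warps, which is the kind of jump a drop in register usage could explain.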