by g0b
1493 days ago
I'm in fact talking about post-Volta hardware there, but this is not about forward progress. I meant that using __ballot_sync() and getting it wrong (i.e. taking the __activemask() outside an if, but then syncing on that mask in only one branch of the if, so some of the threads named in the mask never participate in the sync) will deadlock the GPU. It's a powerful abstraction (since statically _different_ locations can sync with each other), but also a risky one to expose, compared to GLSL, where it's impossible to deadlock anything by using subgroup intrinsics.
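To make the failure mode concrete, here is a minimal sketch. The kernel name `safe_ballot`, the even/odd branch condition, and the single-warp launch are all made up for illustration. The hazardous shape is left as a comment rather than live code, since the CUDA docs only promise undefined behavior when a lane named in the mask never reaches the call, and in practice that can hang. The live kernel shows one way to avoid it: build the mask from a ballot that every lane executes, so it names exactly the lanes that will arrive.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hazardous pattern (intentionally commented out): the mask is captured
// outside the if, so it names every converged lane, but only the lanes
// that take the branch ever reach the __ballot_sync().
//
//   unsigned mask = __activemask();
//   if (threadIdx.x % 2 == 0) {
//       unsigned votes = __ballot_sync(mask, true);  // odd lanes never arrive
//   }

// Hypothetical kernel, assumed to be launched with exactly one full warp.
__global__ void safe_ballot(unsigned *out) {
    bool take_branch = (threadIdx.x % 2 == 0);

    // Every lane of the warp executes this ballot, so the full mask is accurate.
    unsigned branch_mask = __ballot_sync(0xffffffffu, take_branch);

    if (take_branch) {
        // Only the lanes named in branch_mask reach this call.
        unsigned votes = __ballot_sync(branch_mask, threadIdx.x < 16);
        if (threadIdx.x == 0) {
            *out = votes;
        }
    }
}

int main() {
    unsigned *d_out = nullptr;
    unsigned h_out = 0;

    cudaMalloc(&d_out, sizeof(unsigned));
    safe_ballot<<<1, 32>>>(d_out);  // one block of 32 threads = one warp
    cudaMemcpy(&h_out, d_out, sizeof(unsigned), cudaMemcpyDeviceToHost);
    printf("votes = 0x%08x\n", h_out);
    cudaFree(d_out);
    return 0;
}
```

The point is that nothing in the language stops you from writing the commented-out version; the mask is just an integer, and keeping it in agreement with the control flow is entirely on the programmer.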