by dragontamer 501 days ago
This is a GPU focused article.

The GPU will almost always execute both f and g, because a GPU runs a whole warp in lockstep instead of branching per thread the way a CPU does.

You can avoid executing both f and g if you can ensure the condition is a scalar Boolean / if statement that is uniform across the warp. So it's not 'always', but it takes incredibly specific coding patterns to 'force' the optimizer + GPU compiler into emitting a real branch.
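
To make the uniform vs. divergent distinction concrete, here is a toy sketch in plain Python, not real GPU code. It models a warp as a list of lanes with an execution mask. The names `run_branch`, `f`, `g` and the 4-lane warp are all made up for illustration, and a real compiler may instead turn the branch into a select that evaluates both sides for every lane.

```python
# Toy model of SIMT branching. Not a real GPU API; all names are hypothetical.
WARP_SIZE = 4  # illustrative only; real warps are wider (32 lanes on NVIDIA GPUs)

def run_branch(conds, f, g):
    """Evaluate `cond ? f(lane) : g(lane)` for one warp.

    Returns (results, passes), where `passes` lists which branch bodies
    the warp as a whole had to step through.
    """
    results = [None] * len(conds)
    passes = []
    # Pass 1: lanes whose condition is true are active, the rest are masked off.
    if any(conds):
        passes.append("f")
        for lane, c in enumerate(conds):
            if c:
                results[lane] = f(lane)
    # Pass 2: the mask is inverted and the other side runs.
    if not all(conds):
        passes.append("g")
        for lane, c in enumerate(conds):
            if not c:
                results[lane] = g(lane)
    return results, passes

f = lambda lane: lane * 10
g = lambda lane: -lane

# Divergent condition: the warp steps through both f and g.
divergent = run_branch([True, False, True, False], f, g)
# Uniform condition: the g pass is skipped entirely.
uniform = run_branch([True] * WARP_SIZE, f, g)
```

In this model the warp only skips a side when no lane needs it, which is the "consistent across the warp" case.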

2 comments

It depends. If the code flow is uniform for the warp, only one side of the branch needs to be evaluated. But you could still end up with pessimistic register allocation because the compiler can’t know it is uniform. It’s sometimes weirdly hard to reason about how exactly code will end up executing on the GPU.
f or g may have side effects too, like writing to memory. Now the conditional has a different meaning.

You could also have some fun stuff where f and g return a boolean, because thanks to short-circuit evaluation, && and || are actually also conditionals in disguise.
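
A quick sketch of the short-circuit point, using Python's `and` / `or` as stand-ins for `&&` / `||` (they short-circuit the same way). The functions `f` and `g` and the `calls` log are made up for illustration.

```python
calls = []  # records which functions actually ran

def f():
    calls.append("f")
    return False

def g():
    calls.append("g")
    return True

# `f() and g()` is a hidden branch: g runs only if f returned a true value.
short_circuit = f() and g()
calls_after_and = list(calls)

calls.clear()
# `f() or g()` is the mirror image: g runs only because f returned False.
either = f() or g()
calls_after_or = list(calls)
```

With `and`, g is never called here, so any side effect it has is conditional on f, exactly as if it sat inside an if statement.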

Side effects will be masked; the GPU is still executing exactly the same code for the entire workgroup.
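
A toy illustration of masking, again plain Python and not a real GPU API. `masked_store` is a hypothetical helper: every lane steps through the store instruction, but only lanes that are active in the mask commit their write.

```python
def masked_store(memory, mask, values):
    """Every lane 'executes' the store, but only active lanes commit it."""
    for lane, active in enumerate(mask):
        if active:
            memory[lane] = values[lane]

memory = [0, 0, 0, 0]
conds = [True, False, False, True]

# The f side writes 1; only lanes where the condition holds are active.
masked_store(memory, conds, [1] * 4)
# The g side writes 2 under the inverted mask.
masked_store(memory, [not c for c in conds], [2] * 4)
```

Both stores run for the whole warp, yet each lane ends up with only the write from its own side of the branch.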