by vilya
4980 days ago
GPUs are SIMD machines: within a group of threads (a warp or wavefront), every active core executes the same instruction at the same time. That means if your code branches, the hardware has to mask out the cores that follow branch B while it executes branch A, then mask out the cores that follow branch A while it executes branch B. In other words, if at least one core follows each side of the branch, the group pays for executing both sides. If all cores in the group branch in the same direction, you don't get that penalty. A large part of optimising for the GPU comes down to arranging your data and code so that this happens as often as possible.
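To make the cost concrete, here is a rough toy model in plain Python (not GPU code) that counts how many branch bodies each group of cores has to step through. The group width of 32 and the names `WARP_SIZE` and `branch_passes` are my own assumptions for illustration; real hardware varies and has extra tricks this ignores.

```python
WARP_SIZE = 32  # assumed group width, for illustration only


def branch_passes(predicates, warp_size=WARP_SIZE):
    """Count branch bodies executed across all groups in a simple masking model.

    predicates[i] is True if work item i takes branch A, False for branch B.
    A group executes a branch body if at least one of its lanes takes it.
    """
    passes = 0
    for start in range(0, len(predicates), warp_size):
        warp = predicates[start:start + warp_size]
        takes_a = any(warp)      # at least one lane follows branch A
        takes_b = not all(warp)  # at least one lane follows branch B
        passes += int(takes_a) + int(takes_b)
    return passes


# 128 work items with an alternating predicate: every group diverges.
interleaved = [i % 2 == 0 for i in range(128)]

# The same items, rearranged so that each group is uniform.
grouped = sorted(interleaved)

print(branch_passes(interleaved))  # 8: four groups, both sides each
print(branch_passes(grouped))      # 4: four groups, one side each
```

Same work items, same branch, but grouping the data so each group agrees halves the number of branch bodies executed in this model. That is the kind of rearrangement the last sentence above is getting at.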