by dragontamer
541 days ago
Therein lies SIMD's advantage. The instruction pointer is synchronized across all lanes, which gives you fewer states to reason about. GPUs then mess that up by letting blocks / thread groups run independently, but GPUs now have highly efficient barrier instructions that line everyone back up. It turns out that SIMD's innate assurance of instruction synchronization at the lane level is why warp-based / wavefront-based coding is so efficient: within a single warp, none of those barriers are necessary.
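
To make the contrast concrete, here is a rough CUDA sketch rather than anything definitive. The names `block_sum`, `warp_sum`, and `run_demo` are made up for illustration, and it assumes a 256-thread block and a 32-lane warp. The block-level version needs a `__syncthreads()` barrier after every step because the warps in a block run independently; the warp-level version just passes registers between lanes with `__shfl_down_sync`. One caveat: newer NVIDIA hardware does not strictly guarantee lockstep inside a warp, which is why the shuffle takes a lane mask, but that is still far cheaper than a block-wide barrier.

```cuda
#include <cuda_runtime.h>

// Block-level sum. Warps inside a block are scheduled independently, so each
// step that reads another thread's shared-memory write is fenced by a barrier.
__global__ void block_sum(const float* in, float* out) {
    __shared__ float buf[256];              // assumption: blockDim.x == 256
    unsigned t = threadIdx.x;
    buf[t] = in[t];
    __syncthreads();                        // all loads visible before reducing
    for (unsigned s = blockDim.x / 2; s > 0; s >>= 1) {
        if (t < s) buf[t] += buf[t + s];
        __syncthreads();                    // barrier after every halving step
    }
    if (t == 0) *out = buf[0];
}

// Warp-level sum. The 32 lanes exchange register values directly; no shared
// memory and no block-wide barrier. Assumption: launched with exactly 32 threads.
__global__ void warp_sum(const float* in, float* out) {
    float v = in[threadIdx.x];
    for (int offset = 16; offset > 0; offset >>= 1)
        v += __shfl_down_sync(0xffffffffu, v, offset);  // full-warp lane mask
    if (threadIdx.x == 0) *out = v;
}

// Hypothetical host helper: sums 256 ones with block_sum and the first 32 of
// them with warp_sum. Returns false if any CUDA call fails.
bool run_demo(float& block_total, float& warp_total) {
    const int n = 256;
    float* in = nullptr;
    float* out = nullptr;
    if (cudaMallocManaged(&in, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMallocManaged(&out, 2 * sizeof(float)) != cudaSuccess) {
        cudaFree(in);
        return false;
    }
    for (int i = 0; i < n; ++i) in[i] = 1.0f;

    block_sum<<<1, n>>>(in, &out[0]);
    warp_sum<<<1, 32>>>(in, &out[1]);
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);

    if (ok) {
        block_total = out[0];
        warp_total = out[1];
    }
    cudaFree(in);
    cudaFree(out);
    return ok;
}
```

Calling `run_demo` from a `main` and compiling with `nvcc` should be enough to try it. The thing to notice is that `warp_sum` has no `__syncthreads()` anywhere, while `block_sum` pays for a barrier on every iteration of the loop.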