by clamchowder 1337 days ago
The problem is that loop overhead matters on AMD, because AMD's compiler doesn't unroll the loop. Nvidia's compiler does, so the overhead doesn't matter there.
1 comment

Could you force the unroll with #pragma unroll?
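
For anyone unfamiliar with the pragma, here is a rough sketch of what that suggestion looks like. It is written as plain host-side C++ so it can be compiled anywhere; the function name `sum8` and the trip count `kN` are made up for illustration and are not from the code being discussed. Whether the AMD compiler in question actually honors the hint is not something this thread establishes.

```cpp
// Hypothetical example: sum a small array with a fixed trip count.
// `#pragma unroll` is an unrolling hint accepted by Clang and NVCC, and by
// many OpenCL C compilers. GCC does not recognize this spelling and ignores
// it (its own form is `#pragma GCC unroll N`).
constexpr int kN = 8;  // assumed compile-time trip count

float sum8(const float* v) {
    float acc = 0.0f;
#pragma unroll
    for (int i = 0; i < kN; ++i) {
        acc += v[i];  // with full unrolling, no loop counter or branch remains
    }
    return acc;
}
```

The hint only has a chance of removing the loop overhead completely when the trip count is known at compile time, as it is here. With a runtime bound the compiler can at best unroll partially.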