by spenczar5
862 days ago
We are so, so, so far away from compilers that could automatically help you, say, rewrite an operation to achieve high warp occupancy. These are not trivial performance optimizations - sometimes the algorithm itself fundamentally changes when you target the CUDA runtime, because of complexities in the scheduler and memory subsystems. I think there is no way that you will see compilers that advanced within 3 years, sadly.