by goldenkey
2816 days ago
CUDA now has Cooperative Groups (introduced in CUDA 9) on the Volta and Turing architectures. This lets you synchronize across the entire grid of thread blocks (workgroups, in OpenCL terms) rather than just within a single block. So you can pretty much keep your entire job on the GPU in one launch, even if it has multiple phases that would otherwise be separate kernels with a round trip to the host in between. Really important for complex jobs where performance is a must. https://devblogs.nvidia.com/cooperative-groups
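To make the grid-wide barrier concrete, here is a minimal sketch of two dependent phases fused into one kernel. The kernel name `two_phase`, the helper `run_two_phase`, the buffer names, and the toy computation are all made up for illustration; the Cooperative Groups and runtime calls (`cg::this_grid()`, `grid.sync()`, `cudaLaunchCooperativeKernel`) are the real API. It assumes a device that reports cooperative launch support, and older toolkits may also need `-rdc=true` when compiling.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

namespace cg = cooperative_groups;

// Hypothetical example kernel. Phase 1 squares each element; phase 2 reads
// an element that a *different* block may have written, which is only safe
// once every block in the grid has finished phase 1.
__global__ void two_phase(const float* in, float* tmp, float* out, int n) {
    cg::grid_group grid = cg::this_grid();
    int stride = gridDim.x * blockDim.x;
    int tid = blockIdx.x * blockDim.x + threadIdx.x;

    for (int i = tid; i < n; i += stride) tmp[i] = in[i] * in[i];

    grid.sync();  // grid-wide barrier, replaces ending the kernel and relaunching

    for (int i = tid; i < n; i += stride) out[i] = tmp[i] + tmp[n - 1 - i];
}

// Hypothetical host helper: returns 0 if the GPU result matches the CPU
// result, -1 on any error or if cooperative launch is unsupported.
int run_two_phase(int n) {
    int dev = 0, supported = 0, numSms = 0, blocksPerSm = 0;
    if (cudaGetDevice(&dev) != cudaSuccess) return -1;
    cudaDeviceGetAttribute(&supported, cudaDevAttrCooperativeLaunch, dev);
    if (!supported) return -1;

    std::vector<float> h(n), result(n);
    for (int i = 0; i < n; ++i) h[i] = static_cast<float>(i);

    float *in = nullptr, *tmp = nullptr, *out = nullptr;
    size_t bytes = n * sizeof(float);
    if (cudaMalloc(&in, bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&tmp, bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&out, bytes) != cudaSuccess) return -1;
    cudaMemcpy(in, h.data(), bytes, cudaMemcpyHostToDevice);

    // Every block must be resident on the GPU at once, so size the grid from
    // the occupancy calculator instead of from n.
    const int threads = 256;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSm, two_phase, threads, 0);
    cudaDeviceGetAttribute(&numSms, cudaDevAttrMultiProcessorCount, dev);
    int blocks = blocksPerSm * numSms;
    if (blocks < 1) return -1;

    // grid.sync() is only valid when the kernel is launched cooperatively.
    void* args[] = {&in, &tmp, &out, &n};
    cudaError_t err = cudaLaunchCooperativeKernel(
        (void*)two_phase, dim3(blocks), dim3(threads), args, 0, 0);
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err == cudaSuccess)
        err = cudaMemcpy(result.data(), out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(in); cudaFree(tmp); cudaFree(out);
    if (err != cudaSuccess) return -1;

    // Compare against the same computation done on the host.
    for (int i = 0; i < n; ++i) {
        float expected = h[i] * h[i] + h[n - 1 - i] * h[n - 1 - i];
        if (result[i] != expected) return -1;
    }
    return 0;
}

int main() {
    int rc = run_two_phase(1024);
    std::printf(rc == 0 ? "results match\n"
                        : "error or cooperative launch unsupported\n");
    return rc == 0 ? 0 : 1;
}
```

Build with something like `nvcc -arch=sm_70 two_phase.cu`. The catch is that the grid can't be bigger than what fits on the device at once, hence the grid-stride loops instead of one thread per element.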