


	by goldenkey 2816 days ago
	CUDA has Cooperative Groups now on Volta and Turing architectures. This allows synchronization across the entire grid of thread blocks (workgroups, in OpenCL terms) rather than just within a single block. So you can pretty much keep your entire job on the GPU inside one launch, even if it has phases that would otherwise need separate kernels. Really important for complex jobs where performance is a must. https://devblogs.nvidia.com/cooperative-groups
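	To make that concrete, here is a rough sketch of two dependent phases fused into one cooperative launch, with a grid-wide barrier where the kernel boundary would otherwise be. The names `two_phase` and `run_two_phase` are made up for illustration, the doubling and neighbor-sum work is a placeholder, and error handling is kept minimal. The grid is sized from the occupancy query because a cooperative launch needs every block resident at the same time.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>
#include <vector>

namespace cg = cooperative_groups;

// Hypothetical kernel: phase 1 doubles every element, phase 2 reads a
// neighbor that may have been written by a different thread block.
__global__ void two_phase(float* data, float* out, int n) {
    cg::grid_group grid = cg::this_grid();
    int stride = gridDim.x * blockDim.x;
    int tid = blockIdx.x * blockDim.x + threadIdx.x;

    // Phase 1 (would normally be its own kernel).
    for (int i = tid; i < n; i += stride) data[i] *= 2.0f;

    // Grid-wide barrier: all blocks finish phase 1 before any starts phase 2.
    grid.sync();

    // Phase 2 (would normally be a second kernel launch).
    for (int i = tid; i < n; i += stride) out[i] = data[i] + data[(i + 1) % n];
}

// Hypothetical host helper. Returns false if the device cannot do a
// cooperative launch or if any CUDA call fails.
bool run_two_phase(const std::vector<float>& in, std::vector<float>& result) {
    int n = static_cast<int>(in.size());
    if (n == 0) return false;

    int dev = 0, supported = 0;
    if (cudaDeviceGetAttribute(&supported, cudaDevAttrCooperativeLaunch, dev) != cudaSuccess ||
        !supported) {
        return false;
    }

    // Size the grid so that every block can be co-resident on the device.
    const int threads = 256;
    int blocksPerSm = 0, smCount = 0;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSm, two_phase, threads, 0);
    cudaDeviceGetAttribute(&smCount, cudaDevAttrMultiProcessorCount, dev);
    int blocks = blocksPerSm * smCount;
    if (blocks <= 0) return false;

    float* d_data = nullptr;
    float* d_out = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    if (cudaMalloc(&d_data, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_data); return false; }
    cudaMemcpy(d_data, in.data(), bytes, cudaMemcpyHostToDevice);

    // grid.sync() is only valid when the kernel is launched cooperatively.
    void* args[] = { &d_data, &d_out, &n };
    cudaError_t err = cudaLaunchCooperativeKernel(
        (void*)two_phase, dim3(blocks), dim3(threads), args, 0, 0);
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    if (err == cudaSuccess) {
        result.resize(n);
        err = cudaMemcpy(result.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_data);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

	Calling `run_two_phase` on {1, 2, 3, 4} should give {6, 10, 14, 10} on a device that supports cooperative launch. Depending on the toolkit version you may need to build with relocatable device code (`nvcc -rdc=true`) for the grid-wide sync to link.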