| HN Mirror



	by LoganDark 990 days ago
	IIRC, ease of implementation (for the GPU kernels), and cross-compatibility (the same bytecode can be loaded by multiple models of GPU).

2 comments

ealloc 989 days ago

How is CUDA-C that much easier than OpenCL? Having ported back and forth myself, the base C-like languages are virtually identical. Just swap "__syncthreads();" for "barrier(CLK_LOCAL_MEM_FENCE);" and so on. To me the main problem is that Nvidia hobbles OpenCL on their GPUs by not updating their CL compiler to OpenCL 2.0, so some features are missing, such as many of the atomics.
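
To illustrate the kind of substitution described here, a minimal C++ sketch of a token-rewriting pass might look like the following. The function name `cuda_to_opencl_sketch` and the plain string-replacement table are assumptions made for illustration, not anything from the thread. The table covers only a few well-known spellings and ignores things a real port needs, such as `__global` address-space qualifiers on pointer arguments.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Replace every occurrence of `from` in `text` with `to`.
static std::string replace_all(std::string text, const std::string& from,
                               const std::string& to) {
    assert(!from.empty());  // an empty pattern would never advance
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
    return text;
}

// Hypothetical helper: rewrite a few CUDA spellings into their OpenCL C
// counterparts. Deliberately incomplete; it is a sketch, not a real porting tool.
std::string cuda_to_opencl_sketch(std::string src) {
    static const std::vector<std::pair<std::string, std::string>> table = {
        {"__global__", "__kernel"},
        {"__shared__", "__local"},
        {"__syncthreads()", "barrier(CLK_LOCAL_MEM_FENCE)"},
        {"threadIdx.x", "get_local_id(0)"},
        {"blockIdx.x", "get_group_id(0)"},
        {"blockDim.x", "get_local_size(0)"},
    };
    for (const auto& entry : table) {
        src = replace_all(std::move(src), entry.first, entry.second);
    }
    return src;
}
```

For example, passing `"__syncthreads();"` returns `"barrier(CLK_LOCAL_MEM_FENCE);"`.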


LoganDark 988 days ago

Never used it myself, these are just the main reasons I've heard from friends.


jacobgorm 989 days ago

The ease of implementation using CUDA means that your code becomes effed for life, because it is no longer valid C/C++, unless you totally litter it with #ifdefs to special-case for CUDA. In my own proprietary AI inference pipeline I've ended up code-generating to a bunch of different backends (OpenCL, SPIR-V, Metal, CUDA, HLSL, CPU w. OpenMP), giving no special treatment to CUDA, and the resulting code is much cleaner and builds with standard open source toolchains.
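
As a rough illustration of the #ifdef pattern mentioned here, the sketch below keeps one source file that builds both with nvcc and with an ordinary C++ compiler. The names `PORTABLE_FN`, `saxpy_element`, and `saxpy` are hypothetical and chosen only for this example. `__CUDACC__` is the macro nvcc defines when compiling CUDA source, and the non-CUDA branch is a plain serial loop standing in for whatever CPU backend a project would actually use.

```cpp
#include <cassert>

// Under nvcc, mark shared helpers as callable from host and device code;
// under a plain C++ compiler the macro expands to nothing.
#if defined(__CUDACC__)
#define PORTABLE_FN __host__ __device__
#else
#define PORTABLE_FN
#endif

// The per-element math is ordinary C++ and is shared by both builds.
PORTABLE_FN inline float saxpy_element(float a, float x, float y) {
    return a * x + y;
}

#if defined(__CUDACC__)
// CUDA build: one thread per element, launched with the <<<...>>> syntax.
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = saxpy_element(a, x[i], y[i]);
}
#else
// Plain C++ build: a serial loop with the same parameter list.
void saxpy(int n, float a, const float* x, float* y) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) y[i] = saxpy_element(a, x[i], y[i]);
}
#endif
```

Even in this small case the call sites diverge (a kernel launch in one build, a direct call in the other), which is the sort of special-casing the comment complains about.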


LoganDark 989 days ago

> The ease of implementation using CUDA means that your code becomes effed for life

Yes, yes it absolutely does. That is how it establishes market dominance: everyone wants to use CUDA, but almost nobody wants to write their kernel twice.
