| > OpenCL is not a disaster at all. I'll probably need to be more specific. OpenCL 1.0 through 1.2 is fine, but fell hopelessly behind NVidia's CUDA efforts. NVidia CUDA has more features that lead to proven performance enhancements. OpenCL 2.0 was the "counterpunch" to bring OpenCL up to CUDA-level features. However, OpenCL 2.0 is virtually stillborn. Only Intel and AMD platforms support OpenCL2.0. Intel Xeon Phi are relatively niche (and their primary advantage seems to be x86 code compatibility anyway. So I doubt you'd be running OpenCL on them). AMD OpenCL 2.0 support exists, but is rather poor. The OpenCL 2.0 debugger simply is non-functional and you're forced to use lol printfs. That leaves OpenCL 1.2. Its okay, but it is years behind what modern hardware can do. Its atomic + barrier model is strange compared to proper C++11 Atomics, its missing important features like device-side queuing, shared virtual memory, unified address space (no more copy/paste code just to go from "local" to "private" memory), among other very useful features. > Even today OpenCL is a viable solution for GPU OpenCL 1.2 is a viable solution. An old, crusty, and quirky solution, but viable nonetheless. OpenCL 2.0+ is basically dead. And I think only Intel Xeon Phi supports the latest OpenCL 2.2. I bet you there are more Vulkan compute shaders out there than there are OpenCL 2.0. Indeed, there are rumors that the Khronos project is going to be focusing on Vulkan compute shaders in the future. > The "single source" argument is completely overrated. Furthermore, you can have single source in OpenCL putting the code in strings. I like my compile-time errors to be during compile-time. Not during run-time on my client's system. Compiler-bugs in AMD drivers are fixed through device driver updates (!!!) which makes practical deployment of plain-text OpenCL source code far more of a hassle in practice. Consider this horror story: a compiler bug in some AMD Device Driver versions which cause a segfault on some hardware versions. This is not theoretical: https://community.amd.com/thread/160362. In practice, deploying OpenCL 1.2 code requires you to test all of the device drivers your client base is reasonably expected to run. ----- But that's not the only issue. "Single Source" means that you can define a singular structure in a singular .h file and actually have it guaranteed to work between CPU-code and GPU-code. Data-sharing code is grossly simplified and is perfectly matched. The C++ AMP model (which has been adopted into AMD's ROCm platform) is grossly superior. You specify a few templates and bam, your source code automatically turns into CPU code OR GPU-code. Extremely useful when sharing routines between the CPU and GPU (like data-packing or unpacking from the buffers) With that said, AMD clearly cares about OpenCL and the ROCm platform looks like it strongly supports OpenCL through then near term, especially OpenCL 1.2 which seems to have a big codebase. However, if I were to do any project these days, I'd do it in ROCm's HCC / single-source C++ system or CUDA. OpenCL 1.2 is useful for high-compatibility but has major issues as an environment. |