> CUDA Runtime: The runtime library (libcudart) that applications link against.

That library is actually a rather poor idea. If you're writing a CUDA application, I strongly recommend avoiding the "runtime API". It provides partial access to the actual CUDA driver and its API, which is 'simpler' in the sense that you don't explicitly create "contexts", but:

* It hides or limits a lot of the functionality.
* Its actual behavior vis-a-vis contexts is not at all simple and is likely to make your life more difficult down the road.
* It's not some clean interface that's much more convenient to use.

So, either go with the driver, or consider my CUDA API wrappers library [1], which _does_ offer a clean, unified, modern (well, C++11'ish) RAII/CADRe interface. And it covers much more than the runtime API, to boot: JIT compilation of CUDA (nvrtc) and PTX (nvptx_compiler), profiling (nvtx), etc.

> Driver API ... provides direct access to GPU functionality.

Well, I wouldn't go that far, it's not that direct. Let's call it: "Less indirect"...

[1]: https://github.com/eyalroz/cuda-api-wrappers/
Probably the worst part of this: for the most part, in practice, it will work just fine. Until it doesn’t. Then you will have lots of fun debugging subtle bugs in a closed-source black box: bugs that reproduce only against certain driver API header versions, which may not match the version of the actual driver API DSO you’ve dlopened, and that only cause problems when mixed with certain Linux kernel versions.
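For what it's worth, the header-versus-DSO mismatch can at least be detected early. Below is a minimal sketch, assuming Linux and that the driver library is installed under the usual soname `libcuda.so.1`. The helper names `loaded_driver_version` and `driver_matches_headers` are made up for illustration and are not part of any CUDA API; the only driver entry point used is `cuDriverGetVersion`, called through a plain `int (*)(int*)` pointer so the sketch does not need `cuda.h`.

```cpp
#include <dlfcn.h>

// Hypothetical helper, not part of CUDA. Loads the driver library and asks
// it for its version, encoded as 1000 * major + 10 * minor.
// Returns -1 if the library or the symbol is missing, or if the call fails.
int loaded_driver_version(const char* soname = "libcuda.so.1") {
    void* lib = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
    if (!lib) return -1;

    // cuDriverGetVersion is declared as CUresult(int*). CUresult is an enum
    // whose success value is 0, so int stands in for it here (assumption:
    // this avoids including cuda.h in the sketch).
    using get_version_fn = int (*)(int*);
    auto fn = reinterpret_cast<get_version_fn>(dlsym(lib, "cuDriverGetVersion"));

    int version = -1;
    if (!fn || fn(&version) != 0) version = -1;

    // The handle is deliberately left open: the driver library normally
    // stays loaded for the lifetime of the process anyway.
    return version;
}

// Hypothetical helper. A driver that is at least as new as the headers the
// code was compiled against is the combination that is expected to work.
bool driver_matches_headers(int header_version, int driver_version) {
    return driver_version >= 0 && driver_version >= header_version;
}
```

In a real build you would pass `CUDA_VERSION` from `cuda.h` as `header_version` and refuse to continue (or at least log loudly) when `driver_matches_headers` returns false. That does not fix the underlying problem, it only makes the failure show up at startup instead of in the middle of a debugging session.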
(I have the exact opposite opinion; people reach too eagerly for the driver API when they don’t need it. Almost everything that can be done with the driver API can be done with the runtime API. If you absolutely must use the driver API, which I doubt, you should at least resolve the function pointers through cudaGetDriverEntryPointByVersion.)