|
|
|
|
|
by ckitching
701 days ago
|
|
[I work on SCALE] CUDA has a couple of extra problems beyond just any other programming language: - CUDA is more than a language: it's a giant library (for both CPU and GPU) for interacting with the GPU, and for writing the GPU code. This needed reimplementing. At least for the device-side stuff we can implement it in CUDA, so when we add support for other GPU vendors the code can (mostly) just be recompiled and work there :D.
- CUDA (the language) is not actually specified. It is, informally, "whatever nvcc does". This differs significantly from what Clang's CUDA support does (which is ultimately what the HIP compiler is derived from). PTX is indeed vastly annoying. |
|
You might also find raw c++ for device libraries saner to deal with than cuda. In particular you don't need to jury rig the thing to not spuriously embed the GPU code in x64 elf objects and/or pull the binaries apart. Though if you're feeding the same device libraries to nvcc with #ifdef around the divergence your hands are tied.