CUDA is general-purpose compute, but NVIDIA also releases cuDNN, which all the major libraries use because it is fast and good (if a little complex). There are efforts underway to build a comparable library on open source general-compute packages, but none is as mature or effective as cuDNN, so in practice people just pay NVIDIA to use that, which lets NVIDIA invest even more in pulling ahead.
As an aside, I’ve been kinda surprised that this situation has lasted as long as it has, but I am probably biased and think ML acceleration is more important than most large businesses do today.
CUDA is for GPGPU (general-purpose GPU computing), which includes machine learning.
Vulkan is primarily for graphics but does have options for GPGPU too. Unlike OpenGL, however, Vulkan is fairly close to the hardware in terms of abstraction.
Vulkan is not even a suitable API for implementing Khronos's very own SYCL on top of. SYCL requires shared virtual memory capabilities that Vulkan just doesn't have.