Hacker News
by noogle 1175 days ago
How much of a moat is CUDA?

It is indeed years ahead of anything its competitors offer. However, most ML/DS people interact with CUDA through a higher-level framework, and in recent years this community has consolidated around a few frameworks (arguably just one, PyTorch). For some reason AMD has not invested in framework backends, but there is no network effect or vendor lock-in that would hinder a shift from CUDA to ROCm if ROCm were supported equally well.

1 comments

There is also an enormous investment beyond the training side. Once you have your model, you still need to run it. This is where Triton, TensorRT, and handcrafted CUDA kernels used as plugins come in. ROCm has no equivalent for this (MIGraphX is not close).
Models are retrained periodically (every few months, weeks, or even days), and new architectures and implementations appear all the time. When a better algorithm comes along, practitioners adopt the new platform (e.g. Transformers for NLP models), so many systems are already built to plug in new tools. GPUs are very expensive, so there is also a strong incentive to make this small effort.
Yes, but this just makes a frictionless runtime for inference even more important (which is something that does not exist in a comparable form for AMD).