


	by floatngupstream 1189 days ago
	There is an enormous investment beyond the training side. Once you have your model, you still need to run it. This is where Triton, TensorRT, and handcrafted CUDA kernels as plugins come in. There is no equivalent for this on ROCm (MIGraphX does not come close).


noogle 1188 days ago

Models are retrained periodically (every few months, weeks, or even days), and new architectures and implementations appear all the time. If a better algorithm appears, practitioners will adopt a new platform (e.g. Transformers for NLP models), so many systems can already plug in new tools. GPUs are very expensive, so there is also a strong incentive to make this small effort.


floatngupstream 1188 days ago

Yes, but this just makes a frictionless inference runtime even more important, and nothing comparable exists for AMD.
