


	by exged 2100 days ago
	To be fair, it's not uncommon for an ML researcher / engineer to use tens (~$10/hr on cloud, $100k from Nvidia) or even hundreds (~$100/hr on cloud, $1M from Nvidia) of GPUs to speed up their iteration time. If there were a way to spend half as much on hundreds of AMD GPUs instead, that would be a huge win, well worth even months of the researcher's time. The catch is that ML software stacks have had hundreds if not thousands of man-years of effort put into things like cuDNN, CUDA operator implementations, and Nvidia-specific system code (e.g. for distributed training). Many formidable competitors like Google TPU have emerged, but Nvidia is holding onto its leadership position for now because the wide support and polish are just not there for any of the competitors yet.