by elorant
829 days ago
CUDA is a big reason for their moat, and that's not something you can build in a couple of years no matter how much money you throw at it. Without CUDA you have a chip that runs on-premises without anyone having a clue how good it is, which is supposedly what Google does. Your only offering is cloud services. As big as that is, corporations would want to build their own datacenters.
I think nobody had the time to port any of these architectures away from CUDA because:
* the leaders want to maintain their lead and everyone needs to catch up asap so no time to waste,
* progress was _super_ fast so doubly no time to waste,
* there was/is plenty of money that buys some perceived value in maintaining the lead or catching up.
But imo:
1. progress has slowed a bit, so maybe there's time to explore alternatives,
2. Nvidia GPUs are pretty hard to come by, so switching vendors may actually be a competitive advantage (if performance/price pans out and you can actually buy the hardware now as opposed to later).
In terms of ML "compilers"/frameworks, afaik there's:
* Google JAX/TensorFlow XLA/MLIR,
* OpenAI Triton,
* Meta Glow,
* Apple PyTorch+Metal fork.