by kkielhofner
814 days ago
It's been talked to death, but non-CUDA implementations have their challenges regardless of use case. That's what first-mover advantage and more than 15 years of investment by Nvidia in their overall ecosystem will do for you.

But support for production serving of inference workloads outside of CUDA is universally dismal. This is where I spend most of my time, and compared to CUDA anything else is non-existent or a non-starter unless you're all-in on packaged, API-driven Google/Amazon/etc. tooling utilizing their TPUs (or whatever). It's the most significant vendor/cloud lock-in I think I've ever seen.

Efficient, high-scale serving of inference workloads is THE thing you need to do to serve customers and actually have a chance at ever making any money. It's shocking to me that Nvidia/CUDA has a complete stranglehold on this obvious use case.

That's quite literally unacceptable.