by smarterclayton
848 days ago
There is a lot of work going into making the actual infrastructure and lower-level management of lots and lots of GPUs/TPUs open as well. My team focuses on making the infrastructure side at least a bit more approachable on GKE and Kubernetes. See https://github.com/GoogleCloudPlatform/ai-on-gke/tree/main and https://github.com/google/xpk (a bit more focused on HPC, but includes AI) and https://github.com/stas00/ml-engineering (not associated with GKE, but describes training with SLURM). The actual training is still limited to a fairly small pool of very experienced people, but it's getting better. And every day serving models gets that much faster: you can often simply draft off Triton and TensorRT-LLM or vLLM and see significant wins month to month.