We're starting to use k8s as a small team because the simpler offerings with GPUs available don't meet our needs. It's clear they're either built for someone else or are less reliable than an EKS cluster would be.
I'd encourage you to look at the problem space and evaluate if ECS or an external abstraction layer (like Ray) meets your needs.
I've seen both work in completely separate domains (e.g. inference on real time video streams vs. model building) -- but obviously ymmv, tech is a big domain and pretending I understand exactly what you're doing would be silly. Sometimes there is a real answer to the why!
Ah, well today I learned about ECS. I guess we’ll migrate to that once I need to add complexity to our EKS setup.
I’m new to this stuff, so it’s hard to dig through all of the possible solutions.
I looked into Ray a bit, but it seemed a little too complicated vs. just running a CUDA-accelerated Docker container. Most of the streamlined solutions in this space are not made for full-stack web developers deploying a service that happens to need a GPU. They’re for ML devs who are trying to own the production side of their part of the product.