by mnahkies
1279 days ago
It's not quite Lambda, but GKE Autopilot supports GPU workloads, so it could be a relatively easy way to do this. You could have a REST service put incoming requests onto a queue, and a processor deployment pull work off the queue, using GPU resource requests and spot instances. You'd probably also want something to scale the processor deployment's replicas based on queue depth and your budget. I haven't compared the pricing to EKS, so I'm unsure whether it would really be cheaper, but it would avoid having to explicitly manage scaling GPU nodes up and down. https://cloud.google.com/kubernetes-engine/docs/how-to/autop...
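To make the "scale on queue depth and budget" part concrete, here's a rough sketch of the sizing logic only. Everything in it is an assumption for illustration: the function name, the parameters, and the idea that you know roughly how many queued jobs one replica can handle and what one replica costs per hour. Reading the queue depth and applying the replica count to the deployment are left out, since those depend on which queue and tooling you pick.

```python
import math


def desired_replicas(queue_depth, jobs_per_replica, hourly_cost_per_replica,
                     hourly_budget, min_replicas=0):
    """Hypothetical helper: pick a replica count for the processor deployment.

    queue_depth             -- number of requests currently waiting in the queue
    jobs_per_replica        -- backlog one replica is expected to absorb (assumed known)
    hourly_cost_per_replica -- estimated cost of one GPU replica per hour (assumed known)
    hourly_budget           -- the most you are willing to spend per hour
    min_replicas            -- floor to keep warm, if any
    """
    if jobs_per_replica <= 0 or hourly_cost_per_replica <= 0:
        raise ValueError("jobs_per_replica and hourly_cost_per_replica must be positive")

    # Replicas needed to work through the current backlog.
    wanted = math.ceil(queue_depth / jobs_per_replica)

    # Replicas the budget allows; this is treated as a hard cap.
    affordable = math.floor(hourly_budget / hourly_cost_per_replica)

    # Honor the warm floor, but never exceed what the budget allows.
    return min(affordable, max(min_replicas, wanted))


if __name__ == "__main__":
    # 25 queued jobs, 10 per replica, budget allows up to 5 replicas.
    print(desired_replicas(25, 10, 1.0, 5.0))
```

With an empty queue and no floor this scales to zero, which is where most of the savings would come from. The budget wins over the floor here, which is a design choice you may want to flip.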