by jfuawdfaw 460 days ago
> A lot of this seems like the fault of the ALB, is it?

People forget to enable pod readiness gates.

Pod Readiness Gates, unless I'm missing something, only help on startup.

Unless something has changed since I last went digging into this, you will still have the ALB sending traffic to a pod that's in terminating state, unless you do the preStop bits I talked about at the top of the thread.

https://kubernetes-sigs.github.io/aws-load-balancer-controll...

> Pod Readiness Gates, unless I'm missing something, only help on startup.

They also allow a graceful rollout of the workload.

> You will still have the ALB sending traffic to a pod that's in terminating state

The controller watches endpoints and will remove your pod from the target group on pod deletion.

You don't need the preStop scam as long as your workload respects SIGTERM and does lame-duck.
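For readers unfamiliar with lame-duck mode, here is a minimal Python sketch of the idea: on SIGTERM the process starts failing its health check but keeps serving for a short grace period before it stops. The names `LameDuck`, `grace_seconds`, and `install_handler` are made up for illustration, and how the flag is wired into a real health endpoint is left out.

```python
import signal
import threading
import time


class LameDuck:
    """Tracks whether the process should still advertise itself as healthy."""

    def __init__(self, grace_seconds: float) -> None:
        self.grace_seconds = grace_seconds  # hypothetical tuning knob
        self._draining = threading.Event()

    def handle_sigterm(self, signum, frame) -> None:
        # Only flip a flag here; the real work happens outside the handler.
        self._draining.set()

    def health_status(self) -> int:
        # What a health endpoint would return: 200 normally, 503 once draining.
        return 503 if self._draining.is_set() else 200

    def wait_and_drain(self) -> None:
        # Block until SIGTERM arrives, then keep serving for the grace period.
        self._draining.wait()
        time.sleep(self.grace_seconds)


def install_handler(duck: LameDuck) -> None:
    # Must be called from the main thread of the process.
    signal.signal(signal.SIGTERM, duck.handle_sigterm)
```

A server would call `install_handler(duck)` at startup, answer health checks with `duck.health_status()`, and shut down its listener after `duck.wait_and_drain()` returns.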

> You don't need the preStop scam as long as your workload respects SIGTERM and does lame-duck.

Calling it a scam is a bit much.

I think having to put the logic of how the load balancer works into the application is a crossing of concerns. This kind of orchestration does not belong in the app, it belongs in the supporting infrastructure.

The app should not need to know how the load balancer works with regard to scheduling.

The ALB Controller should be doing this. It does not, and so we use preStop until/unless the ALB controller figures it out.
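The exact preStop hook isn't shown in this part of the thread. A common form of the workaround is a plain sleep, so the container keeps serving while the target is deregistered. The sketch below builds such a pod spec fragment as JSON (which kubectl accepts in place of YAML). The container name, image, and the 15 and 30 second values are placeholders, not recommendations.

```python
import json

# Assumed value: how long to keep serving before SIGTERM is delivered.
PRESTOP_SLEEP_SECONDS = 15

container = {
    "name": "app",                    # placeholder name
    "image": "example/app:latest",    # placeholder image
    "lifecycle": {
        # Kubernetes runs preStop to completion before sending SIGTERM.
        "preStop": {"exec": {"command": ["sleep", str(PRESTOP_SLEEP_SECONDS)]}}
    },
}

pod_spec = {
    # The grace period has to cover the preStop sleep plus the app's own drain.
    "terminationGracePeriodSeconds": PRESTOP_SLEEP_SECONDS + 30,
    "containers": [container],
}

print(json.dumps(pod_spec, indent=2))
```

The `sleep` binary has to exist in the container image for an exec hook like this to work.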

Yes, the app needs to listen for SIGTERM and wait until its outstanding requests are completed before exiting - but not more than that.
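As a rough illustration of the "wait for outstanding requests" part, assuming a threaded server: each request handler runs inside the tracker, and the shutdown path waits until the count drops to zero or a timeout expires. `InFlightTracker` and `wait_idle` are hypothetical names, not part of any framework.

```python
import threading


class InFlightTracker:
    """Counts outstanding requests so shutdown can wait for them to finish."""

    def __init__(self) -> None:
        self._count = 0
        self._cond = threading.Condition()

    def __enter__(self) -> "InFlightTracker":
        with self._cond:
            self._count += 1
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        with self._cond:
            self._count -= 1
            if self._count == 0:
                # Wake up anyone blocked in wait_idle().
                self._cond.notify_all()

    def wait_idle(self, timeout: float) -> bool:
        # True if every request finished before the timeout elapsed.
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout=timeout)
```

A handler would wrap its work in `with tracker:`, and the SIGTERM path would stop accepting new connections, call `tracker.wait_idle(timeout)`, and then exit.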

Just curious:

- So if a pod goes into the terminating state

- with gates enabled, the ALB controller should remove it from the targets instantly, because it listens to the k8s API pod change stream?

In my experience there was ALWAYS some delay, even a small one, and in high-frequency systems that caused 500s.

We solved that with an internal API gateway; AWS + iptables + CNI was always causing issues in every setup without it.
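To put a rough number on why even a small delay hurts at high request rates, here is a back-of-the-envelope model. It assumes a constant request rate to the pod and that every request arriving after the pod stops serving, but before the load balancer stops routing to it, fails. The function name and the figures in the comments are made up for illustration, not measurements.

```python
def dropped_requests(rps: float, deregistration_lag_s: float, keep_serving_s: float) -> float:
    """Requests routed to a pod after it stopped serving, under a constant-rate model."""
    # The exposed window is whatever part of the lag the pod does not cover
    # by staying up (for example via a preStop sleep or lame-duck period).
    exposed_window_s = max(0.0, deregistration_lag_s - keep_serving_s)
    return rps * exposed_window_s


# Hypothetical figures: 1000 requests/s to the pod and a 2 s deregistration lag.
print(dropped_requests(1000, 2.0, 0.0))   # pod exits immediately on termination
print(dropped_requests(1000, 2.0, 15.0))  # pod keeps serving for 15 s first
```

In this toy model the first call gives 2000.0 failed requests and the second gives 0.0, which is the whole argument for keeping the pod serving longer than the lag.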