by stackskipton 458 days ago
In fairness to Kubernetes, this is partially due to AWS and how their ALB/NLB interacts with Kubernetes. When Kubernetes starts to replace Pods, the Amazon ALB/NLB Controller starts reacting, but it must make calls to the AWS API and wait for the ALB/NLB to catch up with the changing state of the cluster. Kubernetes is not aware of this and continues on blindly. If the Ingress Controller were more integrated into the cluster, you wouldn't have this problem. We run Ingress-Nginx at work instead of ALB for this reason.

Thus you end up with this entire system of "mark me not ready, wait for the ALB/NLB to realize I'm not ready and stop sending traffic, wait for that to finish, then terminate," and only then does Kubernetes continue with the rollout.

You would have the same problem if you just started up new images in an autoscaling group, then randomly SSHed into the old ones and ran "shutdown -h now". The ALB would be shocked by the sudden departure of VMs, and you would probably get traffic going to the old VMs until health checks caught up.

EDIT: Azure/GCP have the same issue if you use their provided load balancers.

1 comment

Nginx ingress has the same problem; it's just much faster at switching over when a pod is marked as unready because it's continuously watching the endpoints.

Kubernetes is missing a mechanism for load balancing services (like ingress, gateways) to ack pods being marked as not ready before the pod itself is terminated.