by deathanatos 458 days ago
It shouldn't. I've not had the braincells yet to fully internalize the entire article, but it seems like we go wrong about here:

> The AWS Load Balancer keeps sending new requests to the target for several seconds after the application is sent the termination signal!

And then concluded a wait is required…? Yes, traffic might not cease immediately, but you drain the connections to the load balancer, and then exit. A decent HTTP framework should be doing this by default on SIGTERM.

> I yelled into the void about this once and I was told that this was inevitable because it's an eventually consistent distributed system.

Yeah, I wouldn't agree with that either. A terminating pod is inherently "not ready", and that not-ready state should cause the load balancer to remove it from rotation. Similarly, the pod itself can drain its connections to the load balancer. That could take time; there's always going to be some point at which you'd have to give up on a slowloris request.

The fundamental gap, in my opinion, is that k8s has no mechanism (that I am aware of) to notify the load balancing mechanism (whether that's a service, ingress or gateway) that it intends to remove a node, and for the load balancer to confirm that the removal is complete.

This is how all pre-k8s rolling deployment systems I've used have worked.

So instead we move the logic to the application, and put a sleep in the shutdown phase to account for the time it takes for the load balancer to process/acknowledge the shutdown and stop routing new traffic to that node.

K8s made simple things complicated, yet it doesn't have obvious safety (or sanity) mechanisms, making everyday life a PITA. I wonder why it was adopted so quickly despite its flaws, and the only thing that comes to mind is, like Java in the 90s: massive marketing and propaganda that it's "inevitable".

> put a sleep in the shutdown phase to account for the time it takes for the load balancer to process/acknowledge the shutdown and stop routing new traffic to that node.

Again, I don't see why the sleep is required. You're removed from the load balancer when the last connection from the LB closes.

That’s how you’d expect it to work, but that’s not how pod deletion works.

The pod delete event is sent out, and the load balancer and the pod itself both receive and react to it at the same time.

So unless the LB switchover is very quick, or the pod shutdown is slow - you get dropped requests - usually 502s.

Try googling for graceful k8s deploys and every article will say you have to put a preStop sleep in.
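
To put rough numbers on that race, here is a back-of-the-envelope sketch in Python. The function name and the example figures (5 s for the load balancer to stop routing, 1 s for the pod to exit, 10 requests/s at a steady rate) are made up for illustration, not measurements from any real cluster.

```python
def dropped_requests(lb_switchover_s: float, pod_exit_s: float, rps: float) -> int:
    """Estimate requests routed to a pod that has already exited.

    Both timers start when the pod delete event goes out. The load balancer
    keeps routing to the pod for lb_switchover_s seconds, while the pod stops
    serving after pod_exit_s seconds. Assumes a constant request rate.
    """
    gap_s = max(0.0, lb_switchover_s - pod_exit_s)
    return round(gap_s * rps)


# Pod exits after 1 s, LB needs 5 s to catch up: 4 s of traffic hits a dead pod.
print(dropped_requests(lb_switchover_s=5.0, pod_exit_s=1.0, rps=10))   # 40
# Delaying the exit past the switchover (the preStop sleep) closes the gap.
print(dropped_requests(lb_switchover_s=5.0, pod_exit_s=10.0, rps=10))  # 0
```

The point is only that the dropped window is the difference between the two delays, which is why the usual advice is to make the pod the slower of the two.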

Most HTTP frameworks don't do this right. They typically wait until all known in-flight requests complete and then exit. That's usually too fast for a load balancer that's still sending new requests. Instead, you should just wait 30 seconds or so while still accepting new requests and replying "not ready" to load balancer health checks, and then, if you want to wait additional time for long-running requests, you can. You can also send clients "connection: close" to convince them to reopen connections against different backends.
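
A minimal sketch of that sequence in Python, framework-agnostic on purpose. The class name `GracefulShutdown`, its method names, and the 503 status for "not ready" are my own assumptions for illustration; a real server would call these from its health-check handler and request path.

```python
import time


class GracefulShutdown:
    """Tracks the drain phase described above (hypothetical helper).

    After SIGTERM: health checks report not ready, but requests are still
    accepted for grace_seconds so the load balancer has time to stop routing.
    """

    def __init__(self, grace_seconds: float = 30.0, clock=time.monotonic):
        self._grace = grace_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._term_at = None         # time SIGTERM arrived, None while serving

    def on_sigterm(self) -> None:
        # Record only the first signal; repeated signals do not extend the grace period.
        if self._term_at is None:
            self._term_at = self._clock()

    def health_status(self) -> int:
        # Fail health checks immediately so the load balancer pulls us from rotation.
        return 200 if self._term_at is None else 503

    def accepting_requests(self) -> bool:
        # Keep serving stragglers until the grace period has elapsed.
        if self._term_at is None:
            return True
        return self._clock() - self._term_at < self._grace

    def extra_response_headers(self) -> dict:
        # Ask clients to reconnect, ideally to a different backend.
        return {"Connection": "close"} if self._term_at is not None else {}
```

Wiring it up would look something like `signal.signal(signal.SIGTERM, lambda signum, frame: state.on_sigterm())`, with the main loop exiting once `accepting_requests()` turns false and any long-running requests have finished.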

> That's usually too fast for a load balancer that's still sending new requests.

How?

A load balancer can't send a new request on a connection that doesn't exist. (Existing connections being gracefully torn down as requests conclude on them & as the underlying protocol permits.) If it cannot open a connection to the backend (the backend should not allow new connections when the drain starts) then by definition new requests cannot end up at the backend.

The server in HTTP is limited in its ability to initiate connection closures. Remember that when you close a connection in TCP, that sends a FIN packet, and the other end of the connection doesn't know that that's happened yet and might still be sending data packets. In HTTP, the server can request that the client stop using a connection and close it with the "connection: close" header. If the server closes the connection abruptly, there could be requests in flight on the network. With HTTP pipelining, the server may even receive requests on the same connection after sending "connection: close", since they could have been sent by the client before that header was received. With pipelining, the client needs to close the TCP connection to achieve a graceful shutdown.
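
One way a server can avoid the abrupt close is to half-close: send its FIN, then keep reading until the client closes its side. This is a sketch of that idea in Python with plain sockets, not something any particular framework is claimed to do; the function name `lingering_close` and the 2-second timeout are assumptions for illustration.

```python
import socket


def lingering_close(conn: socket.socket, timeout: float = 2.0) -> int:
    """Half-close conn, then discard input until the peer closes.

    Returns the number of bytes that arrived after we stopped writing,
    i.e. data the client sent before it saw our FIN.
    """
    conn.shutdown(socket.SHUT_WR)   # sends FIN; the peer may still be sending
    conn.settimeout(timeout)        # don't wait forever on a silent client
    discarded = 0
    try:
        while True:
            chunk = conn.recv(4096)
            if not chunk:           # empty read: the peer closed its side too
                break
            discarded += len(chunk)
    except socket.timeout:
        pass                        # give up on a client that never closes
    finally:
        conn.close()
    return discarded
```

Anything counted in `discarded` is a request (or part of one) that was already on the wire when the server decided to stop, which is exactly the window described above. The server can't answer it at that point, but at least the client sees an orderly close instead of a reset.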