by lucacasonato
1441 days ago
A bit of a blunt statement on my part. There is monitoring for a multitude of other connection-related issues (e.g. TLS handshake failures, missing SNI). We should have had monitoring for this specific failure, where the load balancer did not have any healthy backends, but as mentioned in the post, the load balancer was programmed in a way that this should never have been able to happen in the first place (the LB should have un-advertised itself if there were no healthy backends). We are capable of learning from past mistakes though, and as such we'll make sure to add more monitoring for these kinds of scenarios so we can be alerted to a root cause earlier. We will do better.