by AtNightWeCode 1438 days ago
"... (a TCP load balancer). It does not record any diagnostics about dropped connections, nor does it have a return channel to return diagnostic information to the user (unlike HTTP loadbalancers, which can return a response header)."

And there is no API monitoring apparently.


A bit of a blunt statement on my part. There is monitoring on a multitude of other connection-related issues (e.g. TLS handshake failures, missing SNI, etc.). We should have had monitoring for this specific failure, where the load balancer did not have any healthy backends, but as mentioned in the post, the load balancer was programmed in a way that this should never have been able to happen in the first place (as the LB should have un-advertised itself if there are no healthy backends).

We are capable of learning from past mistakes though, and as such we'll make sure to add more monitoring for these kinds of scenarios so we can be alerted to a root cause earlier. We will do better.

API monitoring is the practice of making calls to an API to check it: live end-to-end tests. We at least ping every API in every region. It is still hard to pinpoint these issues sometimes.
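
For illustration, a minimal sketch of such a per-region ping in Python, using only the standard library. The names `ProbeResult`, `http_status`, `probe` and `probe_all`, the 5-second timeout, and any endpoint URLs are hypothetical and not taken from anyone's actual setup. It keeps "the server answered with an HTTP status" separate from "the connection failed before any response", since the second case is what a TCP load balancer dropping connections would look like from the outside.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ProbeResult:
    region: str
    ok: bool
    status: Optional[int]   # None when no HTTP response was received
    latency_s: float
    error: Optional[str]    # set only for connection-level failures


def http_status(url: str, timeout: float) -> int:
    """Send a GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # A 4xx/5xx still means something answered; report the code.
        return exc.code


def probe(region: str, url: str,
          fetch: Callable[[str, float], int] = http_status,
          timeout: float = 5.0) -> ProbeResult:
    """Ping one endpoint and record status, latency and any connection error."""
    start = time.monotonic()
    try:
        status = fetch(url, timeout)
    except OSError as exc:
        # URLError, timeouts and connection resets are all OSError subclasses.
        return ProbeResult(region, False, None,
                           time.monotonic() - start, repr(exc))
    return ProbeResult(region, 200 <= status < 300, status,
                       time.monotonic() - start, None)


def probe_all(endpoints: Dict[str, str],
              fetch: Callable[[str, float], int] = http_status,
              timeout: float = 5.0) -> List[ProbeResult]:
    """Probe every region's endpoint once; endpoints maps region -> URL."""
    return [probe(region, url, fetch, timeout)
            for region, url in endpoints.items()]
```

Run on a schedule from several vantage points, a result with `status` of `None` and a populated `error` would point at the connection path rather than the application, though it still would not say which hop dropped the connection.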