Hacker News
by tinco 2395 days ago
Fun fact, on GCP Kubernetes you can have green lightbulbs on every single dashboard, and your entire site can be down anyway.

Our CI/CD was leaking "review" deployments. I forgot about them until one day I upgraded a node and the entire site went down, even though everything was green. It turned out there is some sort of maximum number of nginx entries in the ingress, and we were hitting it. That was some frantic debugging; the solution was just to delete the spurious review deployments.

1 comments

This is true of anything, not just k8s. It is always possible for every system monitor to be green while the app itself is dead or misbehaving for some reason. This is why you need more than system monitors. You need app monitoring as well.
That is called broken monitoring.
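For what it's worth, the "app monitoring" idea above can be sketched as an outside-in probe: request a real page the way a user would and check that the expected content comes back, instead of trusting node or pod status. This is only a minimal sketch using the Python standard library. The function names, the URL and the marker text are hypothetical placeholders, not anything from the setup described in the thread.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=5.0):
    """GET the URL and return (status_code, body_text).

    Returns (None, "") when the request fails before any HTTP response
    arrives (DNS failure, refused connection, timeout).
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 502.
        return exc.code, ""
    except (urllib.error.URLError, OSError):
        return None, ""


def app_is_healthy(url, expected_text, fetcher=fetch):
    """Return True only if the page loads with 200 AND contains expected_text.

    Checking the body matters: a proxy's default or error page can be
    served successfully while the real app is unreachable.
    """
    status, body = fetcher(url)
    return status == 200 and expected_text in body


if __name__ == "__main__":
    # Hypothetical URL and marker text; replace with a page of your own app.
    ok = app_is_healthy("https://example.com/", "Example Domain")
    print("app healthy" if ok else "app DOWN (whatever the dashboards say)")
```

Run something like this from outside the cluster on a schedule and alert when it fails. It would catch the case where every dashboard is green but the ingress is not actually routing traffic.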