by lethain
2534 days ago
(Stripe infra lead here) This was a focus in our after-action review. The nodes responded as healthy to active checks while silently dropping updates on their replication lag. Together, these created the impression of a healthy node. The missing bit was verifying the absence of lag updates (which we have now).
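To make "verifying the absence of lag updates" concrete, here is a minimal sketch of a staleness check. It is an illustration only, not Stripe's actual tooling: the class name `LagStalenessMonitor`, the node names, and both thresholds are hypothetical. The idea is that a node whose lag reports stop arriving is flagged as stale, instead of being judged healthy from its last known value.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class LagStalenessMonitor:
    """Hypothetical monitor: treats missing lag reports as a failure signal."""

    max_lag_seconds: float = 10.0         # assumed threshold for acceptable lag
    max_report_age_seconds: float = 30.0  # assumed threshold for report freshness
    _last_report: Dict[str, Tuple[float, float]] = field(default_factory=dict)

    def record(self, node: str, lag_seconds: float, now: Optional[float] = None) -> None:
        """Store the latest lag report for a node, with its arrival time."""
        ts = time.monotonic() if now is None else now
        self._last_report[node] = (ts, lag_seconds)

    def status(self, node: str, now: Optional[float] = None) -> str:
        """Return 'unknown', 'stale', 'lagging', or 'healthy' for a node."""
        ts = time.monotonic() if now is None else now
        report = self._last_report.get(node)
        if report is None:
            return "unknown"  # never reported: do not assume healthy
        reported_at, lag_seconds = report
        if ts - reported_at > self.max_report_age_seconds:
            return "stale"    # reports stopped arriving: the silent-failure case
        if lag_seconds > self.max_lag_seconds:
            return "lagging"
        return "healthy"
```

The freshness check runs before the lag check on purpose: an old "low lag" value says nothing about the node's current state, so it should not be allowed to mask a node that has gone quiet.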