Hacker News
by jread 302 days ago
For the data and control planes I can determine the issue from the API request/response logs (e.g., network timeout, 5xx, etc.). Network tests are trickier, and we don't have a great way to validate the failure cause for each of those events (i.e., we don't capture a traceroute on failure), other than to evaluate results from multiple endpoint combinations (e.g., AWS us-east-1 to us-west-1 fails while us-east-2 to us-west-1 succeeds).
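
To make the endpoint-combination idea concrete, here is a rough sketch of how that comparison could be done. It is only an illustration under assumptions, not the actual tooling: the function name `suspect_endpoints`, the `(source, destination, succeeded)` tuple shape, and the "highest failure rate" heuristic are all hypothetical.

```python
from collections import defaultdict


def suspect_endpoints(results):
    """Return the endpoint(s) with the highest failure rate across tests.

    `results` is an iterable of (source, destination, succeeded) tuples.
    Hypothetical helper: a simple heuristic, not a root-cause analysis.
    """
    # endpoint -> [failed tests it took part in, total tests it took part in]
    stats = defaultdict(lambda: [0, 0])
    for source, destination, succeeded in results:
        for endpoint in (source, destination):
            stats[endpoint][1] += 1
            if not succeeded:
                stats[endpoint][0] += 1

    rates = {ep: failed / total for ep, (failed, total) in stats.items()}
    worst = max(rates.values(), default=0.0)
    if worst == 0.0:
        return []  # no failures observed, nothing to suspect
    # Several endpoints can tie; return all of them in a stable order.
    return sorted(ep for ep, rate in rates.items() if rate == worst)


results = [
    ("us-east-1", "us-west-1", False),  # this pair failed
    ("us-east-2", "us-west-1", True),   # same destination, succeeded
]
print(suspect_endpoints(results))
```

With the two results from the example, us-west-1 also took part in a successful test, so the heuristic points at us-east-1. It cannot tell an endpoint problem apart from a problem on the path between two regions, which is where a traceroute on failure would help.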