by benjbrooks 769 days ago
Update:

Thanks for your recommendations, everyone! We decided to measure successful hits on the endpoint associated with our docs (inside our monitoring service). That's the default health check associated with our load balancer, so it gets hit periodically (built-in cron job). We just added a SigNoz alert that triggers if the sum of those successful calls over the past X seconds falls below a threshold.
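
For anyone curious what that rule boils down to, here is a rough sketch of the "sum of successful calls over the past X seconds falls below a threshold" logic in plain Python. It is not SigNoz configuration or the actual alert definition; the class name `SuccessWindow` and its parameters are made up for illustration.

```python
from collections import deque


class SuccessWindow:
    """Counts successful health-check hits inside a sliding time window.

    Hypothetical helper for illustration; not part of any monitoring product.
    """

    def __init__(self, window_seconds, min_successes):
        self.window_seconds = window_seconds  # the "X seconds" in the alert
        self.min_successes = min_successes    # the alert threshold
        self._hits = deque()                  # timestamps of successful hits

    def record_success(self, timestamp):
        """Record one successful health-check hit."""
        self._hits.append(timestamp)

    def successes_in_window(self, now):
        """Return the number of successes in the last window_seconds."""
        cutoff = now - self.window_seconds
        # Drop hits that have aged out of the window.
        while self._hits and self._hits[0] <= cutoff:
            self._hits.popleft()
        return len(self._hits)

    def should_alert(self, now):
        """True when too few successful hits were seen in the window."""
        return self.successes_in_window(now) < self.min_successes
```

Note that the alert also fires when no hits arrive at all, which is the point: silence from the health check counts as a failure.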

1 comment

> That's the default health check associated with our load balancer, so it gets hit periodically (built-in cron job).

Be aware that you’re still leaving quite a bit of surface area unmonitored: anything between your clients and the load balancer could break without firing an alert. That includes DNS configuration, firewall rules, SSL certificates, network outages, etc.

If you’re really trying to assert that some HTTP resource is publicly reachable, it’s still a good idea to have some external testing service periodically hitting your endpoint, running some assertions on the response, and alerting you if it’s failing for more than X minutes. (We do this; see my other reply.)
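
If you'd rather roll a quick version yourself, an external probe could look roughly like the sketch below. It assumes you run it from a machine outside your own network; `fetch`, `response_ok`, and `OutageTracker` are hypothetical names for illustration, not the API of any particular testing service.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=10.0):
    """Return (status, body); status 0 means the request never completed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return exc.code, ""
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, TLS error, timeout, and so on.
        return 0, ""


def response_ok(status, body, expected_text):
    """Assertions on the response: a 200 status and an expected marker string."""
    return status == 200 and expected_text in body


class OutageTracker:
    """Fires only once the check has been failing longer than the limit."""

    def __init__(self, max_failing_seconds):
        self.max_failing_seconds = max_failing_seconds
        self._failing_since = None  # time of the first failure in the current streak

    def observe(self, ok, now):
        """Record one probe result; return True if an alert should fire."""
        if ok:
            self._failing_since = None
            return False
        if self._failing_since is None:
            self._failing_since = now
        return now - self._failing_since > self.max_failing_seconds
```

A scheduler (cron, for example) would call `fetch` on your public URL every minute or so, pass the result through `response_ok`, and feed that into `observe` along with the current time. Waiting for a sustained failure before alerting keeps a single dropped request from paging anyone.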

Hope that helps!