by jongleberry 2798 days ago
At Dollar Shave Club, we use CircleCI 2's Scheduled Workflows to run "monitors" against our production services every minute. These are idempotent, analytics-disabled API & Browser (via Puppeteer) tests that we also run as CI tests on every commit.

We send all monitor metrics to DataDog. When a monitor fails, the appropriate teams will get a Slack notification with the full stack trace. A DataDog monitor will also be triggered, alerting the appropriate teams.

For browser monitors, we upload screenshots and Puppeteer tracing files to S3, then share links within each Slack hook. This allows people to figure out what's going on just by clicking links in Slack.
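A failure notification like this could be assembled roughly as follows. This is only a sketch, not our actual code (the real monitors are Puppeteer-based): the function name `build_slack_payload`, the monitor name, and the S3 URLs are hypothetical, and it assumes a Slack incoming webhook that accepts a JSON body with a `text` field and `<url|label>` links.

```python
import json


def build_slack_payload(monitor_name, stack_trace, artifact_urls):
    """Build a Slack incoming-webhook payload for one failed monitor.

    artifact_urls maps a label (e.g. "screenshot") to an already-uploaded
    S3 URL. All names here are hypothetical.
    """
    fence = "`" * 3  # Slack renders triple-backtick text as a code block
    text = f"Monitor failed: {monitor_name}\n{fence}{stack_trace}{fence}"
    # Slack's link syntax is <url|label>, so each artifact becomes clickable.
    links = "\n".join(f"<{url}|{label}>" for label, url in artifact_urls.items())
    if links:
        text += "\n" + links
    return {"text": text}


# Example with made-up values; sending the payload is left out on purpose.
payload = build_slack_payload(
    "checkout-page",
    "Error: navigation timeout",
    {
        "screenshot": "https://example-bucket.s3.amazonaws.com/checkout.png",
        "trace": "https://example-bucket.s3.amazonaws.com/checkout-trace.json",
    },
)
print(json.dumps(payload, indent=2))
```

Keeping the payload builder separate from the HTTP call makes it easy to test without hitting Slack.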

We were planning to improve this setup in the future, but it's good enough for us right now. For example, CircleCI degrades frequently, so we sometimes get spotty coverage. We basically spend < $200/month with CircleCI to monitor about 300 APIs/pages every minute.
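For a rough sense of scale, here is the back-of-the-envelope arithmetic behind those numbers. It assumes a 30-day month and that every monitor really does run every minute, which overstates the run count given the spotty coverage. Since the spend is under $200, the per-run figure is an upper bound under those assumptions.

```python
MONITORS = 300               # "about 300 APIs/pages"
MINUTES_PER_MONTH = 60 * 24 * 30  # assumes a 30-day month
MAX_MONTHLY_COST_USD = 200   # upper bound; actual spend is below this

runs_per_month = MONITORS * MINUTES_PER_MONTH
max_cost_per_run = MAX_MONTHLY_COST_USD / runs_per_month

print(runs_per_month)             # 12960000
print(f"{max_cost_per_run:.7f}")  # 0.0000154
```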

You can read more here:

- https://engineering.dollarshaveclub.com/monitor-all-the-thin...

- https://circleci.com/blog/how-dollar-shave-club-3x-d-velocit...

- https://github.com/dollarshaveclub/monitor