by yibg 2527 days ago
Outliers are where the interesting stuff happens, and outliers happen to individual instances. Aggregates are useful but can be very misleading. You can have a 99th percentile latency in the milliseconds with ~1% of requests timing out.
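
To make the percentile point concrete, here is a rough sketch with made-up numbers: 10,000 requests where 0.9% hit an assumed 30 s client timeout and the rest finish in 5 ms. The `percentile` helper, the timeout value, and the traffic mix are all illustrative assumptions, not measurements from a real system.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(values)
    # Integer ceiling of pct/100 * n, avoiding float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]

TIMEOUT_MS = 30_000  # assumed client timeout, purely illustrative

# Hypothetical traffic: 9,910 fast requests and 90 that time out (0.9%).
latencies_ms = [5] * 9_910 + [TIMEOUT_MS] * 90

p99 = percentile(latencies_ms, 99)
timeouts = sum(1 for v in latencies_ms if v >= TIMEOUT_MS)

print(f"p99 = {p99} ms, timeouts = {timeouts} of {len(latencies_ms)}")
```

The p99 stays at 5 ms because the failing requests all sit above the 99th percentile, so a dashboard that only shows that one number would look perfectly healthy.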

I wouldn’t alert on a single machine having CPU issues, but I’m definitely interested in a small collection of individual machines all having CPU issues at the same time.
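
One way to express that kind of rule, as a minimal sketch: ignore a single hot machine, but fire once several are hot in the same sampling window. The host names, the 90% threshold, and the minimum of 3 hosts are hypothetical values picked for illustration.

```python
def hot_machines(cpu_by_host, threshold=90.0):
    """Return the hosts whose CPU usage is at or above the threshold."""
    return sorted(h for h, cpu in cpu_by_host.items() if cpu >= threshold)

def should_alert(cpu_by_host, threshold=90.0, min_hosts=3):
    """Alert only when at least min_hosts are hot in the same window."""
    return len(hot_machines(cpu_by_host, threshold)) >= min_hosts

# Hypothetical CPU percentages sampled in one window.
one_hot = {"web-1": 97.0, "web-2": 35.0, "web-3": 41.0, "web-4": 38.0}
several_hot = {"web-1": 97.0, "web-2": 95.5, "web-3": 41.0, "web-4": 92.0}

print(should_alert(one_hot))      # a single noisy machine: no alert
print(should_alert(several_hot))  # three machines at once: alert
```

The rule still looks at individual machines rather than a fleet-wide average, which could stay low even while a handful of hosts are pegged.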