Hacker News
by StillBored 4053 days ago
The article, while interesting, doesn't even attempt to identify _WHY_ he is experiencing latency spikes.

It seems to me that the first step is to instrument the heck out of your application. That way you know _WHY_ it sometimes takes longer to respond than at other times. I understand this is harder in a GC'ed/JIT'ed language, but that's not really an excuse.

Good instrumentation is helpful for far more than tracking and isolating problems in one's software stack. The project I work on has pretty extensive request-level instrumentation, which tracks each request at multiple points as it is processed. More than once, the first sign that a piece of hardware was failing has been disk or network I/O starting to show unusual latency patterns. Often software performance regressions show up as highly variable database request latencies, etc.
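
To make "tracks each request at multiple points" concrete, here is a rough Python sketch of the idea. It is not the code from the project I mentioned. The class name `RequestTrace`, its methods, and the stage names are all made up for illustration. It just timestamps a request at named checkpoints and reports how long each segment took, so a slow response can be pinned on a specific stage.

```python
import time


class RequestTrace:
    """Records named checkpoints for one request (hypothetical helper)."""

    def __init__(self, request_id, clock=time.perf_counter):
        self.request_id = request_id
        self._clock = clock  # injectable so the timing logic can be tested
        self._marks = [("start", clock())]

    def mark(self, stage):
        """Record that the request has just finished `stage`."""
        self._marks.append((stage, self._clock()))

    def stage_durations(self):
        """Return {stage: seconds elapsed since the previous checkpoint}."""
        durations = {}
        for (_, prev_t), (stage, t) in zip(self._marks, self._marks[1:]):
            # Sum repeats so a stage marked twice is not silently overwritten.
            durations[stage] = durations.get(stage, 0.0) + (t - prev_t)
        return durations

    def slowest_stage(self):
        """Return (stage, seconds) for the longest segment, or None."""
        durations = self.stage_durations()
        if not durations:
            return None
        return max(durations.items(), key=lambda kv: kv[1])
```

In a request handler you would call something like `trace.mark("parse")`, `trace.mark("db_query")`, `trace.mark("serialize")` as each step completes, then log `trace.stage_durations()` along with the request id. Aggregating those per-stage numbers over time is what lets an odd disk or database latency pattern stand out from the total response time.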