by evan_miller 4280 days ago
Hi, post author here. The slight difference in I/O (specifically: writes) was the trigger. I talked a little more about that here: http://corner.squareup.com/2014/09/logging-can-be-tricky.htm...

And here: https://news.ycombinator.com/item?id=8359556

Even on the problematic host we only saw this latency issue in the 99th percentile. That is: even on the problem host 99 out of 100 queries were served as expected and only 1 out of 100 saw this additional latency.


Well, yeah, I noticed you guys' response to one of the comments on the blog post indicated that the problem machine had a different workload (additional tasks or something). That caused the additional writes, which then caused the latency for the main app on the box.

I think your point still stands about logging, being cautious about blocking I/O calls, etc. But it seems the bigger point is about how your overall system is architected: which processes run where, dedicating like nodes to their tasks vs. the potential quality/consistency issues that arise from having some pull double-duty, etc.

Those seemed to be the source of the real issue here.

Sort of. The catch is that even a very small write, say just a few megabytes, can drastically change the cost of an fsync(). On my test AWS VM, writing just 4 megabytes one time is enough to trigger the problem. Even on an otherwise fully isolated system a few megs may be written from time to time, for example by a management agent like chef or puppet, or by an application deploy copying out new binaries.

For example, here I reproduce the problem on a completely isolated machine: https://news.ycombinator.com/item?id=8359556
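
If you want to poke at this yourself, here is a rough Python sketch of the idea. It is not the script from the linked comment. It times an fsync() on a small log append, then repeats the measurement right after an unsynced bulk write to a different file. The file names `app.log` and `bulk.bin`, the function names, and the 4 MB default are all illustrative assumptions. Whether the second number comes out larger depends on the filesystem, mount options, and the storage underneath.

```python
import os
import tempfile
import time


def timed_fsync(fd, payload=b"log line\n"):
    """Append a small payload to fd and return how long fsync() took, in seconds."""
    os.write(fd, payload)
    start = time.perf_counter()
    os.fsync(fd)
    return time.perf_counter() - start


def compare_fsync_cost(directory, bulk_bytes=4 * 1024 * 1024):
    """Time a small fsync on a quiet file, then again after an unsynced bulk write."""
    log_path = os.path.join(directory, "app.log")    # hypothetical log file
    bulk_path = os.path.join(directory, "bulk.bin")  # hypothetical unrelated file
    log_fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        timed_fsync(log_fd)  # warm-up so file creation is not part of the baseline
        baseline = timed_fsync(log_fd)
        # Hand some data to the OS through a different file, without syncing it.
        with open(bulk_path, "wb") as bulk:
            bulk.write(b"\0" * bulk_bytes)
            bulk.flush()  # pushes Python's buffer to the OS, does not fsync
            after_bulk = timed_fsync(log_fd)
    finally:
        os.close(log_fd)
    return baseline, after_bulk


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        quiet, busy = compare_fsync_cost(workdir)
        print(f"fsync on quiet system:  {quiet * 1000:.2f} ms")
        print(f"fsync after bulk write: {busy * 1000:.2f} ms")
```

A single run is noisy, so repeat it a few times and look at the spread rather than trusting one pair of numbers.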

IMO the real issue is that a competent logging framework doesn't block app code to sync the log to disk. The buffer should be swapped out under lock, and then synced in a separate thread. Yuck.
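
To make the "swap under lock, sync in a separate thread" idea concrete, here is a minimal Python sketch. It is not taken from any particular logging framework. The class name `AsyncLog`, its methods, and the `flush_interval` parameter are hypothetical. Callers only append to an in-memory list while holding the lock, and a background thread swaps that list out and does the write() and fsync() without holding the lock.

```python
import os
import tempfile
import threading


class AsyncLog:
    """Double-buffered log: callers append in memory, a worker thread writes and fsyncs."""

    def __init__(self, path, flush_interval=0.05):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self._lock = threading.Lock()
        self._buffer = []
        self._wake = threading.Event()
        self._stopping = False
        self._interval = flush_interval
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def log(self, line):
        # Only a list append happens under the lock; no disk I/O on this path.
        with self._lock:
            self._buffer.append(line.encode() + b"\n")
        self._wake.set()

    def _swap(self):
        # Swap the full buffer for an empty one while holding the lock.
        with self._lock:
            pending, self._buffer = self._buffer, []
        return pending

    def _flush(self):
        pending = self._swap()
        if pending:
            # The slow part runs outside the lock, so log() callers are not blocked.
            os.write(self._fd, b"".join(pending))
            os.fsync(self._fd)

    def _run(self):
        while not self._stopping:
            self._wake.wait(self._interval)
            self._wake.clear()
            self._flush()

    def close(self):
        self._stopping = True
        self._wake.set()
        self._thread.join()
        self._flush()  # drain anything logged after the last background flush
        os.close(self._fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        log = AsyncLog(os.path.join(workdir, "app.log"))
        log.log("request served")
        log.close()
```

Anything still sitting in the in-memory buffer is lost if the process dies before the worker thread gets to it.
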
The downside is of course that if you crash hard, the most valuable log entries are the ones least likely to be on-disk afterwards.
Which is why logging to disk on the server is BAD. Have your log framework write to stdout, and have upstart/systemd/whatever handle writing to a remote syslog server or whatever your fancy is.
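
For the application side of that, a minimal sketch using Python's standard `logging` module might look like this. The logger name `app`, the helper name `make_stdout_logger`, and the format string are illustrative assumptions. Forwarding stdout to syslog is left to the service manager and is not shown.

```python
import logging
import sys


def make_stdout_logger(name="app"):
    """Return a logger that writes to stdout only, with no file handler."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    logger.propagate = False  # keep records from also reaching root logger handlers
    return logger


if __name__ == "__main__":
    make_stdout_logger().info("service started")
```
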
Good points. I got something out of it on both fronts.