Hacker News new | ask | show | jobs
by evan_miller 4285 days ago
Sort of. The catch is that even a very small write, say just a few megabytes, can drastically change the cost of an fsync(). On my test aws VM even writing just 4 megabytes one time is enough to trigger the problem. Even on an otherwise fully isolated system a few megs may be written from time to time, for example by a management agent like chef or puppet. Or by an application deploy copying out new binaries.

For example, here I reproduce the problem on a completely isolated machine: https://news.ycombinator.com/item?id=8359556

2 comments

IMO the real issue is that a competent logging framework doesn't block app code to sync the log to disk. The buffer should be swapped out under lock, and then synced in a separate thread. Yuck.
The downside is of course that if you crash hard, the most valuable log entries are the ones least likely to be on-disk afterwards.
Which is why logging to disk on the server is BAD, have your log framework write to stdout and have upstart/systemd/whatever handle writing to a remote syslog server or whatever your fancy is.
Good points. I got something out of it on both fronts.