by bbuchalter 2963 days ago
Could you expand on what you mean by "weirdness" on VM disk I/O in the context of database storage?

The storage shows anomalously high variance in latency and throughput, with some patterns you don't see on non-virtualized storage, plus a modest degradation in average performance. That much is expected, but it makes it difficult to schedule I/O efficiently. It is more noticeable if you are doing direct I/O, because having a VM intercept your storage access defeats the purpose of direct I/O.
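
For anyone who wants to check this on their own setup, here is a minimal sketch of how the variance could be quantified. It is not the tooling I used; `time_calls` and `summarize` are hypothetical helper names, and the choice of statistics (mean, population standard deviation, p99, max) is just an assumption about what is useful to compare.

```python
import statistics
import time


def time_calls(op, count=256):
    """Call op(i) for i in range(count); return per-call latencies in seconds."""
    latencies = []
    for i in range(count):
        start = time.perf_counter()
        op(i)
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Summarize latency samples; the spread matters more than the mean here."""
    ordered = sorted(latencies)
    # Nearest-rank style p99, clamped to the last sample for small inputs.
    p99 = ordered[min(len(ordered) - 1, int(0.99 * len(ordered)))]
    return {
        "mean": statistics.fmean(ordered),
        "stdev": statistics.pstdev(ordered),
        "p99": p99,
        "max": ordered[-1],
    }
```

You would pass `time_calls` a function that issues one read or write per call, run it against the virtualized and the non-virtualized storage, and compare the `stdev` and `p99` fields rather than only `mean`.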

What was surprising is that the direct I/O behavior appears to depend on whether you access the storage through a file system. My database kernel is block device agnostic, using files and raw devices interchangeably via direct I/O. Against expectations, when we accessed the same virtualized storage as raw block devices, it behaved like bare metal, even though we were running the exact same operations over the same direct I/O interface in a VM. Basically, the only difference was the file descriptor type.
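
To make "same operations, different file descriptor type" concrete, here is a rough Linux-only sketch, not my actual kernel code. `BLOCK`, `target_kind`, and `direct_read` are hypothetical names, and the 4096-byte alignment is an assumption; the real requirement depends on the device and file system.

```python
import mmap
import os
import stat

BLOCK = 4096  # assumed alignment; the real requirement depends on the device


def target_kind(fd):
    """Report whether fd refers to a raw block device or a regular file."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"


def direct_read(path, offset=0, length=BLOCK):
    """Read with O_DIRECT (Linux only); returns (kind, bytes read)."""
    if offset % BLOCK or length % BLOCK:
        raise ValueError("offset and length must be multiples of BLOCK")
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        # An anonymous mmap is page-aligned, which O_DIRECT generally needs.
        buf = mmap.mmap(-1, length)
        try:
            n = os.preadv(fd, [buf], offset)
            return target_kind(fd), bytes(buf[:n])
        finally:
            buf.close()
    finally:
        os.close(fd)
```

The same `direct_read` call works on a path like a data file or a device node; the only thing that changes is what `target_kind` reports. Some file systems reject `O_DIRECT` at open time, and reading a device node usually needs elevated privileges.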

I'm guessing that file systems are virtualization-aware to some extent and access through them is actively managed, while raw device accesses are VM-oblivious and simply passed through by the storage virtualization layer.