by pgaddict
35 days ago
Interesting paper. I've only started reading and digesting it, but:
- I'm not sure how to interpret Figure 1. It says "Flash writes (KB) per page", but it doesn't really say which page sizes were used. AFAIK MySQL has 16K by default, PostgreSQL has 8K, and LeanStore has 4K, which makes the numbers a bit hard to compare.
- Likewise, I'm a bit unsure about the doublewrite buffering in Postgres, described as "indirect". Postgres doesn't really do doublewrite (we really should, I think); we write pages to WAL and then to data files. I assume that's what is meant by "indirect" in the paper. But this very much depends on the checkpoint frequency and write pattern, as the FPI (full-page image) is written only for the first change to a page after a checkpoint. I wonder if the results in the paper consider this. Maybe the workload is such that it always hits each page just once between checkpoints (i.e. a worst case). Also, the WAL part is nicely sequential, which should play nicely with SSDs.
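To make the FPI point concrete, here is a toy model of bytes written per logical update within one checkpoint interval. It is only a sketch, not the paper's methodology. It assumes each dirty page costs one full-page image in WAL on its first change, one small WAL record per later change, and one data-file write at checkpoint. The function name, the 100-byte record size, and the omission of WAL headers, compression, and hint bits are all assumptions made for illustration.

```python
PAGE_SIZE = 8 * 1024  # PostgreSQL default block size in bytes


def bytes_written_per_update(updates_per_page, record_size=100, page_size=PAGE_SIZE):
    """Toy estimate of bytes written per logical update in one checkpoint interval.

    Assumes (hypothetically) that each dirty page incurs:
      - one full-page image in WAL on its first change,
      - one small WAL record for each later change,
      - one write to the data files at checkpoint.
    """
    if updates_per_page < 1:
        raise ValueError("updates_per_page must be >= 1")
    wal_bytes = page_size + (updates_per_page - 1) * record_size
    data_file_bytes = page_size
    return (wal_bytes + data_file_bytes) / updates_per_page


# Compare the worst case (one hit per page) with pages that are hit repeatedly.
for n in (1, 10, 100):
    print(n, round(bytes_written_per_update(n)))
```

With one update per page between checkpoints, this model writes two full pages per update. The per-update cost drops quickly as the same page is hit more often, which is why the checkpoint frequency and the write pattern matter so much here.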
The caption of Figure 1 lists the page size used by each system (i.e., the default configuration).
We use different page sizes across systems and, as you said, that makes a direct apples-to-apples comparison difficult. This is actually intentional, because it also exposes the B-tree index-level write amplification effects. In that sense, Figure 1 suggests that larger page sizes are not necessarily good for write amplification.
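As a rough illustration of the page-size effect (a back-of-the-envelope sketch, not a measurement from the paper): if a whole page is rewritten to persist a small change, the bytes flushed per byte changed scale with the page size. The page sizes below are the defaults quoted in the comment above, and the 128-byte change size is a hypothetical value chosen for illustration.

```python
# Default page sizes as quoted in the parent comment ("AFAIK"), in bytes.
PAGE_SIZES = {"MySQL": 16 * 1024, "PostgreSQL": 8 * 1024, "LeanStore": 4 * 1024}

CHANGED_BYTES = 128  # hypothetical size of one modified tuple


def page_write_amplification(page_size, changed_bytes):
    """Bytes flushed per byte logically changed when the whole page is rewritten once."""
    return page_size / changed_bytes


for name, size in PAGE_SIZES.items():
    print(f"{name}: {page_write_amplification(size, CHANGED_BYTES):.0f}x")
```

This ignores everything else that differs between the systems (logging, doublewrite, buffer management). It only shows why a larger page can mean more bytes written for the same logical change.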
And yes, you are correct regarding Postgres. Instead of having a separate doublewrite buffer file, Postgres relies on WAL full-page writes, which indirectly trigger additional checkpoint writes, so the effect is not entirely straightforward to quantify. To explain that, we discuss how we measured DB WAF for Postgres in Section 10.7 (“How can we calculate DB WAF on other DBMSs?”) of the appendix version: https://arxiv.org/pdf/2603.09927
Regarding the WAL part, yes, the WAL itself is nicely sequential and should generally behave well on SSDs. But once it gets mixed with small random writes that are eventually persisted to flash, it will unfortunately still likely suffer from SSD WAF.