| > As such I'm occasionally bemused at the sort-of monoculture here around Postgres, where if Postgres doesn't have it, it may as well not exist. FWIW, I, as a medium-to-long-term PG developer, am also regularly ... bemused by that attitude. We do some stuff well, but we also do a lot of shit not so well, and PG is succeeding despite that, not because of it. > Relatedly, the other interesting thing is the chatter about fsync. I know on Windows that's not the mechanism that's used, and out of curiosity I looked deeper into what MS-SQL does on Linux, and indeed they were able to get significant improvement by leveraging similar mechanisms to ensure the data is hardened to disk without a separate flush (see https://news.ycombinator.com/item?id=43443703). They contributed to kernel 4.18 to make it happen. Case in point about "we also do a lot of shit not so well": you can actually get WAL writes that use FUA (Force Unit Access) out of Postgres, but it's advisable only under somewhat limited circumstances. Most filesystems are only going to issue FUA writes with O_DIRECT. The problem is that for streaming replication PG currently reads the WAL back from the filesystem. So from a performance POV it's not great to use FUA writes: the data bypassed the page cache on the way out, so reading it back triggers read IO. And some filesystems have, uhm, somewhat odd behaviour if you mix buffered and unbuffered IO. Another fun angle around this is that some SSDs have *absurdly* slow FUA writes. Many Samsung SSDs, in particular, have FUA write performance 2-3x slower than their already bad whole-cache-flush performance, and it's not just client and prosumer drives, it's some of the more lightweight enterprise-y drives too. Edit: fights with HN formatting. |
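
To make the "O_DIRECT plus synchronous write" idea concrete, here is a minimal Python sketch of the kind of write that *can* end up as a FUA write on Linux: the file is opened with `O_DSYNC` (each `write()` returns only once the data is durable) and, where the filesystem allows it, `O_DIRECT` (bypass the page cache). This is not how PostgreSQL implements it; on the Postgres side the closest knob is `wal_sync_method = open_datasync`. The names `durable_write` and `BLOCK` are made up for illustration, the 4096-byte alignment is an assumption, and nothing here can confirm that the kernel actually issued FUA rather than a write followed by a cache flush, since that depends on the filesystem and the device.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; real code should query the device's logical block size


def _write_block(path: str, flags: int, buf: mmap.mmap) -> None:
    """Open path with the given flags, write the whole buffer, close."""
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, buf)
    finally:
        os.close(fd)


def durable_write(path: str, payload: bytes) -> str:
    """Write payload, zero-padded to one aligned block, with synchronous semantics.

    Returns "direct+dsync" if O_DIRECT was accepted, otherwise "dsync".
    """
    if len(payload) > BLOCK:
        raise ValueError("this sketch handles a single block only")

    # An anonymous mmap is page-aligned, which satisfies O_DIRECT's buffer alignment rules.
    buf = mmap.mmap(-1, BLOCK)
    buf.write(payload)  # remaining bytes stay zero

    base = os.O_WRONLY | os.O_CREAT | os.O_DSYNC
    direct = getattr(os, "O_DIRECT", 0)  # not available on every platform
    try:
        if direct:
            try:
                _write_block(path, base | direct, buf)
                return "direct+dsync"
            except OSError:
                pass  # filesystem rejected O_DIRECT (typically EINVAL); fall back
        _write_block(path, base, buf)
        return "dsync"
    finally:
        buf.close()
```

Reading the same file back afterwards through a normal buffered `open()` is exactly the buffered/unbuffered mix the comment warns about: the block was never in the page cache, so the read has to go to the device.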