Hacker News
by gruez 1830 days ago
The kernel can't optimize that because sqlite is specifically requesting it to force a write.

Yes, but you can configure the kernel to ignore that, and by default it does.

For example, way back in the day, to get more life out of my laptop during college, I configured the kernel to only write to disk once an hour or when the buffer filled up. That effectively meant I was only writing to disk once per hour when I shut down to change classes.
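
The comment doesn't say which settings were used. One plausible reading, and it is only an assumption, is the Linux `vm.dirty_*` writeback tunables exposed under `/proc/sys/vm`. A minimal sketch that reads the current values and converts them to seconds (the helper names `centisecs_to_seconds` and `read_writeback_settings` are made up for illustration):

```python
from pathlib import Path

# Assumption: the Linux writeback tunables below are the kind of setting the
# comment refers to. Their values are expressed in hundredths of a second.
# They control background writeback of dirty pages; they are not an fsync switch.
VM_DIR = Path("/proc/sys/vm")
TUNABLES = ("dirty_writeback_centisecs", "dirty_expire_centisecs")


def centisecs_to_seconds(value: int) -> float:
    """Convert a centisecond tunable value to seconds."""
    return value / 100.0


def read_writeback_settings(vm_dir: Path = VM_DIR) -> dict:
    """Return each tunable in seconds, or None if it cannot be read."""
    settings = {}
    for name in TUNABLES:
        try:
            raw = (vm_dir / name).read_text().strip()
            settings[name] = centisecs_to_seconds(int(raw))
        except (OSError, ValueError):
            # Not Linux, file missing, or unexpected contents.
            settings[name] = None
    return settings


if __name__ == "__main__":
    for name, seconds in read_writeback_settings().items():
        print(name, seconds)
```

Under this reading, a one-hour interval would correspond to a value of 360000 centiseconds.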

The modern linux kernel doesn't actually write to disk when fsync is called. It buffers the writes in a cache. Also, the SSD itself has a cache.

There are lots of abstractions between SQLite and the disk.

>The modern linux kernel doesn't actually write to disk when fsync is called

Source for this? This seems to be contradicted by the man page for fsync:

https://man7.org/linux/man-pages/man2/fdatasync.2.html

       fsync() transfers ("flushes") all modified in-core data of (i.e.,
       modified buffer cache pages for) the file referred to by the file
       descriptor fd to the disk device (or other permanent storage
       device) so that all changed information can be retrieved even if
       the system crashes or is rebooted.  This includes writing through
       or flushing a disk cache if present.  The call blocks until the
       device reports that the transfer has completed.
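
A minimal sketch of the pattern the man page describes, assuming Python on a POSIX system. `os.fsync` is the standard-library wrapper around `fsync()`; the helper name `durable_write` and the file name are hypothetical:

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> int:
    """Write data to path, then block in fsync() until the flush is reported done."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(data):
            written += os.write(fd, data[written:])
        # At this point the data may only be in the page cache.
        # fsync asks the kernel to push it to the storage device.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.db")
        print(durable_write(target, b"committed row\n"))
```

Without the `os.fsync` call, `os.write` can return as soon as the data is in the kernel's cache, which is the distinction the thread is arguing about.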

> I configured the kernel to only write to disk once an hour or when the buffer filled up. That effectively meant I was only writing to disk once per hour when I shut down to change classes.

Sounds great until you get a kernel panic or random shutdown, in which case you potentially get file corruption and/or data loss.

> The modern linux kernel doesn't actually write to disk when fsync is called. It buffers the writes in a cache.

Do you have a reference for this? That would break every ACID database that I'm aware of, including sqlite and postgresql. There has been a lot of work in the last few years to fix data durability issues with fsync (e.g. https://lwn.net/Articles/752063/), so I would be very surprised to hear that fsync is now a no-op.

> you can configure the kernel to ignore that, and by default it does.

> The modern linux kernel doesn't actually write to disk when fsync is called.

This is false.

Almost all open source databases' durability guarantees are based upon fsync (including SQLite, Postgres, MySQL, and so on). fsync results in the corresponding flush commands being issued to the underlying storage. You can configure Linux to ignore fsync, but this is not the default on any Linux distribution I'm aware of. It would not make any sense.

> The modern linux kernel doesn't actually write to disk when fsync is called. It buffers the writes in a cache.

That's not true. You can tell in many ways, but one of the easiest is that fsync is quite slow and noisy (on hard drives).
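
A rough way to check the "slow" part yourself, sketched in Python under the assumption of a POSIX system. The helper name `time_writes` and the default sizes are arbitrary choices for illustration, and the numbers will vary widely by device and filesystem:

```python
import os
import tempfile
import time


def time_writes(path: str, count: int = 50, payload: bytes = b"x" * 4096,
                sync: bool = False) -> float:
    """Return the seconds taken to write `count` payloads, optionally fsyncing each one."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, payload)
            if sync:
                # Block until the kernel reports the data flushed to the device.
                os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bench.bin")
        buffered = time_writes(path, sync=False)
        synced = time_writes(path, sync=True)
        print(f"buffered: {buffered:.4f}s  fsync per write: {synced:.4f}s")
```

If fsync were a no-op, the two timings would be expected to come out about the same.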

I would be a bit disappointed if the kernel implementation for HDD and SSD were exactly the same.

For a SATA SSD, I would be surprised if it were different.