Hacker News
by ifdefdebug 1830 days ago
For instance, when reading this, SQLite immediately came to mind, and how much a loop of 10,000 inserts without BEGIN/COMMIT or some preparatory pragmas would wreck an SSD... (it forces a full sync between every two inserts)
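
A minimal sketch of the pattern described above, using Python's standard sqlite3 module. The function names, table name, and row counts are made up for illustration, and `PRAGMA synchronous = FULL` is set explicitly so the sketch does not depend on build defaults. In autocommit mode each INSERT is its own transaction and should trigger its own sync; wrapped in one BEGIN/COMMIT, the whole loop should sync only at commit. Actual timings depend heavily on the filesystem and the drive.

```python
import os
import sqlite3
import tempfile
import time


def timed_inserts(path, n, batch):
    """Insert n rows; if batch is True, wrap them all in one transaction."""
    # isolation_level=None puts the sqlite3 module in autocommit mode, so
    # every INSERT outside an explicit BEGIN is committed on its own.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("PRAGMA synchronous = FULL")
    conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
    start = time.perf_counter()
    if batch:
        conn.execute("BEGIN")
    for i in range(n):
        conn.execute("INSERT INTO t (x) VALUES (?)", (i,))
    if batch:
        conn.execute("COMMIT")
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return elapsed, count


def compare(n=200):
    """Run the same n inserts one-commit-per-row and as a single transaction."""
    with tempfile.TemporaryDirectory() as d:
        slow, c1 = timed_inserts(os.path.join(d, "a.db"), n, batch=False)
        fast, c2 = timed_inserts(os.path.join(d, "b.db"), n, batch=True)
    return slow, fast, c1, c2


if __name__ == "__main__":
    slow, fast, _, _ = compare()
    print(f"one commit per insert: {slow:.3f}s, single transaction: {fast:.3f}s")
```

On a temp directory backed by RAM (tmpfs) the two numbers may be close, since there is no physical device to wait on.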

Not really though, because your kernel would most likely abstract that away and bunch up the writes.
The kernel can't optimize that because sqlite is specifically requesting it to force a write.
Yes but you can configure the kernel to ignore that, and by default it does.

For example, way back in the day, to get more life out of my laptop during college, I configured the kernel to only write to disk once an hour or when the buffer filled up. That effectively meant I was only writing to disk once per hour when I shut down to change classes.

The modern linux kernel doesn't actually write to disk when fsync is called. It buffers the writes in a cache. Also, the SSD itself has a cache.

There are lots of abstractions between SQLite and the disk.
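
For reference, a small sketch that reads the Linux writeback settings a "write once an hour" setup like the one above would most plausibly have touched. Which knobs were actually changed is an assumption, since the comment does not name them. As far as I know, these settings govern background writeback of dirty pages, not what happens on an explicit fsync call.

```python
from pathlib import Path

# Assumed knobs: the comment above does not say which settings were changed.
WRITEBACK_KNOBS = (
    "dirty_writeback_centisecs",  # how often the flusher threads wake up
    "dirty_expire_centisecs",     # age at which dirty pages become eligible for writeback
    "dirty_background_ratio",     # % of memory dirty before background writeback starts
    "dirty_ratio",                # % of memory dirty before writers must write out data themselves
)


def read_writeback_settings(base="/proc/sys/vm"):
    """Return the current value of each knob, or None where it cannot be read."""
    settings = {}
    for name in WRITEBACK_KNOBS:
        try:
            settings[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # Not Linux, no /proc, or an unexpected file format.
            settings[name] = None
    return settings


if __name__ == "__main__":
    for knob, value in read_writeback_settings().items():
        print(f"{knob} = {value}")
```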

>The modern linux kernel doesn't actually write to disk when fsync is called

Source for this? This seems to be contradicted by the man page for fsync

https://man7.org/linux/man-pages/man2/fdatasync.2.html

       fsync() transfers ("flushes") all modified in-core data of (i.e.,
       modified buffer cache pages for) the file referred to by the file
       descriptor fd to the disk device (or other permanent storage
       device) so that all changed information can be retrieved even if
       the system crashes or is rebooted.  This includes writing through
       or flushing a disk cache if present.  The call blocks until the
       device reports that the transfer has completed.
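
To make the quoted behaviour concrete, here is a minimal sketch of a write that relies on it, using Python's `os.fsync` wrapper around the system call. The helper name `durable_write` is hypothetical. It only syncs the file itself; making a newly created file's directory entry durable would need a further fsync on the directory, which is left out here.

```python
import os


def durable_write(path, data):
    """Write bytes to path and block until the kernel reports the flush is done."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # Python's userspace buffer -> kernel page cache
        os.fsync(f.fileno())  # page cache -> storage device; blocks until reported complete
    return os.path.getsize(path)
```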

>I configured the kernel to only write to disk once an hour or when the buffer filled up. That effectively meant I was only writing to disk once per hour when I shut down to change classes.

Sounds great until you get a kernel panic or random shutdown, in which case you potentially get file corruption and/or data loss.

> The modern linux kernel doesn't actually write to disk when fsync is called. It buffers the writes in a cache.

Do you have a reference for this? That would break every ACID database that I'm aware of, including sqlite and postgresql. There has been a lot of work in the last few years to fix data durability issues with fsync (e.g. https://lwn.net/Articles/752063/), so I would be very surprised to hear that fsync is now a no-op.

> you can configure the kernel to ignore that, and by default it does.

> The modern linux kernel doesn't actually write to disk when fsync is called.

This is false.

Almost all open source databases' durability guarantees are based on fsync (including SQLite, Postgres, MySQL, and so on). fsync results in the corresponding flush commands being sent to the underlying storage. You can configure Linux to ignore fsync, but this is not the default on any Linux distribution I'm aware of. It would not make any sense.

> The modern linux kernel doesn't actually write to disk when fsync is called. It buffers the writes in a cache.

That's not true. You can tell in many ways, but one of the easiest is that fsync is quite slow and noisy (on hard drives).
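
One rough way to check the "slow" part yourself is to time the same small writes with and without an fsync after each one. This is a sketch under assumptions: the function name, payload size, and iteration count are arbitrary, and the gap you see depends on the filesystem and drive (on tmpfs there may be almost none).

```python
import os
import tempfile
import time


def time_appends(path, n, sync, payload=b"x" * 512):
    """Write payload n times; if sync is True, fsync after every write."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n):
            f.write(payload)
            f.flush()  # hand the data to the kernel either way
            if sync:
                os.fsync(f.fileno())  # additionally wait for the device
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        buffered = time_appends(os.path.join(d, "buffered.bin"), 200, sync=False)
        synced = time_appends(os.path.join(d, "synced.bin"), 200, sync=True)
    print(f"buffered: {buffered:.4f}s, fsync after each write: {synced:.4f}s")
```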

I would be a bit disappointed if the kernel implementation for HDD and SSD is exactly the same.
For a SATA SSD, I would be surprised if it was different.
Fortunately, most people aren't running OLTP workloads on client SSDs. That's mostly done on enterprise SSDs, which have much higher endurance. That said, even on client SSDs you can probably get away with running such workloads as long as you're not doing them 24/7.
More important than the higher rated endurance (and perhaps contributing a bit to that rating) is the fact that the typical enterprise SSD has power loss protection capacitors for its RAM, so it can cache and combine writes in RAM safely.