Hacker News new | ask | show | jobs
by johncolanduoni 1761 days ago
Unless you need to do something crazy, like run two nodes of your application. But only Google needs to do that, right?
1 comments

You don't even need to reach two nodes before SQLite becomes grossly inadequate. Even on a single node: SQLite's paradigm of global locks leads to poor performance when multiple threads write to the same table.

You could be a single-node 4-core $5/month VPS instance and run into this issue. SQLite requires "exclusive" access to a table to handle writes (meaning when writing, no other thread can be reading the table). Especially if your transactions start to become complex.

In contrast, MySQL and PostgreSQL allow for simultaneous reads while writes are occurring.

I believe that SQLite in Write-Ahead Log (WAL) mode does not have this issue.
The question is if simultaneous operations really speed up your application.

It is not as if a 4-core machine can do 4 times the DB work if you only allow it.

Memory access, disk access .. they all have their specific behaviour when you try to do things simultaneously. In the worst case, things will just get serialized on a lower level, even if multiple CPU cores send and/or request data simultaneously.

We don’t have to talk in abstract about this, you can benchmark these things or just look at the many ones run by others. For example, some benchmarks around some Postgres scaling improvements: https://wiki.postgresql.org/images/e/e8/FOSDEM2012-Multi-CPU....

If you don’t need anything Postgres offers, by all means stick with sqlite. But your pessimism about being stuck with operations “serialized on a lower level” is not empirically justified for most of the databases that support multi-threading.

> It is not as if a 4-core machine can do 4 times the DB work if you only allow it.

If Thread#1 goes "lock. write. unlock", then that means Thread#2, #3, and #4 all have to wait until Thread#1 is done with the write.

Even if Thread#2/#3/#4 have the data in their CPU-cache, they have to wait for Thread#1 to be complete.

--------

If Thread#1 is writing to a __hard drive__, that means Thread#2, #3, and #4 are waiting on a hard drive read, when they could have instead been reading from L3 cache.

SQLite's model scales extremely poorly in practice. You run into all sorts of problems like this.

That is why I mentioned the "worst case".

You are describing the best case. Where everything is in a cache close enough to the CPU that it is not impacted by the other CPUs data access.

I'm talking about relatively simple cases like spinning up a phpbb3 (web forum) instance. The minute you have multiple users writing comments at the same time from different apache/php instances is the minute your SQLite database starts to fall over.

Every write (aka: every comment post) is an exclusive write on SQLite. A full lock that prevents other processes from reading. (Ex: User#1 writes a comment, but User#2 hits refresh. These two items will be sequential if you use SQLite... when normally they'd be concurrent in most other databases)

------

SQLite is an exceptionally good database for many purposes. But it has its weaknesses. There's a reason why MySQL and PostgreSQL exist after all.

Check SQLite WAL mode: single write, multiple simultaneous reads. https://sqlite.org/wal.html