Hacker News new | ask | show | jobs
by marktangotango 2494 days ago
When you have a write-heavy workload with multiple servers that need to write concurrently to a shared database (backend to a website), you would probably want to choose something that has a client-server model instead like PostgreSQL

It's easy to get really stellar concurrent performance out of SQLite using a many reader, single writer model (ie many threads, single process). In testing we did it easily surpassed Postgres.

3 comments

I had great success using SQLite as the backing store for fast sharing and reindexing of a domain specific search engine component my company licensed. Before we developed our own algorithms and moved to Elasticsearch I was playing with augmenting the licensed component with the full text search capabilities of SQLite. The whole thing ended up being shuffled around via Gluster so I was able to offload the responsibility for sharing shards to it.

My experience pretty much matched what you describe and it was such a great opportunity to really lean on a fabulous piece of software. Given this was many years ago, before ES was stable and SOLR wasn't working well for us, I think it was the right choice. These days, ES is good enough that if I had to do it all over I'd go straight to it. None the less, I don't think SQLite should be ignored as an option when you have high levels of control over data access patterns.

I really like this idea.

Does anyone use this seriously in production on a typical web service? I wonder about how eg backups and stuff like that work out in real scenarios.

That's really interesting.

I've always been a big fan of SQLite and this is the one challenge I've always faced.

Can you give some more insights as to how you achieved that?

For reads just create a new connection for every request (be sure to set connection properties for wal etc). Then create a global (or something equivalent to global, like a singleton) connection to serve as the writer and put a mutex around it when doing writes. Easy, scales like a mofo.
What does "mofo" mean here. Can you give us a rough estimate on the transaction rate you achieved with this setup? My own experience and all independent benchmarks I can find seem to indicate a limit of 100-1000TPS on reasonable hardware.

Note that you can "batch" up many inserts into a transaction, which gives you a high "nominal" insert rate but still only ~100 actual transactions or so per second. To see why this is not the most useful number when comparing to a database like Postgres, consider that the limiting factor in a SQLite/Posgres design are cache flushes which outweight the costs of actually writing the data, so the number of rows per batch is mostly arbitrary; using this metric you can always claim a huge insert performance by choosing a suitable N. Also, if you do the batching, you of course loose some of SQLite's consistency/durability guarantees for your writes, which is probably fine if you didn't need them in the first place, but begets the question if an embedded ACID database is the best tool for the job at hand.

Test it yourself, you can try all the different combinations of journalling mode and synchonous modes. I'm not here to proscribe anything to anyone. Every use case is different.
> What does "mofo" mean here

Motherfucker

I think they meant what makes it a positive `mofo`, if you will.
Ah yes of course. Not deleting my comment because I think my misunderstanding is funny.