Hacker News new | ask | show | jobs
by dest 2492 days ago
Interesting pick from one of the links in the article:

"SQLite has fantastic write performance as well. By default SQLite uses database-level locking (minimal concurrency), and there is an “out of the box” option to enable WAL mode to get fantastic read concurrency — as shown by this test. But lesser known is that there is a branch of SQLite that has page locking, which enables for fantastic concurrent write performance."

https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-q...

6 comments

Author of SuperSQLite here. SQLite is under-utilized in my opinion! Because of the name, people seem to have the misconception that it can't be used for production workloads. It's true that it generally works well for embedded use cases, but it can be used for so much more and has an extremely simple codebase without a lot of bloat.
Some guy on IRC told me they made a world scale DNS server work because of sqlite.
Would you recommand it on the server side as well?

For example to store user credentials.

You could, plenty of sites do! It really depends on your write workload, but I would say it would scale pretty well for most small to medium size SaaS companies unless your product is Google scale.
When you have a write-heavy workload with multiple servers that need to write concurrently to a shared database (backend to a website), you would probably want to choose something that has a client-server model instead like PostgreSQL

It's easy to get really stellar concurrent performance out of SQLite using a many reader, single writer model (ie many threads, single process). In testing we did it easily surpassed Postgres.

I had great success using SQLite as the backing store for fast sharing and reindexing of a domain specific search engine component my company licensed. Before we developed our own algorithms and moved to Elasticsearch I was playing with augmenting the licensed component with the full text search capabilities of SQLite. The whole thing ended up being shuffled around via Gluster so I was able to offload the responsibility for sharing shards to it.

My experience pretty much matched what you describe and it was such a great opportunity to really lean on a fabulous piece of software. Given this was many years ago, before ES was stable and SOLR wasn't working well for us, I think it was the right choice. These days, ES is good enough that if I had to do it all over I'd go straight to it. None the less, I don't think SQLite should be ignored as an option when you have high levels of control over data access patterns.

I really like this idea.

Does anyone use this seriously in production on a typical web service? I wonder about how eg backups and stuff like that work out in real scenarios.

That's really interesting.

I've always been a big fan of SQLite and this is the one challenge I've always faced.

Can you give some more insights as to how you achieved that?

For reads just create a new connection for every request (be sure to set connection properties for wal etc). Then create a global (or something equivalent to global, like a singleton) connection to serve as the writer and put a mutex around it when doing writes. Easy, scales like a mofo.
What does "mofo" mean here. Can you give us a rough estimate on the transaction rate you achieved with this setup? My own experience and all independent benchmarks I can find seem to indicate a limit of 100-1000TPS on reasonable hardware.

Note that you can "batch" up many inserts into a transaction, which gives you a high "nominal" insert rate but still only ~100 actual transactions or so per second. To see why this is not the most useful number when comparing to a database like Postgres, consider that the limiting factor in a SQLite/Posgres design are cache flushes which outweight the costs of actually writing the data, so the number of rows per batch is mostly arbitrary; using this metric you can always claim a huge insert performance by choosing a suitable N. Also, if you do the batching, you of course loose some of SQLite's consistency/durability guarantees for your writes, which is probably fine if you didn't need them in the first place, but begets the question if an embedded ACID database is the best tool for the job at hand.

Test it yourself, you can try all the different combinations of journalling mode and synchonous modes. I'm not here to proscribe anything to anyone. Every use case is different.
> What does "mofo" mean here

Motherfucker

I think they meant what makes it a positive `mofo`, if you will.
In case anyone is curious "BEGIN CONCURRENT" is what is being discussed here (page level locks). There is also a WAL2 mode which is basically the WAL mode operating in a A/B hot swap mode - to facilitate checkpointing without holding up writes.

SQLite has progressed a lot in the last few years. It is no longer advisable to mock it over SQLITE_BUSY et al; you will come unstuck on the Internet very quickly ;-)

I knew Oracle offered something like that (implemented by grafting the SQLite frontend/VM on top of Berkeley DB). But it'd be really cool to see page-level locking appear in standard SQLite version.
Dumb questions: if BedRockDB has all of these huge benefits over using stock-SQLite, why hasn't SQLite merged in all of the changes into SQLite? Why does BedRockDB have to exist as a separate fork?
I think it is because the changes are, currently, considered esoteric and slightly experimental in the sense they want to reserve the right to make breaking changes to them. SQLite has a relatively slow cadence when it comes to things like this. Other reasons may be that they're still working on the tests to validate and support those features - as well as documentation. I am hopeful BEGIN CONCURRENT and WAL2 will make it into the amalgamation at some future date.
> SQLite has a relatively slow cadence

They intend to support it till 2050 - https://sqlite.org/lts.html - which includes keeping file format and API backwards compatibility. That accounts for the more careful decision making since things added now have to be supported for another 30 years.

uh.... concurrent writes? southparkejaculation.gif

Unfortunately, I use SQLite through Python so I'm stuck with the system version :-(