Hacker News new | ask | show | jobs
by niftich 3647 days ago
I was unfamiliar with this project and assumed it was a hosted service at first. Not so, this is a local application, so an embedded database makes sense.

It took until the very last paragraph for the blog post to make that point.

4 comments

FWIW, I've run SQLite in a few production sites (low 6 figure pageviews per month) and it has worked fantastically. If you understand and work with the limitations, it really is amazing how much you can get out of it.

I'm actually surprised WordPress hasn't ever moved over to it for ease of installation/deployment - WordPress and PHP seem more likely to trip over in most deployments I've seen before SQLite would.

> I'm actually surprised WordPress hasn't ever moved over to it for ease of installation/deployment

For the same reasons Wordpress hasn't moved to Postgres or doesn't include feature X,Y or Z. Wordpress has to remain relatively stable. A change in the database means breaking Wordpress ecosystem, as many plugins add tables to the DB. There is very little abstraction when it comes to Wordpress API and its interaction with the DB. Wordpress doesn't ship with an ORM.

What are the limitations? I love SQLite, and have vaguely heard that it has issues with concurrency, but what, exactly? I use it on production for www.tithess.gr and it seems to be working beautifully, no concurrency problems whatsoever there.

What problems should I be expecting in a multi-access scenario? I've never had that question answered adequately.

Only one process can have a SQLite database file open for writes at a time. Multiple processes/threads can read, however.

That's about it. I'm hesitant to describe it as an "issue" with concurrency because that has connotations that it's a problem with SQLite that the authors should address. Rather, it's part of the simplicity that is a key design feature of SQLite.

As SQLite's own documentation[0] says, it doesn't compete with client/server databases like Oracle/Postgres/MySQL - it competes with fopen() (edit: not a system call, see below), hand-maintained serialization/pickling formats, etc.

0: https://www.sqlite.org/whentouse.html

> it competes with the fopen() system call

Nitpick: fopen() is not a syscall, it's part of stdio which is in libc (in user mode). The nearest POSIX syscall equivalent is open(2), but stdio does things like buffering and formatting in user mode on top. It's also more portable to non-Unix because stdio is in the C language spec.

I had heard this phrase before as simply "it competes with fopen()".

Andrew has already spoken to the single writer matter. Another issue is it's not very easy to scale horizontally at all. You could shard, but you're not going to be doing replication very easily (although a project does exist to provide replicated SQLite - https://github.com/rqlite/rqlite - but then you might as well finally use Postgres or whatever).

My attitude to this is if my project can be working well and turning a profit within SQLite's limitations, we can worry about migrating to a different stack later on. The same reason I use Ruby for almost everything initially too.

Consider multi-user wordpress site for example. You can have one SQLite DB per user - there is simply no information to share between users. If so you can split all your users between multiple backend machines and do per user request routing - like one machine serving users from A to F. SQLlite has cheap DB loading sequence so you can open/close it for each page request. That would be perfect horizontal scaling as all your users will not fight for single DB access.
well, you kinda just move the problem to another layer. You'll face same horizontal scaling issues once you need to scale on each user.
Or just remove one [DB] layer completely. Such databases can be stored directly on machines executing http requests. And if you need to rebuild DB structure you don't need to stop the world - do them one by one.

Of course, all this depends on amount of data you need to store for each user and informational structure of the app.

You cannot have multiple writers to a SQLite database. Only a single file descriptor may be open with 'write' access. As long as you can get the performance you need out of a single writer, then you're good!
Clarification: You can have multiple connections (aka file descriptors) open on the same database file for writing at the same time. But only one can be actively writing at a time. SQLite uses file locking to serialize writes. Multiple writers can exist concurrently, they merely have to take turns writing.
WAL allows multiple readers and writers:

https://www.sqlite.org/isolation.html

Yes, but in some cases it could be faster to work in sole- writer mode. It is kind of cooperative multitasking demonstrating by Node.js.
I thought sqlite didn't allow any readers while a write was in progress, but apparently that is no longer the case https://www.sqlite.org/wal.html
It only supports a subset of ALTER TABLE and the work around for the limitations is very annoying: https://www.sqlite.org/lang_altertable.html
I've run into this issue before with Django's automatically generated migrations breaking in SQLite. As a workaround, I rm my local test db or make the change myself.
The worst thing that I experienced was actually the lack of basic support for routine schema alterations that you find in other DMBS's. I don't have time to recall exactly but relatively mundane things like altering the names or data types of columns required basically dropping and rebuilding the columns / tables from scratch. It really made the ongoing maintenance quite painful compared to other DBMS's.
Your entire database is in one file, so you can only write from one process at a time. You can read from multiple processes at a time though.
From other limitations there can be the aspect of security. If you have a more complex application / set of applications you may want to use different database users for different purposes. Maybe some audit / append-only tables as well.

With sqlite whoever controls the app can do whatever they want with the database file.

Two questions:

Wouldn't a "full" RDBMS like Mysql/Postgres offer a ton of benefits over SQLite (like the features to handle edge cases as they arise) to the point where, even if SQLite would work, SQLite still wouldn't be the ideal choice?

Does wordpress completely abstract the database or do third-party plugins use their own interfaces to the database, in which case a migration to SQLite would break a lot of them?

Wordpress does not abstract the database, so migration to SQLite would break tons of plugins.

Wordpress is also a PHP application, and it's common to have multiple PHP processes running in parallel, which makes it much harder to use SQLite.

Wordpress does attempt some abstraction of the database via its various wrapper classes/functions like wpdb, WP_Query, etc. It's definitely not ORM-level abstraction by any stretch of the imagination, though.
Yeah I've written many small websites which, such as a dynamic DNS service[1], which use SQLite as their sole storage.

Providing there aren't too many updates at once, such that locking becomes a problem, it works really really well.

[1] - http://dhcp.io/

Thanks; I've added a phrase to the intro section to mention that beets is a desktop app.
Agreed. Not that the 3 items are not relevant, but the fact that it's a local application sounds to me like the most decisive argument in favor of an embedded DB.
Presumably the target audience is people asking why his application does not use MySQL, and not people who have never heard of it at all.