Hacker News new | ask | show | jobs
by firebacon 2497 days ago
About the well-tested bit: Per it's own documentation, SQLite has a massive test suite. [*Not all of] the test suite is actually open source though, so the overlap between commenters selling you on how well tested SQLite is and those that have seen how the sausage is made is probably [close to] zero.

However, pointing this or any of the other practical shortcomings of SQLite out on hacker news is blasphemy and will invariably get you downvoted into oblivion by people who (apparently) never ran into them.

3 comments

Part of the SQLite test cases are open source (the logic tests), but indeed unfortunately the biggest part are not. You can purchase a license for these tests though, so saying nobody has seen them is likely incorrect.

Either way, the fact that it is DO-178B certified for use on aircraft should tell you something about the reliability of it. It is certainly more reliable than most other software, OSS or otherwise.

I am curious what practical problems you encountered with SQLite. Could you perhaps expand on that?

What are some other shortcomings of SQLite? I'm genuinely curious even though the below probably sounds like some text from the ~Rust~ Sqlite evangelism strike force.

With the default settings which I semi-affectionately refer to as paranoid mode, an untuned database can start to have worse performance after getting 500,000+ records going. Then things like indexes, RediSQL, and WAL mode start being more necessary rather than just best-practices.

But if you set your pragmas correctly and so on, SQLite scales up just fine. I haven't done a large scale Sqlite base simply due to caution around needing to use a 'real' database in production like SQL server or Postgres, Maria etc. Sqlite is excellent for ephemeral databases to be created, seeded with test data, run tests against, and deleted in repeatable automated testing.

Based on the link to the Expensify article, it sounds like Sqlite can scale up even better than Sql server under some circumstances.

But I have barely tried using it in production because of that aforementioned caution. What are some pitfalls to watch out for?

the elephant in the room with SQLite is that they refuse to support most forms of ALTER TABLE, that is, to be able to change the structure of a schema.

Their rationalization for this is that the way SQLite stores data, this is more efficiently performed by simply creating a new table and copying all the rows from the old table into the new one, and that one would want to batch all the table changes together rather than emitting individual ALTER statemnts.

from the POV of people who make tools like my own Alembic Migrations, this is an annoyingly insufficient answer because the logistics of recreating table schemas and copying data over is much more complicated than just emitting an ALTER directive. I'd like if SQLite had at least the ability to be extended to support a third party "ALTER" plugin that would under the hood run the intricacies of copying the tables around, rather than pushing this out to the tool creators. It doesn't really matter in most cases, for the use cases used by SQLite, that ALTER would be inefficient.

instead, my users bugged me for years to solve this problem and I have to maintain this thing https://alembic.sqlalchemy.org/en/latest/batch.html which I mostly hate completely.

SQLite's typing model is also very idiosyncratic and is based on a naming convention approach, which I think most users of SQLite don't understand very well, because it works in a completely strange way based on looking for substrings inside of the completely arbitrary names you can assign to types. I can create a column with the datatype ELEPHINT and that is a legitimate datatype which will store integers. there's also the "INTEGER PRIMARY KEY " / "INTEGER PRIMARY KEY AUTOINCREMENT" silliness but that's a relatively mild poor API compared to things MySQL does all over the place, I suppose.

It's hard to point those out without coming across as a hater, and because they might seem so obvious. Trying...:

With the way transactions and durability work in SQLite, a normal setup will not do more than on the order of ~100 transactions per second without adding additional layers of complexity. Which is completely fine for it's intended usecase of being an embedded, lightweight database and of course is not due to some bad engineering decisions, but due to inherent tradeoffs in how transactions are handled.

So basically it's all fine until you try to use SQLite for something for which it is not a good fit, like a large volume of inserts. Other solutions exist that are conceptually better equipped for this usecase out of the box (Postgres/MySQL/Elastic/etc). Not a shortcoming in SQLite per-se, but rather a practical shortcoming that comes up when you try to use it for the wrong job.

Case in point is TFA which takes the theoretical SQLite database size (140TB) limit and runs with claiming that is the useful limit on how much data you can store in SQLite. LOL! That value is when SQLite page IDs start overflowing and not an estimate on how much data can be usefully handled in a single SQLite database.

SQLite is, by the nature of what it is, not a good fit for big data volumes. Try loading 100TB of data into SQLite and try to run even a single query. Even assuming that you have a hyper-fast SSD, a single query will take days to complete! But it still get's mentioned as a potential solution for that problem every so often. In this post even...

So you could consider those practical shortcomings or ill-advised usage. It depends on the definition/perspective I guess. At any rate, SQLite, even if generally being excellent, is not the panacea as which it sometimes gets sold here on HN. No database is, there are just too many tradeoffs involved.

Why are portions of the test suite closed source?
They charge money for it...

Note that I'm not saying all of the test suite should be open source. It's clearly a valid/cool business model and SQLite is, for a lot of use cases, an excellent piece of software that I have often used myself. The criticism was honestly more directed towards the cherry-picking of facts that sometimes happens in discussions on hacker news, especially those related to SQLite.