Hacker News new | ask | show | jobs
by supriyo-biswas 635 days ago
> I’m really interested in this model where each application server has a copy of a SQLite database file which is then replaced on a scheduled basis.

BTW, this is also the model used by all CDNs, where the global configuration file containing the certificates, HTTP routing rules etc. for all customers will be updated into into a single-file b-tree structure*, and that "bundle" is distributed among all edge locations frequently.

* I'm yet to see someone use sqlite for this purpose, it's usually DBM style databases like LMDB or Kyoto Cabinet.

5 comments

> Kyoto Cabinet

Now, that's a name I've not heard in a long time.

Are people still using Kyoto Cabinet in new projects? Are people still using DBM-style storage generally? I thought that whole branch of the evolutionary tree had sort of died out.

> Are people still using Kyoto Cabinet in new projects?

Cloudflare used to use Kyoto Cabinet[1] and moved to LMDB[1] in 2020; other implementations that I'm familiar with (but don't have a link to share) also use LMDB.

> Are people still using DBM-style storage generally?

It's fairly common in these scenarios, as well as the underlying key-value store for popular software like Consul[3].

[1] https://blog.cloudflare.com/kyoto-tycoon-secure-replication/

[2] https://blog.cloudflare.com/introducing-quicksilver-configur...

[3] https://github.com/hashicorp/consul/issues/8442

Cloudflare moved from lmdb to RocksDB for production.

https://blog.cloudflare.com/moving-quicksilver-into-producti...

Yeah that plus zeromq. Very curious.. I always thought T/K cabinet and similar would become more popular with the adaption of ORMs
I’ve worked on a project a long time ago where we did this with BerkeleyDB files. BDB was used to store configuration data that was frequently looked up.

Periodically we would run a full sync to replace the database. Between the periodic full syncs, we had a background process keep changes applied on a rolling basis.

All-in-all, it worked pretty well at the time! The full database file sync ensured a bad database was timeboxed and we got a bootstrapping mechanism for free.

The query engine in SQLite can be weak. In particular where JOINs across multiple columns are concerned. You really do need to be aware of this when designing indexes into SQLite files.

In any case, SQLite would serve this solution, but almost certainly with a measurable level of inefficiency built in.

This is the first I've heard of SQLite JOIN performance being "weak". I just spent 10 minutes scouring the annals of the web and didn't turn up anything relevant. Are there any additional links or other information you can share about this claim?

Edit: @akira2501: SQLite comes with different tradeoffs, sometimes superior and other times less so, depending on the use case. Blanket statements without evidence are unproductive in progressing the conversation..

Weeellll... I think it's safe to say SQLite's planner is simple, and trends towards "you get what you wrote" rather than having more sophisticated runtime monitoring and adjusting.

But as with all things SQLite, the answer is generally "it's easy to predict so just improve your query" and/or "there's a plugin for that" so it tends to win in the end with a bit of care.

https://www.sqlite.org/queryplanner-ng.html

SQLite's planner is anything but simple. It has a beam search over different plans, basically, it approximates full NP-hard solution search process using bounded space and time.

Seriously?

https://www.sqlite.org/optoverview.html

Chapter 2. Chapter 7. Chapter 8.

It should be _zero_ surprise to you that SQLite is not as powerful as other full database query engines.

I don’t understand how reading that documentation page makes it clear that other relational engines have better performance for joins.
I'm with you. I've encountered numerous situations where the query planner caused issues. I don't know if the alternatives are better, but it's definitely an issue. I've written multiple databases wrapping SQLite, and multiple bindings to SQLite. There are dragons.
I had to use sqlite once for something that would do a ton of read queries on a particular table. Sometimes just swapping the order of operands to an and operation generated a completely different query plan. Had to scour their optimizer document and painstakingly check the query plans of, and then manually tune, each sql statement.

But then it was quite fast.

> will be updated into into a single-file b-tree structure

I'm not knowledgeable on this, but my understanding was a b-tree is a way of sorting values that could be ordered in a certain way. Like this would be a b-tree of IDs

```

            [8]

           /   \

      [3, 5]   [10, 12]

     / | \     / | \  

  [1] [4] [6,7] [9] [11, 13]
```

You traverse by comparing your needle to the root node and going left or right depending on the results.

How is that done with configuration options? That seems like it would just be a regular hashmap which is already efficient to read. What would a b-tree of key/values even look like that wouldn't be less efficient than a hashmap?

Each number in your btree would actually be a key-value pair. So you can find the key fast, and then you have the value.

Databases including SQLite usually use b+tree for tables (a variant where only the leaves have data, the interior nodes only have keys) and regular btrees for indexes.

A hash table makes sense in memory. If it's loaded just right for fast access, it has holes - empty entries. That makes little sense if you are building a file that will be transferred to many places over the internet. Bandwith waste would be significant.

So it might seem that simply enumerating the data (sorted or not) would be a better option for a file. (After all, the receiver will read everything anyway.) I guess frequent updates make this inefficient, so a tree helps.