Hacker News new | ask | show | jobs
by macspoofing 633 days ago
>While Redis is "fast" in comparison to traditional RDBMS, it's still a database that you have to manage connections, memory, processes, etc., which introduces more brittleness into the stack (the opposite of what we're trying to achieve).

Every database, Relational or Nonrelational, requires approximately the same level of management and maintenance when you start dealing with non-toy levels of transactions.

The "Fast" part is a little funny. If you don't care about joins, then row inserts and retrievals are pretty damn fast too =)

3 comments

> The "Fast" part is a little funny. If you don't care about joins, then row inserts and retrievals are pretty damn fast too =)

What makes SQLite exceptionally fast in a server environment is that you do not require a network call to do the query or even retrieve the data. Your remarks about joins and transactions are meaningless once you understand you're just reading stuff from your very own local HD, which is already orders of magnitude faster.

You can connect to Redis over a local socket, no network overhead.

Modern cloud architecture hates local sockets and wants to push everything to the network, but if you care about performance and are using Redis, that’s how you’d favor deploying & connecting to it.

> You can connect to Redis over a local socket, no network overhead.

I think at some point we need to stop and think about the problem, and if an idea makes sense.

Even Redis has a disclaimer in their docs on how SQLite is faster due to the way it does not require network calls. The docs also explain how it's an apples-to-oranges comparison.

https://redis.io/docs/latest/operate/oss_and_stack/managemen...

(Nit: redis incurs at least IPC overhead, not necessarily network, which would be a ton worse—but if you’re not using Redis for IPC then I don’t know why you’re using it, so IPC overhead goes with the territory)

Redis is best for “smart” shared-memory-type resources. Queues that manage themselves, cache or locks or sessions with built-in expiration, that kind of thing. Shared smart logic for heterogenous same-host clients via extensions.

Remote Redis has never made much sense to me. Clustered-mode, sure, potentially, but with remote clients as the primary method of connection? No. It’s a really good (excellent, even) product but its sweet-spot is pretty narrow (even if the need is common) and basically nonexistent in now-typical container-heavy no-two-resources-are-required-to-be-on-the-same-actual-hardware situations.

SQLite has its vacuum operation, which is kind-of like running a garbage collection. Every time I read the docs about when to run a vacuum, I end up confused.

The last time I shipped an application on SQLite, I ended up just using a counter and vacuuming after a large number of write operations.

HashBackup author here, been using SQLite for about 15 years.

Doing a vacuum after a large number of deletes might make sense. The only real purpose of vacuum IMO is to recover free space from a database. Vacuum may also optimize certain access patterns for a short while, though I have never tested this, and it would be highly dependent on the queries used. If fragmentation is a bigger concern for you than recovering free space, you can also compute the fragmentation to decide whether to vacuum by using the dbstat table:

https://www.sqlite.org/dbstat.html

Then again, computing this will require accessing most of the database pages I'm guessing, so might take nearly as long as a vacuum. The other gotcha here is that just because db pages appear to be sequential in a file doesn't mean they are sequential on a physical drive, though filesystems do strive for that.

SQLite has pragma commands to tell you the number of total and free db pages. When the percentage of free pages is greater than x% and it's a convenient time, do a vacuum. For a highly volatile db, you can add a table containing this percentage, update it every day, and make your decision based on an average, but IMO it's easier just to check for more than 50% free (or whatever) and do the vacuum.

Vacuums used to be (circa 2019) pretty slow operations, but the SQLite team has sped them up greatly since then. Vacuuming a 3GB SQLite db on a SSD takes less than a minute these days. That's with the db 100% full; with only 50% used pages, it would be considerably faster.

Vacuums are done in a statement transaction, so you don't have to worry about a "half vacuum that runs out of disk space" screwing up your database.

> Every database, Relational or Nonrelational, requires approximately the same level of management and maintenance

I disagree with this statement. Surely there is a difference on the effort for "management and maintenance" if your database requires the operation of independent server processes.

Going to extreme examples, do you really believe it makes no difference whether you use SQLite or Oracle?