Hacker News
SableDb – Fast, persistent database supporting Redis API (github.com)
114 points by SableDb 809 days ago
6 comments

How does it compare to Kvrocks, which use RocksDB as the storage backend too?

https://github.com/apache/kvrocks/

Also, it's worth mentioning that kvrocks is more mature and supports many more commands than SableDb currently does.
It performs better and uses different design choices (for example: SableDb uses tokio's local task per connection, and in general it uses green threads to make the code more readable and easy to maintain).

I will release some design documents later on (hopefully this month). Remember that this is a one-man project (hopefully, not for long), so it takes time to organize everything :)

I did some rookie testing of KVRocks and SableDB using redis-benchmark:

KVRocks

  PING_INLINE: 171821.30 requests per second, p50=0.183 msec
  PING_MBULK: 173310.22 requests per second, p50=0.191 msec
  SET: 115074.80 requests per second, p50=0.399 msec
  GET: 163398.70 requests per second, p50=0.271 msec
  INCR: 110741.97 requests per second, p50=0.415 msec
  LPUSH: 89847.26 requests per second, p50=0.487 msec
  RPUSH: 94428.70 requests per second, p50=0.487 msec
  LPOP: 86880.97 requests per second, p50=0.535 msec
  RPOP: 88339.23 requests per second, p50=0.527 msec
SableDB

  PING_INLINE: 90744.10 requests per second, p50=0.279 msec
  PING_MBULK: 90826.52 requests per second, p50=0.279 msec
  SET: 85763.29 requests per second, p50=0.311 msec
  GET: 87336.24 requests per second, p50=0.295 msec
  INCR: 68775.79 requests per second, p50=0.663 msec
  LPUSH: 36589.83 requests per second, p50=1.031 msec
  RPUSH: 38299.50 requests per second, p50=1.135 msec
  LPOP: 38051.75 requests per second, p50=1.191 msec
  RPOP: 37383.18 requests per second, p50=1.143 msec
KVRocks seems faster, but this is certainly not a bad start.
It would be worth sharing the build configuration (e.g. did you make sure to build `sabledb` in release mode?) and the thread configuration.
How well does raw Redis and/or raw RocksDB perform on your machine?
I like the idea of doing thread-local execution of Tokio tasks; I assume that means SableDb is mostly single-threaded? Was this to reduce complexity, or for some other reason? I'm looking forward to the design doc on this!
It is multi-threaded and configurable: you can set a specific number of threads in the configuration file, or use the magic value 0, in which case SableDb picks the number of cores divided by 2.
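A minimal sketch of the thread-count rule described above. The function names are hypothetical, and clamping the result to at least one worker on a single-core machine is an assumption of this sketch, not something the comment states.

```rust
use std::thread;

/// Resolve the number of worker threads from a configured value.
/// `configured == 0` is the "magic value": use half the available cores.
/// Clamping to at least 1 is an assumption of this sketch.
fn resolve_worker_count(configured: usize, available_cores: usize) -> usize {
    if configured == 0 {
        (available_cores / 2).max(1)
    } else {
        configured
    }
}

/// Ask the OS how many cores are usable, falling back to 1 if it cannot say.
fn detected_cores() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}
```

With this rule, `resolve_worker_count(0, detected_cores())` would yield 8 workers on a 16-core machine, while any non-zero setting is taken as-is.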

Each incoming connection is assigned to a worker thread, and two tokio tasks are created for the connection (one for reading and another for writing).

Using tokio allowed me to write `async` code and avoid "callback hell", so the code looks clean and is readable at a glance without the need to follow callbacks.
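To picture the reader/writer split per connection, here is a rough sketch that uses only the standard library. Plain OS threads and an `mpsc` channel stand in for the two tokio tasks, so this is not how SableDb is written; `serve_connection` and the reply strings are hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Handle one "connection": a reader thread turns requests into replies
/// and hands them to a writer thread over a channel.
/// Plain threads stand in for the two tokio tasks described above.
fn serve_connection(input: Vec<String>) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<String>();

    // Reader side: consume requests, compute a reply, queue it for the writer.
    let reader = thread::spawn(move || {
        for line in input {
            let reply = match line.trim() {
                "PING" => "+PONG".to_string(),
                other => format!("-ERR unknown command '{}'", other),
            };
            // If the writer has gone away, stop reading.
            if tx.send(reply).is_err() {
                break;
            }
        }
        // `tx` is dropped here, which closes the channel and lets the writer finish.
    });

    // Writer side: drain queued replies in the order they were produced.
    let writer = thread::spawn(move || rx.iter().collect::<Vec<String>>());

    reader.join().expect("reader panicked");
    writer.join().expect("writer panicked")
}
```

The point of the split is that reading and writing each become a straight-line loop; with async tasks the same shape is kept without dedicating an OS thread to each side.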

Hi SableDb. I am looking for a tech cofounder in databases. Probably not the best place to ask for a cofounder. :-) Regardless, would you be interested?
You might as well post it to the discussion of this article about why you won't find a technical co-founder. <https://news.ycombinator.com/item?id=39902372>
I’m potentially interested in a cofounder for my DB. Can you ping me on gmail to connect (username in profile)?
Looks promising, but needs support for more than just strings and lists. I personally use hashes, sorted-sets, and sets more than lists in production apps, and probably others too?

https://github.com/sabledb-io/sabledb/issues/7

Absolutely, adding more commands is my goal. Completing full-sync replication is my first priority (I have currently implemented WAL tailing from primary -> replica), but tailing from a snapshot is the ideal solution here IMO.

Once this is in place, the "hash" commands (hset, hget etc.) are the next family to add. I open sourced it hoping to get help from people out there :)
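For readers unfamiliar with WAL tailing, the idea can be sketched as a replica that remembers the last sequence number it applied and asks the primary for everything after it. This is an in-memory toy under assumed names (`Primary`, `Replica`, `changes_since`, `tail`); it does not reflect SableDb's actual replication code or wire format.

```rust
use std::collections::HashMap;

/// One logged write, tagged with a monotonically increasing sequence number.
#[derive(Clone, Debug, PartialEq)]
struct LogEntry {
    seq: u64,
    key: String,
    value: String,
}

/// Hypothetical primary: keeps an in-memory log of every write.
#[derive(Default)]
struct Primary {
    log: Vec<LogEntry>,
}

impl Primary {
    /// Record a write and assign it the next sequence number (starting at 1).
    fn set(&mut self, key: &str, value: &str) {
        let seq = self.log.len() as u64 + 1;
        self.log.push(LogEntry {
            seq,
            key: key.to_string(),
            value: value.to_string(),
        });
    }

    /// Return all entries with a sequence number greater than `after`.
    fn changes_since(&self, after: u64) -> Vec<LogEntry> {
        self.log.iter().filter(|e| e.seq > after).cloned().collect()
    }
}

/// Hypothetical replica: remembers the last sequence number it applied.
#[derive(Default)]
struct Replica {
    last_applied: u64,
    data: HashMap<String, String>,
}

impl Replica {
    /// Pull and apply everything the primary logged since the last sync.
    /// Returns how many entries were applied.
    fn tail(&mut self, primary: &Primary) -> usize {
        let changes = primary.changes_since(self.last_applied);
        let applied = changes.len();
        for entry in changes {
            self.data.insert(entry.key, entry.value);
            self.last_applied = entry.seq;
        }
        applied
    }
}
```

In this toy a brand-new replica has to replay the whole log from sequence 0, which hints at why a snapshot-based full sync is attractive once the log grows large.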

It would be informative to compare to memory only Redis and persistent Redis on the same hardware with the same benchmark suite. Even if SableDb is slower since it’s durably persisted, it would still be useful to consider the tradeoff versus ephemeral or weakly persisted implementation of the same API.
I don’t see SableDB mentioned in your link, and my comment is specifically about comparing using the same benchmark. Comparing across benchmarks is usually foolish since there’s so many factors that are different and performance scales non-linearly over the factors. For example the Garnet benchmark uses machines with 72 Azure vCPUs, the SableDB one uses AWS machines with 16 vCPUs.

Besides, I don’t expect SableDB to be faster than weakly persisted systems like Redis or I guess Garnet. The cool thing about SableDB is that it (looks?) durable - although the docs don’t make specific promises, they do mention some things in passing that imply RocksDb transactions. Each command seems to flush its changes to durable storage before succeeding. That’s very different from Redis et al even with their “append only file” / WAL turned on - the AOF is written asynchronously so you will still lose data on crash. Redis also deletes random data when under memory/storage pressure.

Again, I want to understand trade-offs not “find the fastest web scale database”. I’m sure /dev/null is faster than Garnet.
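The durability trade-off discussed in this comment comes down to when a write is acknowledged. A minimal sketch using only the Rust standard library, with hypothetical function names; it illustrates the general sync-before-ack versus buffered distinction and makes no claim about how SableDb, RocksDB, or Redis implement their write paths.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append a record and return only after the OS reports the data
/// has been flushed to the storage device.
fn append_durable(path: &Path, record: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(record)?;
    // Block until the file's data has been synced to the device.
    file.sync_data()
}

/// Append a record and return as soon as it has been handed to the OS.
/// A power loss or kernel crash before write-back can lose the record.
fn append_buffered(path: &Path, record: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(record)
}
```

The first variant pays a device sync on every call, which is the kind of cost a same-hardware benchmark against memory-only and persistent Redis configurations would make visible.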

most self reports are false. there should be 3rd party eval.
Written in Rust™
So it's a Redis API written in Rust, but the underlying database is all C++. It seems like a nice project, but perhaps a little misleading to say it's written in Rust.
If anyone is interested in a Rust-based on-disk KV store, I've come across sled[0] a few times, seems interesting. The author's also built a lot of other cool concurrency primitives for Rust as well.

[0] https://github.com/spacejam/sled

See my previous comment. Sled was considered, and it was the main KV storage in early implementation.

I kept the adapter approach in the code, so switching back to sled should be pretty easy (or even converting it to full in-memory)
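The adapter approach mentioned here can be sketched as a small trait that command code depends on, with the concrete engine hidden behind it. The trait, its methods, and `MemoryStorage` are illustrative names, not SableDb's real interface; the in-memory map simply stands in for RocksDB or sled.

```rust
use std::collections::BTreeMap;

/// Hypothetical storage interface; the names are illustrative, not SableDb's.
trait StorageAdapter {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn delete(&mut self, key: &[u8]);
}

/// Purely in-memory backend, standing in for an on-disk engine.
#[derive(Default)]
struct MemoryStorage {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl StorageAdapter for MemoryStorage {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }

    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), value.to_vec());
    }

    fn delete(&mut self, key: &[u8]) {
        self.map.remove(key);
    }
}

/// Command logic depends only on the trait, so the backend can be swapped.
/// A missing or non-numeric value is treated as 0 in this sketch.
fn incr(store: &mut dyn StorageAdapter, key: &[u8]) -> i64 {
    let current = store
        .get(key)
        .and_then(|v| String::from_utf8(v).ok())
        .and_then(|s| s.parse::<i64>().ok())
        .unwrap_or(0);
    let next = current + 1;
    store.put(key, next.to_string().as_bytes());
    next
}
```

Because `incr` only sees `&mut dyn StorageAdapter`, moving between engines would mean writing another `impl StorageAdapter` rather than touching the command code.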

The storage itself uses the rust binding for RocksDb

However, I did have a branch (in another repo) that used other storage engines, some written purely in Rust, like "dash" (which is fully in-memory), "sled" and "speedb". I eventually decided to stick with RocksDb since it's mature and maintained by some giant companies like Meta.

The user code (the one I wrote) is all Rust.

One could also argue that the `bytes` crate, which is heavily used in the SableDb code base, contains plenty of `unsafe` code. Does this make it less "Rust"?

    unsafe fn main() {
        ...
    }
RocksDB is written in C++ and runs in-process. That's just like having a load of unsafe Rust code.
So do OpenSSL, the `bytes` crate, WinAPI and many other crates used by many Rust applications. Does this make applications written in Rust less "Rusty"?

IMO, the networking, threading and "tasks" ("green threads") in the SableDb code base are the riskiest parts of writing a server, and by choosing Rust, the risk of memory issues is reduced to a minimum without sacrificing performance.

imho LSM (which is what RocksDB is) is not the optimal storage manager for this application. Should be a B-tree-like thing.
What do you mean by “this application”? There’s no application here, it’s just a database, right?

Or do you mean that LSMs shouldn’t be a foundation for a database?

LSMs are write-optimized; most deployments of Redis are read-heavy.
SableDB - Written in Rust, well kind of: RocksDB engine is written entirely in C++