by atonse 1884 days ago
This is totally going to be a Hacker News Bingo type of question.

But has anyone tried to do a clean room implementation of Redis in Rust that speaks the same wire protocol? You would get the zero-cost multi-threading, memory safety, etc., and it would be a drop-in replacement.
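
For context on what "speaks the same wire protocol" involves: Redis clients send each command as a RESP array of bulk strings. Below is a minimal sketch of the request encoding only, assuming plain `std` and a hypothetical helper name `encode_command`; a real drop-in replacement would also need to parse requests and produce every reply type.

```rust
/// Encode a command as a RESP array of bulk strings,
/// the form Redis clients use to send requests.
/// `encode_command` is a hypothetical name used for illustration.
fn encode_command(parts: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    // Array header: '*' followed by the number of elements, then CRLF.
    out.extend_from_slice(format!("*{}\r\n", parts.len()).as_bytes());
    for part in parts {
        // Bulk string: '$' plus the byte length, CRLF, the payload, CRLF.
        out.extend_from_slice(format!("${}\r\n", part.len()).as_bytes());
        out.extend_from_slice(part.as_bytes());
        out.extend_from_slice(b"\r\n");
    }
    out
}

fn main() {
    let wire = encode_command(&["GET", "mykey"]);
    assert_eq!(wire, b"*2\r\n$3\r\nGET\r\n$5\r\nmykey\r\n".to_vec());
}
```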

5 comments

> zero-cost multi-threading

I think you mean zero-cost abstractions, which usually aren't zero cost, just zero additional cost over doing it yourself.

There's no such thing as zero cost multi threading. Just tradeoffs. Rust actually doesn't help with performance here (it gets in the way often) but it definitely does help with correctness - which is truly hard with multi threaded programs.
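
A small sketch of that tradeoff, using only the standard library (the function name `parallel_count` is made up for illustration). The compiler rejects a version where the threads mutate a plain `usize` directly, which is the correctness help; the `Mutex` that makes it compile is a runtime cost, not a free one.

```rust
use std::sync::Mutex;
use std::thread;

/// Increment one shared counter from several threads.
fn parallel_count(threads: usize, per_thread: usize) -> usize {
    let counter = Mutex::new(0usize);
    // Scoped threads may borrow `counter`; the scope joins them all
    // before it returns, so the borrow cannot outlive the data.
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    // Every increment takes the lock: safe, but not free.
                    *counter.lock().unwrap() += 1;
                }
            });
        }
    });
    counter.into_inner().unwrap()
}

fn main() {
    assert_eq!(parallel_count(4, 1000), 4000);
}
```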

I mis-worded. I meant safe multi-threading using Rust’s abstractions that make this a lot easier and guaranteed, and zero cost meaning no overhead. Is that not the case? Maybe not the case with threading.
> You would get the zero-cost multi-threading,

You kinda have to look at how things really work underneath before you can apply buzzwords to a database.

I mis-worded. I meant safe multi-threading using Rust’s abstractions that make this a lot easier and guaranteed. Is that not the case?
The Tokio async runtime for Rust has a tutorial in its user guide (https://tokio.rs/tokio/tutorial) on writing a mini-redis (https://github.com/tokio-rs/mini-redis).
I've done a C clean room version, and I will say that the networking part is as important as multi-threading the data structures: https://github.com/raitechnology/raids/.

If you go to the landing page of the above and scroll down to the bottom, there is a TCP bypass solution graphed, using Solarflare Open Onload, and it is capable of running several times as fast as the Linux kernel TCP. I didn't test Redis with Open Onload, but I'm pretty sure you'd get similar results, since TCP is a major performance bottleneck in Redis as well.

Does clean room in this case mean you didn't look at the Redis source? Is there some licensing condition where this is required?
No, I don't believe that's what it means in this case. I took the usage to mean that Rust has a memory model different enough from C to require a redesign.

I looked at many implementations of Redis and read many KV papers. My reason for a redesign was similar to a "clean room Rust" reason: I wanted a shared-memory model that was independent of the protocol (Redis RESP in this case), allowing multiple processes with different types of protocol services to attach to it.

"Rewrite", then.
Yeah. "Refactor" may be more accurate.
It's been a long time since I've looked at KeyDB, but IIRC KeyDB is just Redis plus a spinlock. It's actually still very performant. There are other "toy" reimplementations of Redis in Rust that take the same approach and aren't even as performant as single threaded Redis.
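
To make the "one big lock" approach concrete, here is a toy sketch (not KeyDB's or any real project's code). It assumes a `std::sync::Mutex` as a stand-in for the spinlock, and the `Store` type is hypothetical. Every command from every thread serializes on the same lock, which is why this design does not automatically beat a single-threaded server.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// A keyspace guarded by one global lock.
/// The Mutex stands in for a spinlock; `Store` is a made-up type.
#[derive(Clone, Default)]
struct Store {
    inner: Arc<Mutex<HashMap<String, String>>>,
}

impl Store {
    fn set(&self, key: &str, value: &str) {
        // All writers contend on the same lock.
        self.inner
            .lock()
            .unwrap()
            .insert(key.to_string(), value.to_string());
    }

    fn get(&self, key: &str) -> Option<String> {
        // Readers take the same lock as writers.
        self.inner.lock().unwrap().get(key).cloned()
    }
}

fn main() {
    let store = Store::default();
    // Four worker threads, each writing one key through a cloned handle.
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let store = store.clone();
            thread::spawn(move || store.set(&format!("key{i}"), &format!("value{i}")))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    assert_eq!(store.get("key2"), Some("value2".to_string()));
}
```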

The next approach you could take is using something like Glommio and taking a thread-per-core approach to Redis. I think that approach has a lot of potential, but the design becomes more complex (you now need something like distributed transactions for "cross-core" Redis transactions and multi-gets).
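
A rough sketch of the sharded idea, with loud assumptions: it uses plain OS threads and `std::sync::mpsc` channels rather than Glommio, does no core pinning, and all names (`Shards`, `Request`, `mget`) are made up for illustration. Each shard thread owns its map outright, so no lock is needed, but `mget` has to visit several shards and its result is not an atomic snapshot, which is the extra complexity mentioned above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Messages a shard thread understands (hypothetical protocol).
enum Request {
    Set(String, String),
    Get(String, Sender<Option<String>>),
}

/// Handle that routes each key to the thread owning its shard.
struct Shards {
    senders: Vec<Sender<Request>>,
}

impl Shards {
    fn new(n: usize) -> Self {
        let senders = (0..n)
            .map(|_| {
                let (tx, rx) = channel::<Request>();
                thread::spawn(move || {
                    // This map is owned by exactly one thread: no lock.
                    let mut map: HashMap<String, String> = HashMap::new();
                    for req in rx {
                        match req {
                            Request::Set(k, v) => {
                                map.insert(k, v);
                            }
                            Request::Get(k, reply) => {
                                let _ = reply.send(map.get(&k).cloned());
                            }
                        }
                    }
                });
                tx
            })
            .collect();
        Shards { senders }
    }

    /// Pick the owning shard by hashing the key.
    fn shard_for(&self, key: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() as usize) % self.senders.len()
    }

    fn set(&self, key: &str, value: &str) {
        let req = Request::Set(key.to_string(), value.to_string());
        self.senders[self.shard_for(key)].send(req).unwrap();
    }

    fn get(&self, key: &str) -> Option<String> {
        let (tx, rx) = channel();
        let req = Request::Get(key.to_string(), tx);
        self.senders[self.shard_for(key)].send(req).unwrap();
        rx.recv().unwrap()
    }

    /// Cross-shard read: each key is fetched independently, so the
    /// result is NOT an atomic snapshot across shards.
    fn mget(&self, keys: &[&str]) -> Vec<Option<String>> {
        keys.iter().map(|k| self.get(k)).collect()
    }
}

fn main() {
    let db = Shards::new(4);
    db.set("a", "1");
    db.set("b", "2");
    let got = db.mget(&["a", "b", "missing"]);
    assert_eq!(got, vec![Some("1".to_string()), Some("2".to_string()), None]);
}
```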

RonDB (NDB Cluster) takes a different approach to threading and claims it's faster than scylladb-style sharding http://mikaelronstrom.blogspot.com/2021/03/designing-thread-...
"ScyllaDB also handles complex query processing in the same engine. RonDB does all complex query processing in MySQL in a separate program." — i.e., as soon as things might get a bit hairy, RonDB punts it off-system. While a clever way to keep performance numbers high, you'd have to also then analyze the MySQL latencies and throughputs to get an overall view of complete system performance.

It's a clever dodge, but it's not a magical way to eliminate the need for those pesky CPU cycle times.

Look a little closer, Peter. I doubt you could have arrived at the right conclusion that fast.

It's not a dodge, the MySQL process is in the same server and can even use shared memory.

He claims it is better in total.