by hd4 3421 days ago
Sorry if this should be obvious but what is/are the killer feature/s of RethinkDB, what differentiates it from something like Redis or even CockroachDB?
4 comments

Only RethinkDB gets you a) working changefeeds, where you can receive real-time changefeed updates to your queries, b) a well-implemented and Jepsen-proven distributed database.

As far as I know, there is no other solution which gets both things right.

I use it in PartsBox (https://partsbox.io/), a solution for keeping track of electronic components.

I am surprised more people aren't interested in changefeeds — the way I see it, it's the only way to implement multi-user webapps which update in real-time (as in: a change is made in one session and all other open sessions get the update immediately).

Have to agree with pmalynin, there are other dbs that do this.

Couchbase Mobile has changes feed as well.

How are you defining/measuring "well-implemented"?

Not sure why there is no Jepsen test of Couchbase. I see many requests, and a closed issue with no discussion of why it was closed.

Well a) is provided by Mongo and is literally the reason why Meteor can do exactly what you described: multi-user webapps which update in real-time.
Sigh. Yes, the Mongo oplog and RethinkDB changefeeds are superficially similar. They are both for feeding changes, just like a paper airplane and a passenger jet are both for flying. And yet there is a world of difference.

Leaving aside reliability and ease of use, let's focus on correctness. RethinkDB lets me query a database, get initial data, and then get all the subsequent changes to that data. Notice there is no race condition there.

You can use this to implement systems where a user logs in, gets the initial data loaded, and then receives subsequent changes as they happen. Even if the same data is modified by someone else during this time (e.g. during the initial load), things will be processed correctly.

Comparing this to attaching a processor to a feed of all operations in the database doesn't make a lot of sense, because the oplog doesn't provide the same functionality.
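
To make the "no race condition" point concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not RethinkDB's API or implementation: `ToyTable`, `write` and `changes_with_initial` are hypothetical names invented for illustration. The only idea it tries to show is that the initial result and the subscription are established in one atomic step, so a write can never fall between them.

```python
import threading
from typing import Callable, Dict, List, Optional, Tuple

# (key, old_row or None, new_row or None), relative to the subscribed query
Change = Tuple[str, Optional[dict], Optional[dict]]


class ToyTable:
    """In-memory stand-in for a table. Hypothetical; NOT RethinkDB's API."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}
        self._subs: List[Tuple[Callable[[dict], bool], List[Change]]] = []
        self._lock = threading.Lock()

    def write(self, key: str, row: dict) -> None:
        with self._lock:
            old = self._rows.get(key)
            self._rows[key] = row
            # Notify only the subscribers whose query is affected by this write.
            for matches, feed in self._subs:
                old_in = old is not None and matches(old)
                new_in = matches(row)
                if old_in or new_in:
                    feed.append((key, old if old_in else None, row if new_in else None))

    def changes_with_initial(self, matches: Callable[[dict], bool]):
        # Snapshot and subscription happen under the same lock,
        # so no write can land between "initial data" and "changes".
        with self._lock:
            initial = {k: v for k, v in self._rows.items() if matches(v)}
            feed: List[Change] = []
            self._subs.append((matches, feed))
        return initial, feed


table = ToyTable()
table.write("r1", {"part": "resistor", "stock": 10})
table.write("c1", {"part": "capacitor", "stock": 0})

initial, feed = table.changes_with_initial(lambda row: row["stock"] > 0)
table.write("c1", {"part": "capacitor", "stock": 5})  # enters the query result
table.write("r1", {"part": "resistor", "stock": 0})   # leaves the query result
```

After this runs, `initial` holds only `r1`, and `feed` holds two entries: `c1` entering the result and `r1` leaving it. A real changefeed does this across a distributed cluster, which is the hard part this sketch skips entirely.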

The oplog replication provided by Mongo is a very different mechanism than RethinkDB's change feeds. This is a pretty good overview of the differences: https://www.compose.com/articles/rethinking-changes-how-two-...
Unfortunately, b) is critical to some, and Mongo did not fare well on it:

https://aphyr.com/posts/284-jepsen-mongodb

I couldn't find any documentation on change feeds for Mongo after a quick google search.

Could you post a link please?

While the oplog does provide some semblance of RethinkDB's changefeed, it's not nearly as powerful. With Rethink, you say you want a query, and Rethink will let you know about changes to that query. Mongo just says "hey, here are operations that were done", and leaves the reconstruction to you.

So I guess Mongo+Meteor match up with RethinkDB... sort of.
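
A rough illustration of what "leaves the reconstruction to you" means, again as a toy sketch in plain Python. The log format here (`"upsert"`/`"delete"` tuples) and the function name `replay_for_query` are made up for the example and are not MongoDB's actual oplog entry format. The point is only that a consumer of a raw operation log has to filter and track the query's result set on its own.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical log entry: (kind, key, row), where kind is "upsert" or "delete".
Op = Tuple[str, str, dict]


def replay_for_query(oplog: Iterable[Op], matches: Callable[[dict], bool]) -> Dict[str, dict]:
    """Rebuild one query's result set from a log of every operation."""
    result: Dict[str, dict] = {}
    for kind, key, row in oplog:
        if kind == "upsert" and matches(row):
            result[key] = row
        else:
            # A delete, or an update after which the row no longer matches.
            result.pop(key, None)
    return result


oplog = [
    ("upsert", "r1", {"part": "resistor", "stock": 10}),
    ("upsert", "c1", {"part": "capacitor", "stock": 0}),
    ("upsert", "c1", {"part": "capacitor", "stock": 5}),
    ("upsert", "r1", {"part": "resistor", "stock": 0}),
    ("delete", "c1", {}),
    ("upsert", "d1", {"part": "diode", "stock": 3}),
]

in_stock_rows = replay_for_query(oplog, lambda row: row["stock"] > 0)
```

Here `in_stock_rows` ends up containing only `d1`. With a query-scoped changefeed, the database does this bookkeeping and sends only the changes that affect the query.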

It's also worth noting that changefeeds are highly scalable: you can run tens of thousands of them on a single node, and scale them out linearly from there (even as they're scoped to specific queries.)

Obviously the performance characteristics will be impacted by the volume of changes that arrive to the database, but the architecture to support this is highly parallelized (all the way down to cores on the CPU.)

RethinkDB is just a great document-centric database. It has guarantees similar to SQL, including server-side joins, while having a great replication/redundancy pattern (similar to C). It's probably best in a use case where you are planning on 3-15 servers for a cluster. If you need more than that C may be better.

I like to think of it as MongoDB done right. Above and beyond better consistency models and a broader, better thought-out API, they have an admin interface that is second to none (well, SQL Management Studio might be slightly better). It's definitely better than any other "NoSQL" database.

A couple of years ago I considered it for a project, but at the time it was missing a required feature (geolocation indexes), so I wasn't able to use it. I followed the development of that feature, its prerequisites, and the automatic master failover, and the engineering discipline and planning were far better than in pretty much any project I'd been exposed to. The team(s) and their energies were not wasted, and I really appreciate what they have done.

I was sad to see the company shutter, but very happy to see the project under the Linux Foundation, and hope that it really takes off from here. It would be a pretty natural fit as an RDS service under Amazon, and there are a few hosted options. Horizon also looks interesting compared to Firebase.

Another advantage over competitors is that streaming updates are in the box, not bolted on to oplog processing.

What is that C you're talking about?
C* is short for Cassandra.
Why did people start doing that? I noticed all the Cassandra people at work started using C* at pretty much the same time, too, including signatures in e-mail. Was there a global "there's too many letters in Cassandra and C7a looks weird" memo to the entire Cassandra community? Drives me nuts for absolutely no reason I can think of.
Probably a few high profile devs started using it and the community started to emulate.
I'm honestly not sure, I thought it was a canonical abbreviation as a lot of the training docs I've seen use it.
The * characters in your original comment were interpreted as markdown emphasis markers, so they effectively got lost.
yeah, I noticed after I replied... thx.
This might be a useful read: https://rethinkdb.com/faq/

It explains RethinkDB's ideal use cases, explains how to compare it to other databases, and details some of the differentiating features.

RethinkDB has passed jepsen testing.

*Fixed typo

I think you might mean the Jepsen tests: https://aphyr.com/posts/329-jepsen-rethinkdb-2-1-5