Hacker News new | ask | show | jobs
ScyllaDB is Moving to a New Replication Algorithm: Tablets (scylladb.com)
107 points by carpintech 1077 days ago
7 comments

Very similar to how BigTable[1] works under the hood which was built ~20 years ago.

[1] https://static.googleusercontent.com/media/research.google.c...

The load shifting part is similar to the way BigTable splits, merges, and assigns tablets. But the rest of it is not related, because BigTable does not try to offer mutation consistency across replicas. If you write to one replica of a BigTable, your mutation may be read at some other replica, after an undefined delay. Applications that need stronger consistency features must layer their own replication scheme atop BigTable (such as Megastore).

What this post is describing for replication seems more comparable to Spanner.

I don't understand this comment. Bigtable requires that each tablet is only assigned to one tablet server at a time, enforced in Chubby. There's no risk of inconsistent reads. Of course this means that there can be downtime when a tablet server goes down, until a replacement tablet server is ready to serve requests.
Right, the contrast I was trying to draw was between what they depict, where multiple nodes are holding a replica of the tablet and performing synchronous replication between themselves, and what BigTable would do, which is to have the entire table copied elsewhere, with mutation log shipping. What they are doing is more analogous to how Spanner would do replication.
Unless you're doing multi-cluster replication, there is no log shipping in BigTable: the data replication within a cluster is taken care of by the underlying filesystems.

Single-cluster BigTable is strongly consistent.

There can be consistency issue if you run multi-cluster BigTable, because multi-cluster replication is asynchronous.
This is the right move for Scylla. Overall, looks similar to YugabyteDB that distirbutes data by sharding tables into tablets as well. The cluster monitors the cluster size (number of nodes) and the size of each tablet (data volume), and adds new tablets or re-shards large ones automatically:https://docs.yugabyte.com/preview/architecture/docdb-shardin...
"can i copy your homework" "yeah but change it a bit" (compared to the other comment that looks nearly identical, looks lik yugabyte sent some people in here)
This sounds a lot like ranges in CockroachDB. Anyone familiar with the deep details to highlight the differences?
I thought of the same thing, so I am trying to find information on the documentation about things that CockroachDB does very well:

1. Consistent backups/transactions. When a backup is made, is that a single point in time, or best-effort by individual tablet. For example, backing up an inventory and orders table, the backup could have an older version of inventory, where orders have already been completed for some of them. It looks like Scylla backsup per node, so it could mean that data might have a slight time offset from one another.

2. Replicate reads. Like CockroachDB, it looks like Scylla will redirect the read to the range lead, but CRDB also provides the option to get stale reads from a non-lead. This is usually good for cross-region databases where reading off the lead can be big increases in latency.

I do not have a lot of time, but I am having a hard time finding much information about the architecture on Scylla's documentation. My personal guess is that Scylla optimized their code for performance, and less worry about data integrity.

> My personal guess is that Scylla optimized their code for performance, and less worry about data integrity.

Definitely, but they are implementing Raft-based transactions which provide higher consistency. That should enable a higher variety of use cases.

Moving from Vnode-based replication to tablets to dynamically distribute data across the cluster
Tablets remind me of vitess
Would be nice if the deployment story became a bit more like CockroachDB too.
How so? Mind saying more?
I suppose that they are saying that CockroachDB is a single binary which you just drop on a machine and you are good to go. For ScyllaDB you need to install Java, Python and several ScyllaDB related packages.
Java? I thought the whole raison d'etre for ScyllaDB was "Cassandra without Java"?
The server implementation is, but administering it still requires the Java based Cassandra tooling like nodetool and cqlsh
cqlsh is written in Python. Which doesn't mean it's less of a pain in the ass ;)
The Docker images for ScyllaDB work perfectly fine and ship with all administrative tools included.
I think you may be confusing ScyllaDB with Cassandra.
This sounds like partitions in DynamoDB