ScyllaDB is Moving to a New Replication Algorithm: Tablets

Y	Hacker News new \| ask \| show \| jobs

	ScyllaDB is Moving to a New Replication Algorithm: Tablets (scylladb.com)
	107 points by carpintech 1077 days ago

7 comments

bbss 1077 days ago

Very similar to how BigTable[1] works under the hood which was built ~20 years ago.

[1] https://static.googleusercontent.com/media/research.google.c...

link

jeffbee 1077 days ago

The load shifting part is similar to the way BigTable splits, merges, and assigns tablets. But the rest of it is not related, because BigTable does not try to offer mutation consistency across replicas. If you write to one replica of a BigTable, your mutation may be read at some other replica, after an undefined delay. Applications that need stronger consistency features must layer their own replication scheme atop BigTable (such as Megastore).

What this post is describing for replication seems more comparable to Spanner.

link

axiak 1077 days ago

I don't understand this comment. Bigtable requires that each tablet is only assigned to one tablet server at a time, enforced in Chubby. There's no risk of inconsistent reads. Of course this means that there can be downtime when a tablet server goes down, until a replacement tablet server is ready to serve requests.

link

jeffbee 1077 days ago

Right, the contrast I was trying to draw was between what they depict, where multiple nodes are holding a replica of the tablet and performing synchronous replication between themselves, and what BigTable would do, which is to have the entire table copied elsewhere, with mutation log shipping. What they are doing is more analogous to how Spanner would do replication.

link

dikei 1076 days ago

Unless you're doing multi-cluster replication, there is no log shipping in BigTable: the data replication within a cluster is taken care of by the underlying filesystems.

Single-cluster BigTable is strongly consistent.

link

dikei 1076 days ago

There can be consistency issue if you run multi-cluster BigTable, because multi-cluster replication is asynchronous.

link

magden 1077 days ago

This is the right move for Scylla. Overall, looks similar to YugabyteDB that distirbutes data by sharding tables into tablets as well. The cluster monitors the cluster size (number of nodes) and the size of each tablet (data volume), and adds new tablets or re-shards large ones automatically:https://docs.yugabyte.com/preview/architecture/docdb-shardin...

link

dangoodmanUT 1077 days ago

"can i copy your homework" "yeah but change it a bit" (compared to the other comment that looks nearly identical, looks lik yugabyte sent some people in here)

link

jzelinskie 1077 days ago

This sounds a lot like ranges in CockroachDB. Anyone familiar with the deep details to highlight the differences?

link

Nican 1077 days ago

I thought of the same thing, so I am trying to find information on the documentation about things that CockroachDB does very well:

1. Consistent backups/transactions. When a backup is made, is that a single point in time, or best-effort by individual tablet. For example, backing up an inventory and orders table, the backup could have an older version of inventory, where orders have already been completed for some of them. It looks like Scylla backsup per node, so it could mean that data might have a slight time offset from one another.

2. Replicate reads. Like CockroachDB, it looks like Scylla will redirect the read to the range lead, but CRDB also provides the option to get stale reads from a non-lead. This is usually good for cross-region databases where reading off the lead can be big increases in latency.

I do not have a lot of time, but I am having a hard time finding much information about the architecture on Scylla's documentation. My personal guess is that Scylla optimized their code for performance, and less worry about data integrity.

link

acevedoorafael 1077 days ago

> My personal guess is that Scylla optimized their code for performance, and less worry about data integrity.

Definitely, but they are implementing Raft-based transactions which provide higher consistency. That should enable a higher variety of use cases.

link

carpintech 1077 days ago

Moving from Vnode-based replication to tablets to dynamically distribute data across the cluster

link

bsdnoob 1077 days ago

Tablets remind me of vitess

link

geenat 1077 days ago

Would be nice if the deployment story became a bit more like CockroachDB too.

link

eatonphil 1077 days ago

How so? Mind saying more?

link

aeyes 1077 days ago

I suppose that they are saying that CockroachDB is a single binary which you just drop on a machine and you are good to go. For ScyllaDB you need to install Java, Python and several ScyllaDB related packages.

link

hobofan 1077 days ago

Java? I thought the whole raison d'etre for ScyllaDB was "Cassandra without Java"?

link

taywrobel 1077 days ago

The server implementation is, but administering it still requires the Java based Cassandra tooling like nodetool and cqlsh

link

heipei 1077 days ago

cqlsh is written in Python. Which doesn't mean it's less of a pain in the ass ;)

link

heipei 1077 days ago

The Docker images for ScyllaDB work perfectly fine and ship with all administrative tools included.

link

tracker1 1077 days ago

I think you may be confusing ScyllaDB with Cassandra.

link

collinc777 1077 days ago

This sounds like partitions in DynamoDB

link