| HN Mirror



	by xchaotic 3223 days ago
	Then there's the 0 model for sharding - don't shard, just replicate everything, everywhere, with eventual consistency and MVCC

3 comments

craigkerstiens 3223 days ago

Author of the post here. I couldn't agree more: if you don't have to shard, then don't. Scaling out using replicas is an option up to a certain scale, but you do have to ensure there are no long-running queries and make sure your replication is up to date. Those are additional things to manage, and at an intermediate stage they're very viable options; at a certain scale it all changes a bit. All that said, if you're at a small data level I wouldn't encourage sharding for the sake of it. If you know you're going to have to scale beyond a single node, or you're approaching limits where you're frequently scaling up, it's good to know your options and plan ahead.


AznHisoka 3223 days ago

Nobody voluntarily shards for their own enjoyment (unless they like pain). They shard because they need to in order to scale.


imtringued 3223 days ago

Sharding and master-to-master replication are two different things.

The tradeoff of master-to-master replication is that it limits you to CRDTs, append-only logs, and manual merging of conflicts by the end user. Alternatively, if data loss is an acceptable tradeoff, you can also use a last-write-wins strategy.
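To make the data-loss point concrete, here is a minimal sketch of a last-write-wins merge in Python. The `Versioned` type, the `lww_merge` function, and the use of wall-clock timestamps with a node id as tie-breaker are all assumptions for illustration, not how any particular database does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    """A value tagged with when and where it was written (hypothetical type)."""
    value: str
    timestamp: float  # assumed to be comparable across nodes
    node_id: str      # tie-breaker so both masters pick the same winner


def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Keep the write with the larger (timestamp, node_id); drop the other."""
    if (a.timestamp, a.node_id) >= (b.timestamp, b.node_id):
        return a
    return b


# Two masters accept conflicting writes to the same key while partitioned.
on_master_a = Versioned("email=old@example.com", timestamp=100.0, node_id="a")
on_master_b = Versioned("email=new@example.com", timestamp=100.5, node_id="b")

# After the partition heals, both sides converge on the later write.
# The write accepted by master "a" is silently discarded: that is the data loss.
merged = lww_merge(on_master_a, on_master_b)
```

The merge gives the same result regardless of argument order, which is what lets both masters converge without coordinating, at the cost of throwing away one of the two writes.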

Sharding is basically having one isolated "database" per X (user, location, etc.), but the tradeoff is that you can't have transactions across two databases.
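As a rough sketch of the "one isolated database per X" idea, the Python below routes each user to one of several in-memory SQLite databases by hashing the user id. `NUM_SHARDS`, `shard_for`, `set_balance`, `get_balance`, and the `balances` table are hypothetical names chosen for the example; hash-modulo routing is just one simple placement scheme.

```python
import hashlib
import sqlite3

NUM_SHARDS = 4  # hypothetical shard count

# One fully independent database per shard.
shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
for db in shards:
    db.execute("CREATE TABLE balances (user_id TEXT PRIMARY KEY, amount INTEGER)")


def shard_for(user_id: str) -> int:
    """Map a user id to a shard index with a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def set_balance(user_id: str, amount: int) -> None:
    db = shards[shard_for(user_id)]
    # The transaction covers this one shard only; there is no way to
    # atomically update rows living on two different shards here.
    with db:
        db.execute(
            "INSERT OR REPLACE INTO balances (user_id, amount) VALUES (?, ?)",
            (user_id, amount),
        )


def get_balance(user_id: str):
    row = shards[shard_for(user_id)].execute(
        "SELECT amount FROM balances WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None
```

A transfer between two users who hash to different shards would need two separate transactions (or an external coordination protocol), which is the limitation described above.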

Document databases usually do both. Each document is its own tiny database with atomic updates, which can then be distributed over the cluster, and they support multi-master replication for availability/automatic failover.


adrianmonk 3222 days ago

Or you can do multi-master replication, where you write to all replicas synchronously (and deal with downtime of replicas by continuing if you manage to write to a majority of them).
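A minimal in-process sketch of that majority rule, in Python. The `Replica` class and `quorum_write` function are hypothetical stand-ins for real network calls, and the sketch deliberately ignores what happens to partial writes when the quorum is not reached.

```python
class Replica:
    """Toy replica; `up=False` simulates a node that is unreachable."""

    def __init__(self, name: str, up: bool = True):
        self.name = name
        self.up = up
        self.data = {}

    def write(self, key: str, value: str) -> None:
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value


def quorum_write(replicas, key: str, value: str) -> bool:
    """Try every replica; succeed only if a strict majority acknowledged."""
    acks = 0
    for replica in replicas:
        try:
            replica.write(key, value)
            acks += 1
        except ConnectionError:
            continue  # tolerate a down replica and keep going
    # Note: on failure, replicas that did accept the write are not rolled back.
    return acks > len(replicas) // 2


# With one of three replicas down, the write still reaches a majority.
cluster = [Replica("r1"), Replica("r2"), Replica("r3", up=False)]
ok = quorum_write(cluster, "user:1", "alice")
```

With two of three replicas down, the same call returns `False`, so the client knows the write was not durably accepted by a majority.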
