| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by paulryanrogers 1973 days ago
	Distributed systems are hard. Multi master is particularly sticky, especially if the data doesn't have natural boundaries. Once solved though horizontal is nice, if more involved to maintain.

1 comments

namibj 1973 days ago

CockroachDB is pretty good at encapsulating the complexity of multi-master.

You'll have to accept that transactions can fail due to conflicts, so if they are interactive, you'll have to retry manually.

Edit: I'd like hear criticism, instead of just seeing disapproval.

link

ddorian43 1973 days ago

(as a downvoter) Distributed transactions don't scale, they are NOT efficient. You can't co-partition data in Cockroachdb, so the only way is the slow way. Interleaved-data don't count.

link

namibj 1973 days ago

Are you suggesting it hits scaling boundaries earlier than that single-node postgres? The TPC-C benchmark suggests fairly strong scaling up to medium~high double digit node counts, and my understanding is that it's decently conflict prone and very much relying on distributed transactions.

Of course a distributed architecture costs efficiency, but as long as it still scales further (and after some point, cheaper) than single-node alternatives, the efficiency loss is tolerable.

You can affect partitioning, but forcing it requires the payed version.

link

ddorian43 1972 days ago

> Are you suggesting it hits scaling boundaries earlier than that single-node postgres?

This is nosql scenario all over again. You have to think in cost($)/query and your data layout.

> Of course a distributed architecture costs efficiency, but as long as it still scales further (and after some point, cheaper) than single-node alternatives, the efficiency loss is tolerable.

The efficiency loss is tolerable. The question is, when ? For primary-key get, efficiency-loss is, let's say, none. So you can start distributed from the start.

For another query, especially multi-tenant when you can put a tenant in 1 box, it may start making sense AFTER scaling to 2TB memory node.

Imagine a select query doing a join. It has to read 1GB from nvme-array or from network.

Imagine a write, ending up as a distributed-write, it needs to wait for 2x+ more servers in the network.

link