Hacker News new | ask | show | jobs
by namibj 1973 days ago
Are you suggesting it hits scaling boundaries earlier than that single-node postgres? The TPC-C benchmark suggests fairly strong scaling up to medium~high double digit node counts, and my understanding is that it's decently conflict prone and very much relying on distributed transactions.

Of course a distributed architecture costs efficiency, but as long as it still scales further (and after some point, cheaper) than single-node alternatives, the efficiency loss is tolerable.

You can affect partitioning, but forcing it requires the payed version.

1 comments

> Are you suggesting it hits scaling boundaries earlier than that single-node postgres?

This is nosql scenario all over again. You have to think in cost($)/query and your data layout.

> Of course a distributed architecture costs efficiency, but as long as it still scales further (and after some point, cheaper) than single-node alternatives, the efficiency loss is tolerable.

The efficiency loss is tolerable. The question is, when ? For primary-key get, efficiency-loss is, let's say, none. So you can start distributed from the start.

For another query, especially multi-tenant when you can put a tenant in 1 box, it may start making sense AFTER scaling to 2TB memory node.

Imagine a select query doing a join. It has to read 1GB from nvme-array or from network.

Imagine a write, ending up as a distributed-write, it needs to wait for 2x+ more servers in the network.