| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by bonesmoses 2316 days ago

Right tool for the right job. Users in Tokyo won't be editing accounts for users in Austrailia. There's often a natural partition there where you can leverage the data locality with an eventual consistency basis safely.

Conflict-free Replicated Data Types (CRDTs) and column-level conflict resolution also largely address the issue with data merges and coarse counter systems. I'll just ignore the general design flaw there, considering these should be an insert-only ledger system rather than counters, because bad or naive implementations abound. I get that the application layer shouldn't necessarily have to coddle the database, given absolute ACID would resist non-ideal usage patterns.

I haven't played with the others, but I set up an 8-node CockroachDB and managed to get pgbench to run on it. I was clearly doing something wrong, because I wasn't impressed at the throughput even with very high client counts. From what I could see there, I was clearly being throttled by some mix of RTT and locking contention between the 3-node default consensus group. Supposing this is not problematic in non-overlapping shard locality, you still hit a hard performance ceiling when multiple clients interact with the same pool fragment.

I'll have to look at the others now.

1 comments

ahachete 2316 days ago

> Right tool for the right job. Users in Tokyo won't be editing accounts for users in Austrailia. There's often a natural partition [...]

Exactly. That's why I'd say for tools that support multi-master to provide and enhance (enforce) support for this use case and avoid (don't support) the case where data durability and consistency are lost.

> Conflict-free Replicated Data Types (CRDTs) [...]

Yes, that's another solution for the problem... but I don't see them provided by BDR or Pgcat.

> I'll just ignore the general design flaw there, considering these should be an insert-only ledger system rather than counters [...]

It's not that I advocate for that system, but many users may choose this implementation. That you say "hey, you need to do all this and implement this best practices (and pass the Jepsen test) just to be sure you don't screw up data consistency and durability... it's putting a heavy burden in the user. Databases, if anything, are abstracting users away from complex issues with data and providing solid primitives to work on them. Again: Postgres is known as a very well trusted data store with given ACID properties. Conflict resolution breaks away with this.

> I haven't played with the others, but I set up an 8-node CockroachDB and managed to get pgbench to run on it. I was clearly doing something wrong, because I wasn't impressed at the throughput even with very high client counts [...]

I haven't benchmarked it myself, but I heard stories of low performance. Actually, I always believed that the internal key-value store that they use would never scale to represent table workloads. But give a shot to the other solutions. Spanner, for one, is known to process millions of transactions per second, with global (planet scale) replication (of course, with notable lag, but total consistency). That's pretty cool ;)

link

irfansharif 2316 days ago

> Actually, I always believed that the internal key-value store that they use would never scale to represent table workloads.

Care to elaborate here? As someone working at that layer of the system, our RocksDB usage is but a blip in any execution trace (as it should be, any network overhead you have given it's a distributed system would dominate single-node key-value perf). That aside, plenty of RDBMS systems are designed such that they sit atop internal key-value stores. See MySQL+InnoDB[0], or MySQL+RocksDB[1] used at facebook.

[1]: https://en.wikipedia.org/wiki/InnoDB

[0]: https://myrocks.io/

link

ahachete 2316 days ago

Don't get me wrong, both RocksDB and the work done by CCDB is pretty cool.

Yet I still believe that layering a row model as the V of a K-V introduces by definition inefficiencies when accessing columnar data in a way that row stores do, as compared to a pure row storage. Is not that it can't work, but that I believe it can never be as efficient as a more row-oriented storage (say like Postgres).

link

irfansharif 2316 days ago

I have no idea what you're saying. What's a "row-oriented storage" if not storing all the column values of a row in sequence, in contrast to storing all the values in a column across the table in sequence (aka "column store"). What does the fact that it's exposed behind a KV interface have to do with anything? What's "more" about Postgres' "row-orientedness" compare to MySQL?

In case you didn't know, a row [idA, valA, valB, valC] is not stored as [idA: [valA, valB, valC]]. It's more [id/colA: valA, id/colB: valB, id/colC: valC] (modulo caveats around what we call column families[0], where you can have it be more like option (a) if you want). My explanation here is pretty bad, but [1][2] go into more details.

[0]: https://www.cockroachlabs.com/docs/stable/column-families.ht...

[1]: https://www.cockroachlabs.com/blog/sql-in-cockroachdb-mappin...

[2]: https://github.com/cockroachdb/cockroach/blob/master/docs/te...

link

ahachete 2316 days ago

I know well. I have read most CCDB posts and architecture documentation. Pretty good job.

There are several ways to map a row to K-V stores, and different databases have chosen different approaches, I'm not referring specifically to CCDB's.

Whether you do [idA: [valA, valB, valC]] or [id/colA: valA, id/colB: valB, id/colC: valC], what I say is that I believe it is less efficient than [idA, valA, valB, valC], which actually also supports more clearly the case of compound keys (aka [idA, idB, idC, valA, ....]). Both are the ways Postgres stores rows.

link

awoods187 2316 days ago

PM @CockroachDB here. Have you seen our performance page? https://www.cockroachlabs.com/docs/dev/performance.html CRDB can easily scale throughput.

link