|
|
|
|
|
by andreimatei1
3137 days ago
|
|
Ozgun has done a good job describing some of Citus' benefits; allow me to write the CockroachDB-biased answer. Btw, congrats to all the Citus people on your launch. The highest-level architectural difference is, in my opinion, the fact that CockroachDB aims to not trade any features of single-node SQL databases for the distributed nature of our clusters. In contrast to Citus, you get ACID (serializable) transactions regardless of how the data ends up being distributed - so touching more than one shard/node in a transaction does not weaken any consistency guarantees. Similarly, DML statements can modify distributed data with no fuss.
The sharding is transparent in crdb; one doesn't need to explicitly declare any partitioning on a table or choose a shard count. We do give you, however, control over the data locality should you need it. The other big architectural difference has to do with our availability/durability guarantees. In crdb, all data is replicated, and the replication is done through a consensus protocol. This means that data appears to be written at once on multiple replicas. Losing any one replica at any point in time is not a big deal; all the data that had previously been written is still accessible. You'll never lose any writes because of a machine failure. This is in contrast to most master-slave type architectures, where generally data is not "atomically" written across the master and the slave. Whenever a slave is promoted to be a master, you will probably incur some data loss. It is my understanding that Citus falls in a flavor of such a master-slave architecture. Now, this all doesn't speak about the current blog post in particular; crdb does not currently have a managed service offering. |
|
Because clients talk to Citus using the synchronous postgres protocol, latency is one of the most important factors in performance. A single connection can never do more than 1/latency transactions per second. While you could use more connections, that is often complex, not always possible (e.g. long transaction blocks), and also comes with a significant performance penalty (fewer prepared statements, high SSL overhead, ...). It's important to optimise for latency.
We find the Citus transaction model is sufficient for most purposes. If you think of the typical balance transfer example, then there is no way to double spend or overspend. Distributed transactions are atomic and locks in postgres will serialise distributed transactions that update the same rows and ensure constraints are preserved. Importantly, Citus has distributed deadlock detection, which allows generous concurrency at almost no cost to the application (apart from maybe having to reorder transaction blocks).
Postgres supports different modes of replication, including synchronous replication to avoid losing any data on node failure. Many users actually prefer asynchronous replication, to avoid the extra round-trip and the broader performance implications that would have. It's also the reason we use streaming replication over other replication mechanisms that might give higher availability. Streaming replication supports higher concurrency and thus gives better throughput.
At the end of the day, people use Citus because a single node postgres (or MySQL) database is not performant enough to keep up with their workload. Performance is key. Citus is not a silver bullet for every postgres performance problem, but it has two particular strengths:
Minimal pain scaling out for multi-tenant applications that can shard by tenant.
Superpowers for real-time analytics apps (e.g. dashboard). Citus has a laser-sharp focus on addressing these use cases and makes certain trade-offs because of that.