by devit 3739 days ago
I've been unable to find any clear description of the capabilities of Citus and competing solutions (postgres-x2 seems to be the other leader).

Which of these are supported:

1. Full PostgreSQL SQL language

2. All isolation levels including Serializable (in the sense that they actually provide the same guarantees as normal PostgreSQL)

3. Never losing any committed data on sub-majority failures (i.e. synchronous replication)

4. Ability to automatically distribute the data (i.e. sharding)

5. Ability to replicate the data instead of, or in addition to, sharding

6. Transactionally-correct read scalability

7. Transactionally-correct write scalability where possible (i.e. multi-master replication)

8. Automatic configuration that only requires specifying some sort of "cluster identifier" that the node belongs to

(Ozgun from Citus Data)

On PostgreSQL language support, we're updating our FAQ to have more information (https://www.citusdata.com/frequently-asked-questions). Since the PostgreSQL manual (and its feature set) spans 4K+ pages, we found that the best way to think about Citus' capabilities is from a use-case standpoint. If your workload needs distributed transactions that span machines, or large ETL jobs, Citus currently isn't the best fit.

Citus supports sharding and replication out of the box (#4, #5). On #6, reads go through a master node (metadata server) and you see what you write.
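To make #4, #5, and #6 concrete, here is a toy Python sketch of the general idea: rows are hashed to shards, each shard is placed on more than one worker, and all reads and writes go through a single coordinator that holds the placement metadata. This is not Citus code or its API. `ToyCoordinator`, `shard_for`, the round-robin placement, and the in-memory dictionaries are all hypothetical names and simplifications chosen for illustration.

```python
import zlib


class ToyCoordinator:
    """Hypothetical stand-in for a single metadata (master) node. Not Citus code."""

    def __init__(self, workers, shard_count=4, replication_factor=2):
        if replication_factor > len(workers):
            raise ValueError("replication_factor cannot exceed the number of workers")
        self.shard_count = shard_count
        # Metadata: shard id -> workers holding a copy (assumed round-robin placement).
        self.placements = {
            shard: [workers[(shard + i) % len(workers)] for i in range(replication_factor)]
            for shard in range(shard_count)
        }
        # Toy storage: worker name -> shard id -> {key: value}.
        self.storage = {w: {s: {} for s in range(shard_count)} for w in workers}

    def shard_for(self, key):
        # Deterministic hash, so the same key always maps to the same shard.
        return zlib.crc32(key.encode("utf-8")) % self.shard_count

    def write(self, key, value):
        shard = self.shard_for(key)
        # Write to every placement before returning, so a later read sees the value.
        for worker in self.placements[shard]:
            self.storage[worker][shard][key] = value

    def read(self, key):
        shard = self.shard_for(key)
        # Any placement works here because writes reached all of them.
        worker = self.placements[shard][0]
        return self.storage[worker][shard].get(key)


coordinator = ToyCoordinator(["worker-a", "worker-b", "worker-c"])
coordinator.write("user:42", {"name": "Ada"})
print(coordinator.read("user:42"))
```

Because every request passes through the one coordinator, a client reads what it just wrote, but that coordinator is also the single point that a multi-master design (#7) would have to remove.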

We don't have #7. How we implement it also has implications for your other questions. Multi-master (no single metadata server) is by far the biggest feature request that we receive: https://news.ycombinator.com/item?id=11353866

If we go with the approach in https://github.com/citusdata/citus/issues/389, you will be able to configure #3, #6, #7 through PostgreSQL's streaming replication settings. We still won't support distributed transactions that span across multiple machines.

On #8, could you elaborate a bit more? Do you mean a logical identifier for the node?

Also, it's hard to write a concise reply on a topic that requires so much context. I'd love to grab coffee with anyone who's interested in diving deep into distributed databases. Feel free to shoot me an email at ozgun@citusdata.com

Thanks for the awesome product!

Do you know when you're planning to release Citus 5.0 deb/rpm packages?

(Jason from Citus here)

As soon as they're built in PGDG! Our Docker image just builds on the PostgreSQL 9.5.1 image, then installs a .deb we built.

I've been wrapping up all our packaging work during the past week, but not having an OSS release yet was the final blocker for getting into well-known repos. We'll probably have a post about this in the near future.

Does this mean that distributed transactions are not supported at all?

But if they answer those questions, you won't buy support/use it...

Have a donut and look at our marketing spreadsheets.

I'm so tired of "seamless", "effortless", "simple" distributed database lies. There are mathematical theorems showing why there is no free lunch.