by ozgune 3131 days ago
It's actually both of these reasons, with reason (1) being the primary one.

PostgreSQL allows interactive transaction blocks (meaning you don't have to submit all commands within a transaction block upfront). Citus extends Postgres and needs to provide the same semantics.
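To make the difference concrete, here is a toy sketch in Python (not Citus or PostgreSQL code; `batch_lock_order`, `interactive_lock_order` and the row names are hypothetical). With a batch, the full set of keys is known and can be sorted before any lock is taken. With an interactive block, each key only becomes known after the previous statement returns, so the order is whatever the client happens to ask for.

```python
from typing import Callable, List, Optional


def batch_lock_order(keys: List[str]) -> List[str]:
    """Whole batch visible up front: sort the keys before taking any lock."""
    return sorted(set(keys))


def interactive_lock_order(
    first_key: str, next_key: Callable[[str], Optional[str]]
) -> List[str]:
    """Interactive block: the next key depends on the previous statement's result."""
    order = [first_key]
    key = next_key(first_key)
    while key is not None:
        order.append(key)
        key = next_key(key)
    return order


# Hypothetical client logic: after touching row_b it decides to touch row_a.
follow_up = {"row_b": "row_a", "row_a": None}
print(batch_lock_order(["row_b", "row_a"]))
print(interactive_lock_order("row_b", follow_up.get))
```

In the interactive case the coordinator cannot reorder the lock acquisitions, because it never sees the second statement until the first has already run.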

That said, we regularly evaluate different techniques for distributed deadlock detection and avoidance. We have an FAQ that discusses deadlock avoidance methods in the context of Postgres. In the link below, the last question, "How can a distributed database prevent distributed deadlocks?", provides more detail:

https://www.citusdata.com/blog/2017/08/31/databases-and-dist...


Does (can?) Citus optimize the case where the whole batch is visible up front?
Not practically, except when it is a single-statement transaction.

Where deadlock prevention becomes useful/necessary is single UPDATE/DELETE statements that span multiple nodes. When those are executed in parallel, they could deadlock against each other due to non-deterministic execution order.
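A rough illustration of that failure mode, as a toy model in Python rather than anything Citus actually runs (`wait_for_graph`, `has_cycle`, and the shard and transaction names are made up for the example). Each statement tries to lock shards in the order given; if two statements reach the shards in opposite orders, the wait-for graph gets a cycle.

```python
from typing import Dict, List, Set


def wait_for_graph(orders: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Each txn tries to take the first shard in its order, then wants the rest.

    Returns edges txn -> set of txns it is waiting on.
    """
    holder: Dict[str, str] = {}
    for txn, shards in orders.items():
        # First come, first served: a shard already held stays with its holder.
        holder.setdefault(shards[0], txn)
    graph: Dict[str, Set[str]] = {txn: set() for txn in orders}
    for txn, shards in orders.items():
        for shard in shards:
            owner = holder.get(shard)
            if owner is not None and owner != txn:
                graph[txn].add(owner)
    return graph


def has_cycle(graph: Dict[str, Set[str]]) -> bool:
    """Depth-first search; a node seen again while still being visited is a cycle."""
    state: Dict[str, int] = {}  # 1 = visiting, 2 = done

    def visit(node: str) -> bool:
        if state.get(node) == 1:
            return True
        if state.get(node) == 2:
            return False
        state[node] = 1
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = 2
        return False

    return any(visit(node) for node in graph)


# Opposite orders: t1 holds s1 and wants s2, t2 holds s2 and wants s1.
print(has_cycle(wait_for_graph({"t1": ["s1", "s2"], "t2": ["s2", "s1"]})))
# Same order: t2 simply queues behind t1.
print(has_cycle(wait_for_graph({"t1": ["s1", "s2"], "t2": ["s1", "s2"]})))
```

The model only shows why execution order matters; it says nothing about how a real system should pick or enforce that order.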

Citus currently uses predicate locks to avoid these deadlocks, but there's probably some room for improvement there. On the other hand, for Citus use cases UPDATE and DELETE across shards are mainly batch operations (e.g., delete old data), so there's not a strong need for it yet.