Hacker News new | ask | show | jobs
by bonquesha99 2221 days ago
As a Kafka alternative, has anyone attempted to use PostgreSQL logical replication with table partitioning for async service communication?

Proof of concept (with diagrams in the comments): https://gist.github.com/shuber/8e53d42d0de40e90edaf4fb182b59...

Services would commit messages to their own databases along with the rest of their data (with the same transactional guarantees) and then messages are "realtime" replicated (with all of its features and guarantees) to the receiving service's database where their workers (e.g. https://github.com/que-rb/que, skip locked polling, etc) are waiting to respond by inserting messages into their database to be replicated back.

Throw in a trigger to automatically acknowledge/cleanup/notify messages and I think we've got something that resembles a queue? Maybe make that same trigger match incoming messages against a "routes" table (based on message type, certain JSON schemas in the payload, etc) and write matches to the que-rb jobs table instead for some kind of distributed/replicated work queue hybrid?

I'm looking to poke holes in this concept before sinking anymore time exploring the idea. Any feedback/warnings/concerns would be much appreciated, thanks for your time!

Other discussions:

* https://old.reddit.com/r/PostgreSQL/comments/gkdp6p/logical_...

* https://dba.stackexchange.com/questions/267266/postgresql-lo...

* https://www.postgresql.org/message-id/CAM8f5Mi1Ftj%2B48PZxN1...

5 comments

Sounds a bit like change data capture (CDC), e.g. via Debezium [1] which is based on logical decoding for Postgres, sending change events to consumers via Kafka? In particular when using explicit events for downstream consumers using the outbox pattern [2]. Disclaimer: I work on Debezium.

[1] debezium.io/ [2] https://debezium.io/documentation/reference/configuration/ou...

I think you'll find Debezium (http://debezium.io) interesting. Specially it's embedded mode where it's not coupled to Kafka.

Also,do look at the Outbox Pattern. It's basically what you are describing.

Just as Kafka isn't a database, PostgreSQL isn't a queue/broker. You can use it that way, but you'll spend a lot of time tweaking it, and I suspect you'll find it's too slow for non-trivial workloads.
Skype at its earlier peak used Postgres as a queue at huge scale. PGQueue. It had a few tweaks and, sure, it is an anti-pattern but it can work. It is sure handy if you are already using postgres and want to maintain a small stack.
I believe you will enjoy this:

https://medium.com/revolut/recording-more-events-but-where-w...

Not exactly the same architecture you are proposing, and quite complex tbh, but it is working for them.

interesting idea: I think one issue is that the write throughout of a single master Instance is very limited.

But if working set fit in memory on the instances processing the writes, you could use PostgreSQL logical replication to update materialized view on other server easily.

but then it start looking like Amazon aurora or Datomic database.