Hacker News new | ask | show | jobs
by Lord_Zero 2969 days ago
In general I have not seen the DBMS emitting changes as a best practice. The applications modifying the data should be emitting events or messages appropriately. Someone can correct me if I am wrong.
5 comments

I’d venture to say that is wrong at least in the case of Postgres.

In order to be able to reason with certainty about your data, information should be coming from the source closest to the truth, and that would be your database. You can’t really trust the application, it may have been written wrong.

Postgres’s LISTEN/NOTIFY commands, which are not part of the SQL standard, exist for these cases.

Part of the thought process around that is that most DBs don’t have a good option to do it well.

In larger applications, if you’ve got a need to have a stored procedure handle a complex modification your application will be blind to it.

Plus, at the application level you don’t have any way to enforce that the particular place in your code that modified the data and then emits the change is the ONLY place in your code that can modify that data. This becomes more pronounced over time too.

The problem is amplified if you find something that needs to be done to inbound data that is better handled in another language for some reason. Then you have to remember to duplicate the logic or hook your new code into the application, forcing a new layer in front of your database.

There are some things that the database is better left handling. :)

That was the conclusion I came to as well. If you just want to push data to a data warehouse by listening to the WAL or NOTIFY and then batch a bunch of updates together that's one thing. If you intend it to be application level events then coupling yourself to the schema of some other service (among other issues) seems problematic.
The most reliable implementation of this pattern is to use the replication binary logs of the database. Your application cannot atomically publish a write to both a database and a messaging system (without introducing a massive headache that you will implement incorrectly).
Yeah but I think it's important to consider whether any microservice should be able to listen to any other microservice inserts or updates. I think using the replication log as the underlying mechanism is a good idea, but you should really be choosing which messages to expose because it forms a public API.
The receivers choose which messages to expose by selecting which events to listen for. The broker then only forwards those events. The question is, how do the receivers specify the event type? Do they use the schema of the publisher, or some intermediate interface at the broker?
what if you have multiple applications writing data? The central repository, most authoritative copy of your data is your database.