Hacker News new | ask | show | jobs
by adamkl 1581 days ago
It is possibly that you could get inconsistencies between systems, but you mitigate that with durable event streams (e.g. Kafka) and by ensuring that each data entity in your enterprise has a system of record that can be used to resolve conflicts.

This is really not that different than using ETL jobs to move data about, but taking a streaming approach vs a batch approach.

1 comments

> It is possibly that you could get inconsistencies between systems, but you mitigate that with durable event streams (e.g. Kafka) and by ensuring that each data entity in your enterprise has a system of record that can be used to resolve conflicts.

The event stream layer isn't where the sync problems have arised in the systems I've worked on. It's the "commit transactionally both to your database and the event stream" part. Not a lot of systems are built to be ready to roll back the database change if the event publish fails. Or to be able to handle duplicate events if you error the other way.

So always write to a transactional queue first and then write to the database by reading sequentially from the queue?
That would work. The internal consistency will be at least once and you do de-duplication to handle message reliability through the database then. In other words, you need to uniquely identify each message to ensure idempotency since you will have duplicate writes of the same message to the database.

Just need to make sure to not mess up causal ordering between events because of out of order retries, if such things are important for your application.

Reading sequentially can be hard here, especially depending on throughput and how well you can or can't shard (e.g. how wide is the radius of possible side effects).

E.g. what if one event is something like "credit here, debit there" - you need to process it sequentially for both sides!

I guess if the concern is around atomically committing to the DB and event stream at the same time, you could hang the event stream off the DB using change-data-capture to populate the events.

Ultimately, whenever you are pushing data between systems you can end up with inconsistencies which why it’s important to clearly define systems of record.

That's basically what the database would do anyway, so generally, yes (e.g. even with a RDBMS it's internally got some sort of write log, in most cases). That's gonna help a lot!

It doesn't give you any guarantees about "at a given time" consistency, though. Maybe your queue or your event processor got backed up. So how real-time do you need both sides?

These are, of course, problems that people have solved to varying level of satisfaction for most use cases! But I've seen systems built by people who saw the blog post about "have both!" and then didn't think about all these cases and then... it gets messy.

Would it make sense to always write to a transactional queue first and then write to the database by reading sequentially from the queue?
If something is relying on the data in the queue to trigger a request to a service using the DB, the requestee may not have the full data yet.

The pattern I usually see is wiring this the other way - use an initial event as the trigger to write to the DB, while also streaming the DB's binlog or equivalent back as events. Now the risk is that another service gets a "too fresh" view but this is usually less harmful than a stale view. Things listening to the binlog events need to process them idempotently, but this is usually not a major complication since most queue designs will require you to be prepared for that anyway.