Hacker News
by porker 2434 days ago
If the database is the primary source of data, how do you get the data from there into the log-based replication method? I assumed the OP meant you'd write to Kafka, and the messages would be processed twice: once to write to the DB, and once to ElasticSearch.

Not wanting to do that for a small project, but wanting a better architecture than I've got, I'm curious about your proposed approach.

1 comment

Two possibilities: either the app writes to both the database and Kafka (ideally using an atomic commit), or CDC is set up in Kafka to read the database's transaction log (this is faster).

> you'd write to Kafka, and the messages would be processed twice: once to write to the DB, and once to ElasticSearch

This would be equivalent to using a message queue, which (in contrast to log-ordered replication) does not provide the same consistency guarantees: in this case, (1) read-your-writes (RYW) for database writes and (2) the database always being at least as up to date as the search index.
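
To make the two guarantees concrete, here is a rough toy sketch of the CDC-style flow in plain Python. It is only an in-memory simulation, not real Kafka or database code: `Database`, `SearchIndex`, `txlog`, and `catch_up` are hypothetical names made up for illustration. The assumption is that the database commits first and the index is fed only from the database's own ordered log.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    """Toy primary store: a table plus an append-only transaction log."""
    rows: dict = field(default_factory=dict)
    txlog: list = field(default_factory=list)  # (lsn, key, value) in commit order

    def write(self, key, value):
        # The row is applied and logged together, so a read issued right
        # after write() returns sees the new value (read-your-writes).
        self.rows[key] = value
        lsn = len(self.txlog)
        self.txlog.append((lsn, key, value))
        return lsn


@dataclass
class SearchIndex:
    """Toy downstream index that is fed only from the database's log."""
    docs: dict = field(default_factory=dict)
    next_lsn: int = 0  # first log position not yet applied

    def catch_up(self, db, max_events=None):
        # Apply committed changes strictly in log order (the CDC step).
        pending = db.txlog[self.next_lsn:]
        if max_events is not None:
            pending = pending[:max_events]
        for lsn, key, value in pending:
            self.docs[key] = value
            self.next_lsn = lsn + 1
        return len(pending)


db = Database()
index = SearchIndex()

db.write("doc1", "v1")
db.write("doc1", "v2")
assert db.rows["doc1"] == "v2"          # (1) RYW on the database

index.catch_up(db, max_events=1)        # the index may lag behind...
assert index.docs["doc1"] == "v1"
assert index.next_lsn <= len(db.txlog)  # (2) ...but is never ahead of the database

index.catch_up(db)                      # once caught up, both agree
assert index.docs == db.rows
```

In the "write to Kafka, consume twice" setup, the database and the index are independent consumers, so nothing prevents the index from applying a message before the database does, and a read from the database right after a write may not see it yet.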