by lobster_johnson
4067 days ago
Kafka does indeed have a good design. But it doesn't solve the potential transaction gap between your store and the queue. For example, if you commit a transaction but then can't reach Kafka (because you crash, you're SIGTERMed, heavy load causes a network blip, or any number of other reasons), you'll lose updates. You can't very well write to Kafka before you commit either, because the data isn't visible outside the transaction yet. The only way is to keep a transaction log in the same database, written so that the log can be read after the commit is done. Logical streaming would let you do this in a safe way (Bottled Water [1], as someone else here mentioned, does this with Kafka). It's conceptually identical to storing a transaction log table, but wouldn't require as much custom code, and you'd get incremental updates for free.

[1] http://blog.confluent.io/2015/04/23/bottled-water-real-time-...
|
Two reasons why I can't just use Postgres (I'd love to): 1.) Kafka (or whatever queue we settle on) will be used for logs and metrics as well, data that doesn't flow through Postgres.
2.) Postgres stores the data model of my business domain at the lowest, normalized level. But derived data stores are inherently denormalized, and I want to be able to use them without talking back to my source of truth all the time. So currently I'm passing DTOs to Kafka, just like I would to any API request. This data is not easily available at the Postgres level.
I'm not yet sure about the right abstraction level for events. It seems very natural to have them contain the information that I would send to clients directly.