|
|
|
|
|
by jgraettinger1
2029 days ago
|
|
This is a solved problem, for a few years now. The basic trick is to publish "pending" messages to the broker which are ACK'd by a later written message, only after the transaction and all it's effects have been committed to stable storage (somewhere). Meanwhile, you also capture consumption state (e.x. offsets) into the same database and transaction within which you're updating the materialization results of a streaming computation. Here's [1] a nice blog post from the Kafka folks on how they approached it. Gazette [2] (I'm the primary architect) also solves in with some different trade-offs: a "thicker" client, but with no head-of-line blocking and reduced end-to-end latency. Estuary Flow [3], built on Gazette, leverages this to provide exactly-once, incremental map/reduce and materializations into arbitrary databases. [1]: https://www.confluent.io/blog/exactly-once-semantics-are-pos... [2]: https://gazette.readthedocs.io/en/latest/architecture-exactl... [3]: https://estuary.readthedocs.io/en/latest/README.html |
|