|
|
|
|
|
by sewen
3759 days ago
|
|
I disagree on some points. First, it is crucial to distinguish between "exactly-once" semantics with respect to state inside the stream processor (for example an aggregate computed in a window) an exactly-once delivery to external systems. The former is built into Flink, the later is only possible in some cases (transactional systems) and requires extra effort. Exactly-once for state inside the stream processor is incredibly useful, because it allows you to implement many non-idempotent operations such that the writes to external systems are idempotent: For example, you compute the complex aggregate in the stream processor and only periodically write the result to the external system (overwriting previous values). Now the external system always reflects an aggregate without duplicates. That is very valuable and only possible if inside the stream processor, you have exactly-once semantics for state. That does imply that the stream processor has a notion of managed state (in Flink for example the Windows, key/value state, and generic checkpointed state). Disclaimer: I am a Flink committer. |
|