What they don't tell you is that to achieve "exactly once" delivery you need idempotent writes, for example inserting into a database with the primary key acting as the de-dupe key.
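To make the PK point concrete, here is a minimal sketch using Python's built-in sqlite3. The `events` table, the `write_event` helper, and the sample deliveries are all made up for illustration. It assumes every event carries a stable unique ID that survives replays.

```python
import sqlite3

def write_event(conn, event_id, payload):
    # The primary key makes the write idempotent: a replayed event with an
    # ID that is already stored is silently ignored instead of duplicated.
    conn.execute(
        "INSERT OR IGNORE INTO events (event_id, payload) VALUES (?, ?)",
        (event_id, payload),
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id TEXT PRIMARY KEY, payload TEXT)")

# At-least-once delivery: "e2" shows up twice because of a replay.
deliveries = [("e1", "a"), ("e2", "b"), ("e2", "b"), ("e3", "c")]
for event_id, payload in deliveries:
    write_event(conn, event_id, payload)

row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(row_count)
```

The table ends up with one row per distinct ID no matter how many times an event is redelivered, which is the "close approximation" people usually mean by exactly-once.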
Exactly-once in a distributed system doesn't exist, and we need to get past that. It can be approximated with replay (at-least-once) and de-dupe/idempotency.
Trying to ensure true exactly-once is a fool's errand. In a distributed system it requires guarantees (down to the magnetic material level) that are very hard to get right. If any of those guarantees fail, you don't have it. What you have is a close approximation.
Real world applications usually involve a lot more than counting words in memory.
At-least-once is relatively easy. Combine it with idempotent operations and your work is done.
Storm is fairly explicit in documenting what it takes to achieve this, but it's not trivial, and every system involved has to support certain guarantees.
Spark (streaming) made some pretty big claims about exactly-once guarantees, but it turned out those claims were riddled with holes.
In my opinion, "exactly-once" doesn't imply that there are exceptions to that rule.
Guard against dupes and you'll be fine (easier said than done, obviously), but also know the limitations of the systems and frameworks you are working with.
First, it is crucial to distinguish between "exactly-once" semantics with respect to state inside the stream processor (for example an aggregate computed in a window) and exactly-once delivery to external systems. The former is built into Flink; the latter is only possible in some cases (transactional systems) and requires extra effort.
Exactly-once for state inside the stream processor is incredibly useful, because it allows you to implement many non-idempotent operations such that the writes to external systems are idempotent: For example, you compute the complex aggregate in the stream processor and only periodically write the result to the external system (overwriting previous values). Now the external system always reflects an aggregate without duplicates.
That is very valuable and only possible if inside the stream processor, you have exactly-once semantics for state. That does imply that the stream processor has a notion of managed state (in Flink for example the Windows, key/value state, and generic checkpointed state).
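A rough sketch of that pattern in plain Python (not Flink API; `ExternalStore`, `flush`, and the sample events are hypothetical stand-ins). It assumes the in-processor `totals` dict is the exactly-once managed state, so each event is added exactly once, and only the write to the outside world may be retried.

```python
from collections import defaultdict

class ExternalStore:
    """Stand-in for an external key/value system."""
    def __init__(self):
        self.data = {}

    def put(self, key, value):
        # Overwrite semantics: writing the same value twice changes nothing.
        self.data[key] = value

def flush(totals, store):
    # Write the current aggregate per key, replacing any previous value.
    for key, total in totals.items():
        store.put(key, total)

events = [("a", 1), ("b", 5), ("a", 2)]
totals = defaultdict(int)   # assumed exactly-once state inside the processor
store = ExternalStore()

for key, amount in events:
    totals[key] += amount   # non-idempotent increment, applied once per event

flush(totals, store)
flush(totals, store)        # a retried flush is harmless
print(store.data)
```

The increment itself is not idempotent, but because it happens against processor state and the external write is a plain overwrite, repeating the flush leaves the external system unchanged.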
Exactly-once delivery really depends on where and how you actually "deliver". Even if your writes are idempotent, you need to know whether they have been properly committed on the other side, which is not that simple. If the system you are committing your output to offers version control and/or proper transactional support, then delivery can eventually be reconciled.
Apache Flink's snapshotting algorithm solely guarantees exactly-once application state access, plain and simple.
EDIT: Disclaimer: I am an Apache Storm PMC Member