

by cfontes 2516 days ago

Kafka's exactly-once semantics address the main issue of the article, I think.

It's now relatively simple for a developer to implement a system with an exactly-once guarantee, as long as you take care of the world that is not inside a Kafka transaction (integrations with third parties and such). That part is still not super easy sometimes, but it is less hard than the distributed transaction that happens inside Kafka.

Kafka hides the complexity really well; from my use of it so far, it is very reliable with the "new" semantics.

3 comments

pdpi 2516 days ago

I'm getting mighty tired of the "exactly-once" thing.

Everybody and their uncles seem to have picked up on this trend of advertising at-least-once systems as exactly-once, then burying somewhere in the docs that you're expected to guarantee idempotency yourself to get the appearance of exactly-once. That was the state of the art decades ago, it's the state of the art now, and it's pretty damn dishonest to sell quality-of-life improvements as a fundamental shift in the guarantees/properties of these systems.


dualogy 2516 days ago

> at-least-once systems with idempotency to get the appearance of exactly-once

What baffles me even more is why the above is apparently not generally considered good-enough, elegant-enough --- and as a bonus, not violating the laws of physics either? Both sides of the coin are quite tameable and implementable, and together they effectively deliver what was wanted in the first place. Curious about any subtle edge cases I might have missed here!
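To make the "at-least-once plus idempotency" combination concrete, here is a minimal in-memory sketch. Everything in it is hypothetical and for illustration only: the `Message` and `Account` types, the producer-assigned `msg_id`, and the `deliver_at_least_once` helper that fakes a transport which sometimes redelivers. It is not how any particular broker works.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: str  # hypothetical unique ID assigned by the producer
    amount: int


@dataclass
class Account:
    balance: int = 0
    seen_ids: set = field(default_factory=set)  # IDs already applied

    def apply(self, msg: Message) -> bool:
        """Apply a deposit at most once per msg_id; return True if it took effect."""
        if msg.msg_id in self.seen_ids:
            return False  # duplicate delivery, safely ignored
        self.balance += msg.amount
        self.seen_ids.add(msg.msg_id)
        return True


def deliver_at_least_once(messages, handler, redeliver_every=2):
    """Simulate a transport that delivers every message and repeats some of them."""
    for i, msg in enumerate(messages):
        handler(msg)
        if i % redeliver_every == 0:
            handler(msg)  # simulated redelivery, e.g. after a lost acknowledgement


account = Account()
deliver_at_least_once(
    [Message("a", 10), Message("b", 5), Message("c", 1)], account.apply
)
print(account.balance)  # 16, even though "a" and "c" were delivered twice
```

The sketch keeps the set of seen IDs in memory next to the balance. In a real system that set would have to be stored durably and updated atomically with the effect itself, which is where the hard part tends to hide.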


pdpi 2516 days ago

The problem is that there is an audience for whom exactly-once sounds like it makes their non-specialist lives much simpler compared to an at-least-once system because they can offload the necessary distributed systems expertise to somebody else.

People insist on this messaging precisely because that crowd is somewhat vulnerable to this sort of shenanigans.


hosh 2516 days ago

It requires teaching the engineering team to reason about and prove idempotency guarantees.

I've seen teams try to continue programming the way they always have and try to throw difficult problems over the wall. It is exactly-once in a non-threaded, monolithic, stateless web app, so why shouldn't it always be like that? It's part of a programmer's mindset to try to create abstractions and reason within a simpler problem space.


i0exception 2516 days ago

Kafka can probably guarantee exactly-once semantics on publishing (conditions apply). It definitely cannot guarantee exactly-once semantics on the consumer and processing side. Imagine a scenario where you receive a message from Kafka and process it, but the processor crashes or has a network partition right after. There's no way for the message to be acknowledged, and you either have to design your system to be idempotent or handle exactly-once semantics further down the stack.

Databases have been handling exactly-once semantics for decades now. What Kafka is doing is not new and actually gives you a false sense of security when it comes to these kinds of things.
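One way to read "further down the stack" is to commit the consumer's position in the same database transaction as the effect of processing, so a crash can never leave one without the other. The sketch below assumes that reading and uses Python's standard `sqlite3` module. The table names, the `process` function, and the `crash_at` parameter are all made up for illustration, a plain list stands in for the log, and the crash is simulated by an exception that triggers a rollback.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding a running total and a consumer offset."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE totals (k TEXT PRIMARY KEY, total INTEGER NOT NULL)")
    db.execute("CREATE TABLE progress (id INTEGER PRIMARY KEY, next_offset INTEGER NOT NULL)")
    db.execute("INSERT INTO totals VALUES ('sum', 0)")
    db.execute("INSERT INTO progress VALUES (1, 0)")
    db.commit()
    return db


def process(db, log, crash_at=None):
    """Consume `log` from the stored offset; optionally 'crash' while handling one entry."""
    (offset,) = db.execute("SELECT next_offset FROM progress WHERE id = 1").fetchone()
    while offset < len(log):
        try:
            with db:  # one transaction: commits on success, rolls back on exception
                db.execute(
                    "UPDATE totals SET total = total + ? WHERE k = 'sum'",
                    (log[offset],),
                )
                if crash_at == offset:
                    raise RuntimeError("simulated crash before commit")
                db.execute(
                    "UPDATE progress SET next_offset = ? WHERE id = 1",
                    (offset + 1,),
                )
        except RuntimeError:
            return  # the worker dies; nothing from this entry was committed
        offset += 1


log = [10, 5, 1]
db = make_db()
process(db, log, crash_at=1)  # handles offset 0, then crashes on offset 1
process(db, log)              # restart: resumes from the stored offset 1
total = db.execute("SELECT total FROM totals WHERE k = 'sum'").fetchone()[0]
print(total)  # 16: the entry at offset 1 was counted once despite the retry
```

This only works because the effect and the offset live in the same transactional store. As soon as the effect is an email or a call to a third party, the sketch no longer applies and you are back to needing idempotency on the receiving side.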


docker_up 2516 days ago

Kafka's exactly-once semantics are just that: "semantics".

What Kafka supports is exactly-once processing, which other stream processing frameworks such as Apache Storm supported years before Confluent's marketing. Duplicates are still possible in Kafka with the current implementation of exactly-once; if one uses Kafka's consumer API, it will de-dupe on the processing side.

So no, there is no such thing as exactly-once in distributed systems.
