by cmatta 3161 days ago
Kafka does give you a counter; it's in Neha's blog post. Kafka's idempotent producer registers itself with the brokers under a unique producer id and includes a sequence number with each message it sends. The brokers simply keep track of the producer id plus the highest sequence number they've seen, and return an ack. If the ack is lost, the producer retries, and the broker recognizes the duplicate, discards it, and sends the ack again. In the event counter example this would ensure that each event is counted exactly once, even in the face of multiple failures.
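
To make the dedup step concrete, here is a rough sketch of the bookkeeping described above. It is a toy in-memory model, not Kafka's actual broker code: the `Broker` class, the `append` method, and the return strings are made-up names for illustration, and it ignores partitions, persistence, and producer restarts.

```python
class Broker:
    """Toy in-memory broker (hypothetical, not Kafka's implementation)."""

    def __init__(self):
        self.log = []        # messages accepted so far
        self.last_seq = {}   # producer_id -> highest sequence number seen

    def append(self, producer_id, seq, payload):
        last = self.last_seq.get(producer_id, -1)
        if seq <= last:
            # Already seen: the earlier ack was probably lost.
            # Do not append again, just ack again.
            return "ack"
        if seq != last + 1:
            # A gap means an earlier message is missing; reject it.
            return "out_of_order"
        self.log.append(payload)
        self.last_seq[producer_id] = seq
        return "ack"


broker = Broker()
broker.append("p1", 0, "event-a")
broker.append("p1", 0, "event-a")  # retry after a lost ack, deduplicated
broker.append("p1", 1, "event-b")
```

The retry is acked but never written twice, so the log ends up with one copy of each event.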

Relying on another distributed system to ensure the first distributed system isn't duplicating messages sounds like a headache I wouldn't wish on anyone. (I am a Confluent employee).

1 comments

OK, so you're right, it seems "brokers" are little databases (maybe just for counters?). In this case the broker acts as the separate de-duplication system I described. I'm much more familiar with a system that does not provide ordering guarantees (and as a result doesn't need "partitioning" or "re-partitioning" for multiple consumers), but with Kafka, where order is guaranteed, a simpler mechanism is possible: keep the count and the message sequence number together. Sometimes you update both the count and the seq, sometimes just the seq, and you only apply an update if the new seq is prev+1. And this is built into the broker.
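
Roughly what I mean by keeping the count and the seq together, as a sketch. `CounterState`, `apply`, and the `is_counted` flag are names I made up for illustration; the only assumption is that the pair is stored and updated as one unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterState:
    count: int = 0
    seq: int = -1   # sequence number of the last message applied


def apply(state, seq, is_counted):
    """Return the next state; count and seq always change together."""
    if seq != state.seq + 1:
        # Duplicate or out-of-order message: leave the state untouched.
        return state
    # Sometimes both count and seq move, sometimes just the seq.
    return CounterState(state.count + (1 if is_counted else 0), seq)


state = CounterState()
# (seq, is_counted); the second message is a redelivery of the first.
for seq, counted in [(0, True), (0, True), (1, False), (2, True)]:
    state = apply(state, seq, counted)
```

The redelivered message with seq 0 is ignored, so the count only reflects each message once.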

But you still need to understand how this works to do "Kafka transactions", and you still need some other scheme to get effects/actions outside of Kafka. (And you'll probably get people doing dumb stuff saying "I was told I get exactly-once delivery".)