by xyst 488 days ago
> Each of these Web workers puts those 4 records onto 4 of the topic’s partitions in a round-robin fashion. And, because they do not coordinate this, they might choose the same 4 partitions, which happen to all land on a single consumer

Then choose a different partitioning strategy. Key-based partitioning can often solve this issue. Worst case, you write a custom partitioner.
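A rough sketch of what key-based partitioning means here: the partition is derived from a stable hash of the record key, modulo the partition count. This is not Kafka's actual partitioner (the Java client's default hashes key bytes with murmur2); CRC32 is a stand-in, and `partition_for_key`, the `job-…` keys, and the count of 16 partitions are made up for illustration.

```python
import zlib


def partition_for_key(key: str, num_partitions: int) -> int:
    """Map a record key to a partition using a stable hash.

    CRC32 is a stand-in for the real hash; the point is only that the
    same key always maps to the same partition.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical setup: 16 partitions, records keyed by a made-up job id.
NUM_PARTITIONS = 16
keys = [f"job-{i}" for i in range(8)]
assignments = {key: partition_for_key(key, NUM_PARTITIONS) for key in keys}

for key, partition in assignments.items():
    print(key, "->", partition)
```

How evenly this spreads load depends entirely on the key distribution, so it only helps if the keys are varied enough.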

Additionally, why can’t you match the number of consumers in the consumer group to the number of partitions?

The KIP mentioned seems interesting though. Kafka folks are trying to make a play towards replacing all of the distributed messaging systems out there. But it does seem a bit complex on the consumer side, and there are probably a few footguns here for newbies to Kafka. [1]

[1] https://cwiki.apache.org/confluence/plugins/servlet/mobile?c...

1 comment

Even if you used a 100% random strategy, what OP described can still happen.

Matching the number of consumers to the number of partitions can still produce an uneven result too, and OP clarifies that even if the worst case he laid out doesn't happen, in practice there are still idle workers. For the same reason, he likely doesn't want to have 16 workers running at all times.
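To put rough numbers on both points, here is a back-of-the-envelope sketch. It assumes a single batch of 4 records, each placed on a uniformly random partition, with every consumer owning the same number of partitions, so a random partition is equivalent to a random consumer. The function names are made up, and the 4-consumer / 16-consumer figures are assumptions based on the "4 records" and "16 workers" mentioned in the thread.

```python
from fractions import Fraction


def prob_all_on_one_consumer(num_consumers: int, num_records: int) -> Fraction:
    """Probability that every record in the batch lands on the same consumer.

    Assumes each record independently picks a consumer uniformly at random.
    """
    return num_consumers * Fraction(1, num_consumers) ** num_records


def expected_idle_consumers(num_consumers: int, num_records: int) -> Fraction:
    """Expected number of consumers that receive no record from the batch."""
    return num_consumers * Fraction(num_consumers - 1, num_consumers) ** num_records


# Assumed setup: 4 consumers sharing the partitions evenly, 4 records per batch.
print(prob_all_on_one_consumer(4, 4))        # 1/64
print(float(expected_idle_consumers(4, 4)))  # 1.265625

# Assumed setup: one consumer per partition (16 of each), same 4 records.
print(float(expected_idle_consumers(16, 4)))  # about 12.36
```

Under these assumptions the full pile-up is rare but not impossible with random placement, and more than one of the 4 consumers sits idle on average for any single batch. With 16 consumers, at least 12 are idle for that batch no matter how the records land.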