Hacker News new | ask | show | jobs
by 66fm472tjy7 1442 days ago
In the first iteration of using confirms, we did not have the outbox but only logged how long it took to get the confirmation. After 3 seconds, we would throw out the expected confirmation. If a confirmation took longer than that, we would log that we received an unknown confirmation.

We hoped it would be fast enough that we can just wait for the confirmation before committing the transaction.

The official documentation says

> This means that under a constant load, latency for basic.ack can reach a few hundred milliseconds

I never did statistics, just looked at the log. IIRC most were acceptable but > 3s occurred frequently enough (and we even had instances of messages never being confirmed, IIRC) that we abandoned that plan.

We considered using Debezium[0], but decided on the current solution as it could be solved entirely with the current services and infrastructure whereas Debezium would have required us to deploy (writing this from memory so this might be inaccurate/incomplete) Kafka, Zookeeper, and a connector service.

[0] https://debezium.io/

1 comments

Yep, Debezium is built on Kafka Connect, and yeah, it expects a Kafka cluster to talk to, which will have ZK present for maintaining cluster state.
Kafka has shipped the long-awaited ZooKeeper-free mode, but AFAIK it’s still beta and behind feature flags on the producer, broker, and consumer (like almost all Kafka config :( but that’s another story)
Yeah, it's shipped, but it's missing some existing ZK features that tooling around Kafka relied on, and I'm a bit embarrassed for Confluent that they pushed KRaft so hard without a replacement.

E.g. the ability to watch a ZK node for changes, which means in Kafka sans ZK, you can't detect changes to topics without continuously polling via the admin client.

A coworker is working to implement something like this for KRaft, but it really demonstrates how an IPO can cause a company that was the steward of a FOSS project to do things detrimental to that project to keep the share price up. (Was also interesting how many key Confluent people left right after the IPO)

The other very notable change is how Confluent's dev effort has switched from the open source project to the Enterprise Edition, but they still have the majority of PMC members, while not having the corporate blessing to spend time reviewing PRs.

> you can't detect changes to topics without continuously polling via the admin client.

Yikes, that sounds like an oversight! Aren't topic configs written to a system topic that you could consume from?

They were supposed to be, in the original KIP, but that changed, I'm unclear as to what drove that change.

So my coworker's solution joins the KRaft quorum as an observer, then publishes metadata changes to a topic you can consume from.

Sorry, realised I'd lapsed into jargon and missed the edit.

KIP = Kafka Improvement Proposal. The mechanism for proposing big changes to Kafka and getting community feedback before core committers vote on adoption or not.

I'm not sure why Zookeeper is viewed so negatively in regards to Kafka, it's damn solid, and if I can quote Jepsen "Use Zookeeper". I know Confluent wanted to replace it because it struggled when you hit thousands of topics on a single cluster, but that feels like a very niche use case to me.