| HN Mirror


	by unohoo 2222 days ago
	Use Pulsar - so much better than Kafka

2 comments

mrunkel 2222 days ago

I am all for self-reliance, but if you really want to influence someone else, you might want to include a link to the project, especially when the only word you share has a much more prevalent meaning.

jooz 2221 days ago

There was a Pulsar post on HN a few months ago, with interesting comments, some of them related to Kafka. I've kept Pulsar on my radar since that post:

https://news.ycombinator.com/item?id=21936252

lytedev 2222 days ago

Why is that?

wpietri 2222 days ago

If anybody has seen a good detailed comparison, I'd love to read one. The first dozen hits were pretty weak.

dominotw 2222 days ago

Millions of topics, no ZooKeeper, etc. Kafka is addressing these shortcomings on its roadmap.

nitwit005 2222 days ago

The Pulsar documentation says it requires Zookeeper: https://pulsar.apache.org/docs/en/administration-zk-bk/

dominotw 2222 days ago

Oh sorry, I meant that storing topic info in ZooKeeper is what limits Kafka to a certain number of topics.

EdwardDiego 2222 days ago

Nope, Pulsar also stores topic metadata in ZK - it's not exactly going to store that in the (near) stateless brokers, or in BookKeeper - and BK also relies on ZK, but it's common to reuse the ZK quorum between the brokers and the bookies.

It also needs an additional ZK for cluster replication.

oweiler 2222 days ago

For a lot of projects this is hardly a problem. On the other hand, Kafka is more mature and has a huge ecosystem (Kafka Connect, Kafka Streams, KSQL, ...).

math 2222 days ago

Kafka also has fewer moving parts even today, before ZooKeeper removal is complete (2 vs. Pulsar's 3).

biggestlou 2222 days ago

But one of those moving parts of Pulsar, BookKeeper, means that you're no longer storing data on message brokers. Worth the extra puzzle piece for a lot of use cases.

biggestlou 2222 days ago

Pulsar is less mature but does provide functional equivalents to all of the above: Pulsar IO (Kafka Connect), Pulsar SQL (KSQL), Pulsar Functions (Kafka Streams).

EdwardDiego 2222 days ago

Nah, Pulsar Functions is nowhere near Kafka Streams - it's more like AWS Lambda.

An example off the top of my head: in Pulsar Functions you can't write the equivalent of "aggregate this stream across a 10 minute window and emit the results on window close".

But that's fine, it doesn't need to be like Kafka Streams, you can use Flink or Spark or Storm etc. to fill the same niche. In fact one of the founders of StreamNative (Pulsar's equivalent of Confluent) is a core committer on Flink.
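
For anyone unfamiliar with the "aggregate across a 10 minute window and emit on window close" pattern mentioned above, here is a minimal, library-free Python sketch of it. It is not Kafka Streams, Flink or Pulsar Functions code. The function name `tumbling_window_sums` and the event shape (timestamp in seconds, key, value) are assumptions made up for illustration, and it assumes events arrive in timestamp order, which real stream processors cannot rely on.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

WINDOW_SECONDS = 600  # the 10 minute window from the comment above

# Hypothetical event shape: (timestamp in seconds, key, value)
Event = Tuple[int, str, float]


def tumbling_window_sums(
    events: Iterable[Event], window: int = WINDOW_SECONDS
) -> Iterator[Tuple[int, Dict[str, float]]]:
    """Yield (window_start, per-key sums) each time a window closes.

    Assumes events arrive in non-decreasing timestamp order; late and
    out-of-order events are not handled in this sketch.
    """
    current_start = None
    sums: Dict[str, float] = defaultdict(float)
    for ts, key, value in events:
        start = ts - ts % window  # align the event to the start of its window
        if current_start is None:
            current_start = start
        elif start != current_start:
            # An event for a later window arrived, so the old window is closed.
            yield current_start, dict(sums)
            sums = defaultdict(float)
            current_start = start
        sums[key] += value
    if current_start is not None:
        # End of input: flush the last, still-open window.
        yield current_start, dict(sums)
```

Windows that receive no events are simply never emitted here; a real engine would also need a timer or watermark to close a window when no later event shows up.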

skyde 2222 days ago

Kafka is not more mature, just more hyped. I just wish Aphyr's Jepsen tests would also cover more scenarios, like:
- what happens to your data if X+1 servers permanently fail in a cluster with a replication factor of X
- what happens if a single partition's data size or request rate becomes 90% of the cluster capacity
- what happens in a multi-tenancy scenario to other users' throughput and latency when one user tries to use all the capacity of the cluster
- ...

EdwardDiego 2222 days ago

It's way more mature. I just spent a week evaluating Pulsar vs. Kafka for a client, and the fact that Kafka has been open source for 10 years vs. Pulsar's 1.5 really shows in the documentation, community support, etc.

> what happens to your data if X+1 servers permanently fail in a cluster with a replication factor of X

It depends on how many in-sync replica sets (ISRs) existed entirely within those X+1 servers. Their partitions will go offline, other ISRs will have under-replicated partitions, and the alerting you've set up as a good engineer will have told you this was happening.
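
The failure case above can be sketched in a few lines of Python. This is a toy model, not a Kafka API: the `assignment` mapping (partition name to the set of broker ids holding a replica), the broker layout and the function name `classify_partitions` are all hypothetical, and every replica is treated as in sync.

```python
from typing import Dict, List, Set, Tuple


def classify_partitions(
    assignment: Dict[str, Set[int]], failed: Set[int]
) -> Tuple[List[str], List[str]]:
    """Return (offline, under_replicated) partition names.

    Toy model: a partition is offline when every replica sits on a failed
    broker, and under-replicated when only some of its replicas do.
    """
    offline: List[str] = []
    under_replicated: List[str] = []
    for partition, replicas in sorted(assignment.items()):
        lost = replicas & failed  # replicas that lived on failed brokers
        if lost == replicas:
            offline.append(partition)
        elif lost:
            under_replicated.append(partition)
    return offline, under_replicated


# Replication factor 3 on a 5-broker cluster (made-up layout).
assignment = {
    "orders-0": {1, 2, 3},
    "orders-1": {2, 3, 4},
    "orders-2": {3, 4, 5},
}
offline, under = classify_partitions(assignment, failed={1, 2, 3})
print(offline)  # partitions with no surviving replica
print(under)    # partitions that lost some, but not all, replicas
```

In this layout only the partition whose three replicas all sat on the failed brokers goes offline; the others stay available but under-replicated, which is the situation the alerting is meant to catch.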

> what happens in a multi-tenancy scenario to other users' throughput and latency when one user tries to use all the capacity of the cluster

Nothing, because you're using ACLs and have configured quotas appropriately.

Bad things otherwise.

PS: I've also been running Kafka since 0.8.

skyde 2222 days ago

If the replication factor is 3 and 3 servers go down in the span of 1 or 2 hours, no alert will save you.
