by cmckn 1908 days ago
Oh hell yeah! That's great news, tons of work went into this -- props to the contributors!

I will take the opportunity to say that Kafka is kind of painful, with or without ZK. Check out NATS [0]! It doesn't solve all the same problems, but it is so much easier to use (during development especially) and can do a lot of the same things.

[0]: https://docs.nats.io/whats_new_20

9 comments

NATS isn't actually using a log structure, though; it's a streaming message broker with a different set of consistency/delivery promises.
Correct; but I've seen many uses of Kafka that NATS could totally be used for. For example, load balancing across subscribers (use a NATS queue instead of a Kafka consumer group).

NATS doesn't ever store messages persistently; but this might be fine for your application, and then you don't have to worry about setting 5 different config options to make sure Kafka actually frees up disk space like you expect it to ;)

NATS also enables some unique patterns like request/reply via a "reply to" message header.

Anyway, it's been a joy to use!
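
For anyone who hasn't seen the queue pattern mentioned above, here is a minimal in-memory sketch of the idea. It is not the NATS client API: `TinyBroker`, its methods, and the subject and group names are all made up for illustration, and picking a group member at random is an assumption of this sketch. The point is only that every plain subscriber sees each message, while a queue group as a whole sees each message once.

```python
import random
from collections import defaultdict


class TinyBroker:
    """Toy in-memory broker (hypothetical, not the NATS API)."""

    def __init__(self, seed=0):
        self._plain = defaultdict(list)  # subject -> [callback]
        self._groups = defaultdict(lambda: defaultdict(list))  # subject -> group -> [callback]
        self._rng = random.Random(seed)

    def subscribe(self, subject, callback, queue=None):
        if queue is None:
            self._plain[subject].append(callback)
        else:
            self._groups[subject][queue].append(callback)

    def publish(self, subject, msg):
        # Plain subscribers: everyone gets a copy.
        for callback in self._plain[subject]:
            callback(msg)
        # Queue groups: exactly one member per group gets the message.
        for members in self._groups[subject].values():
            self._rng.choice(members)(msg)


received = {"w1": [], "w2": [], "audit": []}
broker = TinyBroker()
broker.subscribe("orders", received["w1"].append, queue="workers")
broker.subscribe("orders", received["w2"].append, queue="workers")
broker.subscribe("orders", received["audit"].append)

for i in range(6):
    broker.publish("orders", i)

# Together the two workers saw each message once; the auditor saw all of them.
workers_combined = sorted(received["w1"] + received["w2"])
```

Unlike a Kafka consumer group, nothing here is tied to partitions, so adding a worker does not trigger any rebalancing in this model.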

Sounds more like a RabbitMQ replacement than a Kafka one.
> NATS doesn't ever store messages persistently

Not true. Both NATS Streaming and the upcoming JetStream (core NATS) do.

It depends what you mean by 'persistently'. Normally NATS Streaming will delete the messages after they have been delivered to all subscribers successfully and some expiration time has passed.
The expiration time can be unlimited, just like Kafka.
I'm aware, I was referring to the core binary, `nats-server`. NATS streaming server seems to be still receiving attention, but the client library (for Java) hasn't been committed to since 2019, so I'm not sure I'd build a new project with it. JetStream is out (as of this week, I believe) and is an optional module to `nats-server`, as you said.

To the folks at Synadia -- I love NATS, but the naming and organization of these projects could use some work. What's with the `stan.*` repository names? Where did "jetstream" come from? Why is it baked into `nats-server` but `nats-streaming-server` isn't? Is `nats-streaming-server` on the back burner?

STAN is NATS Streaming. The clients don't have to be updated, as they've been forward compatible since JetStream. My understanding is that NATS Streaming will be deprecated after JetStream is GA. Is there a bug in the library you found?
You also don't have to worry about those kinds of configuration gotchas if you use Confluent Cloud!
Except you have to be a billionaire to use Confluent Cloud.
So I guess pay as you go is too expensive for your org? I'd be curious to understand why you feel it's too expensive when it completely abstracts away any Kafka management.
Relative to more mature Kafka cloud services, it's crazy expensive.
I believe NATS Streaming [0] and the upcoming NATS JetStream [1] could be more relevant to those who are looking for a Kafka alternative. They offer persistent messages and at-least-once delivery, similar to Kafka.

[0]: https://docs.nats.io/nats-streaming-concepts/intro
[1]: https://github.com/nats-io/jetstream

[nb: I work for the main company behind NATS]

Jetstream is GA with the 2.2.0 release. Folks who believe in waiting for "not .0" won't have to wait too much longer.

Using Kafka only for ordered pub-sub is like ordering tap water at French Laundry.
In that if you're already there and just want water, it's totally reasonable. It'll get you wet.

If you went there for tap water, yeah, maybe there are better options.

NATS is not a replacement for any use of kafka that I think is actually a good fit.
Have you seen Liftbridge, which was built on top of NATS to fit some of the more traditional Kafka use cases?

https://liftbridge.io/

NATS by itself doesn’t handle the persistence side at all (by design). It’s at-most-once delivery, not at-least-once.

That being said, have you checked out NATS Streaming Server? It’s effectively a first-party client for NATS that gives it at-least-once semantics and persistence, and makes it much more applicable to use cases that are currently on Kafka.

Docs here if you’re curious - https://docs.nats.io/nats-streaming-concepts/intro

There was a more recent 2.2 release with durable streams [0]. There is also a comparison page including Kafka [1].

[0]: https://docs.nats.io/whats_new_22
[1]: https://docs.nats.io/compare-nats

Is there any message ordering guarantee in NATS? With Kafka you can achieve this by using keyed messages: messages in the same partition will always be ordered.

Disclosure: I work for Confluent

Not the same guarantees, no:

> messages from a given single publisher will be delivered to all eligible subscribers in the order in which they were originally published. There are no guarantees of message delivery order amongst multiple publishers.

https://docs.nats.io/faq#does-nats-offer-any-guarantee-of-me...

I believe with JetStream, messages in a stream are ordered as they are written. JetStream has a concept of a consumer (in the broker itself, not the client), which can consume a subset of the stream, filtered by message subject.
I've never really understood the appeal of ordered messages. You end up splitting your data across partitions anyways for parallelism, so who cares? What systems out there require strictly ordered data? It seems like any design that requires something like that is going to be extremely brittle.
> You end up splitting your data across partitions anyways

Messages are ordered within partitions.

> What systems out there require strictly ordered data? It seems like any design that requires something like that is going to be extremely brittle.

TCP/IP ?

> Messages are ordered within partitions.

Right, but that means you're still "unordered" across those partitions?

> TCP/IP ?

But TCP/IP isn't delivered in order, it reorders the out-of-order packets by their sequence number. I guess ordered delivery would be nice for that, but I just feel like making your protocol not require ordering is far simpler.

Not to mention that both TCP and Kafka have to handle head of line blocking?

I'm not trying to say that ordering is bad or anything, I just feel like it isn't buying me tons.

> Right, but that means you're still "unordered" across those partitions?

Right, so related messages have an ordering guarantee but unrelated messages may be processed out of order relative to each other, which is usually what you want. (Of course you do have to set the record key correctly).

> I'm not trying to say that ordering is bad or anything, I just feel like it isn't buying me tons.

It's a lot more lightweight than full ACID, but if you get your dataflow right it achieves everything that a traditional database does. Without ordering you wouldn't be able to do anything that requires any kind of consistency.

Hm, ok, yeah. So I guess I can see what you mean. I've never had a use case where I felt comfortable relying on any kind of message ordering, and always rely on my application-level logic to handle that, or ensure the system is resilient despite ordering (i.e., commutative operations only).

To me, it seemed at odds with the parallelism of a partition, but I suppose in this case you'd be partitioning on some sort of semantic key vs, say, a hash.

Thanks for bearing with me on that, this was just an unfamiliar idea for me.
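
To make the "commutative operations only" idea concrete, here is a small sketch with made-up function names and data. Counter increments give the same result in every delivery order, while last-write-wins "set" operations do not, which is the case where an ordering guarantee (or application-level versioning) starts to matter.

```python
import itertools


def apply_increments(deltas):
    """Commutative: the final total does not depend on arrival order."""
    total = 0
    for delta in deltas:
        total += delta
    return total


def apply_sets(values):
    """Not commutative: the last value to arrive wins."""
    current = None
    for value in values:
        current = value
    return current


updates = [5, -2, 7]

# Try every possible delivery order and collect the distinct outcomes.
increment_results = {apply_increments(p) for p in itertools.permutations(updates)}
set_results = {apply_sets(p) for p in itertools.permutations(updates)}
```

Here `increment_results` collapses to a single value, while `set_results` contains one outcome per possible "last" message.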

> But TCP/IP isn't delivered in order.. I guess ordered delivery would be nice

> both TCP and Kafka have to handle head of line blocking

Well which is it?

(If TCP doesn't give you ordered delivery, why would a head block the rest of the line?)
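
A toy model may help reconcile the two statements. The sketch below is a simplification and assumes whole-message sequence numbers starting at 0 (real TCP numbers bytes and has windows, ACKs, and retransmission); `ReorderBuffer` is a hypothetical name. Segments can arrive in any order, but the application only sees them in order, so one missing segment holds back everything behind it.

```python
class ReorderBuffer:
    """Toy receiver that releases payloads strictly in sequence order."""

    def __init__(self):
        self.next_seq = 0    # next sequence number the application may see
        self.pending = {}    # out-of-order arrivals waiting for a gap to fill
        self.delivered = []  # what the application has actually received

    def receive(self, seq, payload):
        if seq < self.next_seq:
            return  # duplicate of something already delivered
        self.pending[seq] = payload
        # Release every contiguous payload starting at next_seq.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


buf = ReorderBuffer()
buf.receive(1, "b")
buf.receive(2, "c")
seen_while_blocked = list(buf.delivered)  # segment 0 is missing, so nothing is released

buf.receive(0, "a")
seen_after_gap_filled = list(buf.delivered)  # the gap is filled, so all three are released
```

So arrival is unordered, delivery to the application is ordered, and the head-of-line blocking is the price of turning the first into the second.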

You can split with a business case in mind.

Maybe if you're an e-business, you'll split everything happening on your website by client id, but still want events belonging to a single client to be received in order, for practicality.
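
A minimal sketch of that client-id split, under some loud assumptions: the event data and function names are invented, and `zlib.crc32` is only a deterministic stand-in for the hash (Kafka's default partitioner for keyed records uses murmur2). Every event with the same key lands in the same partition, so per-client order is preserved even though there is no global order across partitions.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_partitions):
    # Stand-in hash for illustration; not Kafka's actual partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events, num_partitions):
    """Append each (key, value) event to the partition chosen by its key."""
    partitions = defaultdict(list)
    for key, value in events:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


def values_for(records, key):
    return [value for k, value in records if k == key]


events = [
    ("client-1", "login"),
    ("client-2", "login"),
    ("client-1", "add_to_cart"),
    ("client-2", "logout"),
    ("client-1", "checkout"),
]
partitions = route(events, num_partitions=3)

# All of client-1's events sit in one partition, in the order they were produced.
client1_partition = partitions[partition_for("client-1", 3)]
client1_events = values_for(client1_partition, "client-1")
```

Parallelism then comes from having many keys, not from splitting a single key's events across partitions.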

Exactly. We ran into the same issue with Kafka. If one needs ordered messages for some reason, Kafka is pretty much useless for this.
NATS is just ephemeral at-most-once pub/sub. It needs NATS Streaming or the new JetStream for at-least-once persisted data, and even then it has different semantics.

Apache Pulsar offers the same distributed log with a fundamentally better architecture, but Kafka has closed most of the gaps now and has far more integrations and a bigger ecosystem.

Does NATS support (or plan to support) log compaction? Without that, it's very hard to replace Kafka.
I'm not sure I understand your question; NATS Streaming Server (built on top of NATS) supports persistence to disk, a raft group, a SQL table, etc., and it appears the various storage implementations have mechanisms [0] to delete or compress old data.

That being said, I don't think this is what differentiates the two systems, the guarantees they do/don't make are likely what will make the decision for your project.

[0]: https://github.com/nats-io/nats-streaming-server/blob/master...

Yes, jetstream allows setting max messages to 1 in a stream.
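
For context on what "log compaction" means in the question, here is a simplified sketch of the end result: only the newest record per key survives. It is not how either system implements it (Kafka compacts segments in the background and also handles tombstones, which this ignores), and the `compact` function and sample keys are made up for illustration.

```python
def compact(log):
    """Keep only the most recent record per key, preserving offset order."""
    latest = {}  # key -> (offset, value)
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_offset, value) in survivors]


log = [
    ("user:1", "a@example.com"),
    ("user:2", "b@example.com"),
    ("user:1", "c@example.com"),  # supersedes the first user:1 record
]
compacted = compact(log)
```

A consumer that replays `compacted` from the start ends up with the same final state per key as one that replayed the full `log`.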
And if you want closer Kafka semantics built on top of NATS, check out Liftbridge:

https://liftbridge.io/