by tyri_kai_psomi 2360 days ago
NATS is an amazing project, and I just wanted to take the opportunity to highlight it in this comment for those first hearing about it. It's brilliantly simple, yet it changed the way I design distributed systems. I now handle almost all of the standard messaging guarantees that a Kafka-like system offers at the endpoints instead. As a result, systems are much simpler, and diagnosing bugs or edge cases is much more straightforward.
3 comments

NATS is amazing but note that it makes different promises than Pulsar. NATS doesn't offer true durability (in exchange for amazing performance and great simplicity) whereas Pulsar and similar are meant to survive certain partition or failure situations and not lose data.

It's not one or the other, they're just different tools.

There is nats-streaming-server as well, which offers true durability (via file or SQL store) and a streaming model very similar to Kafka and Pulsar. It can also run as a Raft cluster or in fault tolerance mode. It still has very good performance and is very simple to deploy and operate (I use it for event sourcing for real-time IoT data at my day job).
NATS Streaming has major scalability problems even if it's simple to deploy. It's only high-availability unless you use the Raft clustering, but that has been bolted on to the original project and isn't really well-designed.

The team is working on an entirely new system called JetStream to eventually replace it.

This sounds interesting. What exactly do you mean by 'endpoint' in this scenario? I looked into a few alternatives before settling on Pulsar, and disregarded NATS because it didn't seem to support message persistence. I didn't look into it too deeply though; maybe I should have. How do you guarantee no message is lost with NATS?
I think of an endpoint as something at either end of the communication channel (NATS in this case) where it is effectively terminal. Usually this is where the application logic lies. Derek Collison (creator of NATS) brings this up in many of his talks about NATS, but I think the source of his thinking might be “End-to-End Arguments in System Design” by Saltzer, Reed, & Clark.

The core of it is this point:

"Functions placed at low levels of a system may be redundant or of little value when compared with the cost of providing them at that low level."

That is, in order to get message redundancy, exactly-once delivery, or message persistence from the messaging layer, you pay a high cost, and you may be better off delegating those guarantees to the endpoints.
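
To make "delegating to the endpoints" concrete, here is a minimal Python sketch of one such guarantee handled at the receiver: deduplicating by message ID so that redelivered messages are applied only once. The class name, the message IDs, and the in-memory list of deliveries are all hypothetical stand-ins for illustration, not part of any NATS API.

```python
class DedupingEndpoint:
    """Hypothetical receiving endpoint that applies each message at most once."""

    def __init__(self):
        self.seen_ids = set()  # IDs of messages already applied
        self.total = 0         # application state updated by messages

    def handle(self, msg_id, amount):
        # A redelivered message (same ID) is recognised and skipped.
        if msg_id in self.seen_ids:
            return False
        self.seen_ids.add(msg_id)
        self.total += amount
        return True


endpoint = DedupingEndpoint()
# Simulated at-least-once delivery: message "m2" arrives twice.
deliveries = [("m1", 10), ("m2", 5), ("m2", 5), ("m3", 1)]
applied = [endpoint.handle(msg_id, amount) for msg_id, amount in deliveries]
```

In a real system the set of seen IDs would need to be bounded or persisted, which is exactly the kind of cost the endpoint can size to its own needs.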

This blog provides a good overview

https://blog.acolyer.org/2014/11/14/end-to-end-arguments-in-...

Here is the original paper

http://web.mit.edu/Saltzer/www/publications/endtoend/endtoen...

Thanks, much appreciated!
Did you check this?

https://docs.nats.io/nats-streaming-concepts/intro

"..Message/event persistence - NATS Streaming offers configurable message persistence: in-memory, flat files or database. The storage subsystem uses a public interface that allows contributors to develop their own custom implementations."

and

"At-least-once-delivery - NATS Streaming offers message acknowledgements between publisher and server (for publish operations) and between subscriber and server (to confirm message delivery). Messages are persisted by the server in memory or secondary storage (or other external storage) and will be redelivered to eligible subscribing clients as needed."

Also check out Liftbridge (https://liftbridge.io), which is a Kafka-like API on top of NATS.

Disclaimer: I'm the author of Liftbridge and a former core contributor to NATS and NATS Streaming.

I looked at Liftbridge when choosing a streaming platform for event sourcing, but the FAQ says it's not production ready. Is that still accurate?
Yes, for more on that see my reply here: https://news.ycombinator.com/item?id=21946939
No, I missed that. I think I've seen 'NATS Streaming', but didn't realize that it is its own distinct thing. All this makes more sense to me now, thanks!
Outbox pattern and NATS streaming
Additionally, you could always get reliable delivery over plain NATS by using the request/response pattern with acks and retries.
Is it me or does NATS look like it's aimed at an actor-based style of distributed system?
It's not necessarily aiming at that.
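
A rough Python sketch of that idea, assuming a transport whose request call returns the reply or None on timeout. FlakyTransport and request_with_retries are hypothetical names used to simulate lost messages in memory; with a real NATS client, the client's request call and its timeout would take the transport's place.

```python
class FlakyTransport:
    """Hypothetical stand-in for a lossy bus: drops the first few requests."""

    def __init__(self, handler, drop_first):
        self.handler = handler
        self.remaining_drops = drop_first

    def request(self, payload):
        # Returning None models a request that timed out without a reply.
        if self.remaining_drops > 0:
            self.remaining_drops -= 1
            return None
        return self.handler(payload)


def request_with_retries(transport, payload, max_attempts=5):
    """Resend until a reply (the ack) arrives or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        reply = transport.request(payload)
        if reply is not None:
            return reply, attempt
    raise TimeoutError(f"no ack after {max_attempts} attempts")


transport = FlakyTransport(lambda p: "ack:" + p, drop_first=2)
reply, attempts = request_with_retries(transport, "order-42")
```

Because a reply can be lost after the responder has already done the work, retries like this give at-least-once behaviour, so the responder should be idempotent.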