	by rubayeet 2126 days ago
	Whoa! This is a game changer! I was looking into both Kafka and Google Pub/Sub for an event-oriented system my team was designing. Google Pub/Sub looked very promising, but the lack of an ordering guarantee was a deal breaker for us. I’ll consider this more strongly for the next system we build.

4 comments

doh 2126 days ago

We used Pub/Sub very extensively (50B messages a day) but moved to Pulsar [0]. It performs equally well and has some nice features. And also no vendor lock-in.

[0] https://kesque.com/billions-of-events-a-day-without-breaking...

atombender 2126 days ago

Pulsar seems operationally quite complex, as it has a dependency on both BookKeeper and ZooKeeper (which BK also needs). ZooKeeper is particularly notorious for being difficult. What's your experience been like?

doh 2126 days ago

It definitely is on the more complex side of management. That's why we partnered with Kafkaesque to do the maintenance for us. We were fine handling it ourselves but decided to outsource it as it's less critical for us than many other internal tasks.

They have an open ticket [0] to reduce the dependence on ZooKeeper, but as far as I know it's still pending.

[0] https://github.com/apache/pulsar/issues/572

seg_lol 2126 days ago

Given cloud vendors' low respect for their customers, the proper stance is to own your own infrastructure.

snypox 2126 days ago

The company I work for had the same stance 5 years ago. We regretted it a hundred times and now we stand the gaff. Nothing better than maintaining an EC2-based Cassandra cluster instead of simply using DynamoDB, huh...

seg_lol 2126 days ago

I don't believe I advocated for the use of Cassandra. Owning vs being owned by, we are not arguing about the same things.

Stances are not strategies. When we use another's API we form a bond, but the other is free to break it, so we are automatically in a weaker position. We have acquiesced. But if we make a strategically worse choice, we have not only acquiesced but done self-harm.

snypox 2126 days ago

I understand. I’m just saying that in my professional experience, nothing good came out of religiously avoiding lock-in; quite the opposite.

outsomnia 2126 days ago

With respect, clearly you cannot extrapolate that particular experience very far.

Otherwise, e.g., FOSS is "nothing good".

roland35 2126 days ago

I agree with this. The best part of the cloud is that it abstracts away a lot of the basic maintenance of these applications. There is lock-in to some platform no matter what you do, but hopefully you can design your application so that if you do need to rearchitect, you can do it in phases.

staticassertion 2126 days ago

What's a use case where strict ordering is critically important?

jerf 2126 days ago

"What's a use case where strict ordering is critically important?"

In general, as the use case grows: every use case where the developers did not make explicit and careful provision, with quite non-trivial effort, for ensuring that order is not important.

Even a lot of systems whose developers think they have no ordering dependencies are wrong in at least one subtle way without realizing it.

If you need to megascale, you're going to have to bite the bullet and build a system that can handle out-of-order delivery, but there are a lot of systems out there where you don't need megascale, and you can get rid of that "quite non-trivial effort" to deal with out-of-orderness by asking for messages to arrive in order.

To get a sense of just how useful that can be... bear in mind that every time you open a TCP socket instead of a UDP one, you just made exactly that choice, to use an ordered message system when you didn't "need" one. Take a look at everything you do with a TCP socket and think about trying to run it over UDP, and not with something like QUIC that basically adds half of TCP back onto it, but with UDP straight-up. Those are the kinds of things that can use in-order delivery... lots of things.

Almost everything can be simplified by guaranteed in-order delivery. It's just that some things can't afford the downsides.

staticassertion 2126 days ago

I can understand your post, but I don't quite buy the TCP thing. I don't think anyone is using TCP for ordering, they're using it because they don't want their packet dropped.

I guess all of the systems I build are just built to assume no order, or to leverage causal ordering, because that feels much easier to reason about. Enforcing ordering feels really hard, and like something that a message bus can only do some of the work of.

zeroxfe 2126 days ago

> I can understand your post, but I don't quite buy the TCP thing. I don't think anyone is using TCP for ordering, they're using it because they don't want their packet dropped.

Think of (almost) any modern protocol built on top of TCP, and you'll see that ordering is critical (HTTP, SMTP, telnet/SSH, etc.).

staticassertion 2126 days ago

Gotcha, ok so specifically protocols built on TCP.

heavenlyblue 2126 days ago

There's no ordering in HTTP. If you could send a whole HTTP request as a UDP packet you'd get exactly the same protocol (obviously sans WebSockets - but you could work around that).

duijf 2126 days ago

There are numerous valid HTTP payloads that are larger than a single TCP/UDP packet.

Ordering is important in these cases.

0xEFF 2126 days ago

How would you process an unordered response?

whatshisface 2126 days ago

I'm not sure that ordering doesn't matter for most TCP data. For example, HTTP depends on ordering. Any time you are transmitting messages larger than the size of a packet, you need some degree of ordering, even if it's only to reconstruct the individual messages, when you don't care about the order of the messages.
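
A minimal sketch of that reassembly point, assuming a hypothetical chunk format of (sequence number, bytes). The helper names `split` and `reassemble` are made up for illustration and do not model real TCP segments:

```python
def split(payload: bytes, size: int):
    """Cut a payload into (sequence_number, chunk) pairs of at most `size` bytes."""
    return [
        (seq, payload[start:start + size])
        for seq, start in enumerate(range(0, len(payload), size))
    ]


def reassemble(chunks):
    """Rebuild the payload by sorting chunks on their sequence number."""
    return b"".join(data for _, data in sorted(chunks))


payload = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
chunks = split(payload, 8)

# Simulate out-of-order arrival by reversing the chunk list.
arrived = list(reversed(chunks))

# Concatenating in arrival order scrambles the message.
naive = b"".join(data for _, data in arrived)

# Sorting on the sequence number recovers it.
rebuilt = reassemble(arrived)
```

Here `naive` differs from `payload` while `rebuilt` matches it, so some ordering information has to travel with the chunks even if the order between whole messages does not matter.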

trimbo 2126 days ago

Consider processing two events without guaranteed ordering:

- Create A

- Delete A

In one ordering, A is created and then deleted as expected; in the other, the delete fails, but then A is created and remains.
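
The two outcomes can be sketched like this, assuming a hypothetical in-memory store (a Python set) and a made-up `apply_events` helper:

```python
def apply_events(events):
    """Apply create/delete events in the given order.

    Returns the final store and the list of events that failed.
    """
    store = set()
    failed = []
    for op, key in events:
        if op == "create":
            store.add(key)
        elif op == "delete":
            if key in store:
                store.remove(key)
            else:
                # Nothing to delete yet, so the delete fails.
                failed.append((op, key))
    return store, failed


in_order = apply_events([("create", "A"), ("delete", "A")])
reordered = apply_events([("delete", "A"), ("create", "A")])

print(in_order)   # (set(), [])
print(reordered)  # ({'A'}, [('delete', 'A')])
```

Same two events, different final state: with the delete first, A is left behind.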

chimera77 2126 days ago

There are a couple of options without needing guaranteed ordering:

- jobs can have ever increasing ids, workers record the last seen id in one place, and ignore jobs with ids less than last seen

- job results are returned for each job to a supervisor. if a job result doesn't match current expected state, resend job. jobs should be idempotent in case a job is sent multiple times

If the create job is expensive, the latter solution would be less ideal, though.
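
A rough sketch of the first option, assuming a single worker that keeps the last seen id in memory. The `Worker` class and its fields are hypothetical, and it skips ids less than or equal to the last seen one so that duplicates are dropped too:

```python
class Worker:
    """Processes jobs and ignores any job older than the newest one seen."""

    def __init__(self):
        self.last_seen_id = 0
        self.state = {}
        self.ignored = []

    def handle(self, job_id, op, key):
        # Stale or duplicate job: record it and skip.
        if job_id <= self.last_seen_id:
            self.ignored.append(job_id)
            return False
        self.last_seen_id = job_id
        if op == "create":
            self.state[key] = True
        elif op == "delete":
            # pop with a default keeps the delete idempotent.
            self.state.pop(key, None)
        return True


worker = Worker()
# The delete (id 2) arrives before the create (id 1).
worker.handle(2, "delete", "A")
worker.handle(1, "create", "A")
```

After both calls `worker.state` is empty and `worker.ignored` is `[1]`. The end state matches the in-order result, but only because the late create is dropped rather than applied.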

manigandham 2126 days ago

Just ignoring tasks isn't helpful, it means you still process things out of order and just never get to the previous pending item in that case.

Much easier to just have strict order and process it off the line as it comes.

bgdam 2126 days ago

Creating ever increasing ids reliably at scale is not trivial. You will probably end up having a single server generating these ids which will then become a single point of failure.

caturopath 2126 days ago

(Or a distributed system - this is no trickier to migrate away from SPOF than anything else with global state.)

zaphirplane 2126 days ago

But when does the delete enter the queue and when does the Create enter the queue ;)

inopinatus 2126 days ago

That doesn’t necessarily require that the entire queue is totally ordered, but the alternatives (such as Virtual Synchrony) are still considered arcane / research topics.

staticassertion 2126 days ago

Kinda interesting, I built this just yesterday but with an async "buffer" and "write", and I just used a simple incrementing identifier system.

tyrust 2126 days ago

Does adding timestamps not handle this case?

jurre 2126 days ago

Now you need to queue up events for some time, reorder them using the timestamp, and then process them. It’s possible, but has overhead in both performance and custom code you’ll have to maintain. If there is no guarantee of order, two separate systems consuming those same events might also get different results, depending on the implementation, which can be problematic.
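
To give a feel for that custom code, here is a minimal sketch of a reorder buffer, assuming events carry a numeric timestamp and that nothing arrives more than `max_delay` time units late. The `ReorderBuffer` class is hypothetical, not part of any queueing library:

```python
import heapq


class ReorderBuffer:
    """Holds events for `max_delay` time units and releases them by timestamp."""

    def __init__(self, max_delay):
        self.max_delay = max_delay
        self._heap = []  # min-heap of (timestamp, event)

    def push(self, timestamp, event, now):
        """Buffer an event, then release whatever is old enough."""
        heapq.heappush(self._heap, (timestamp, event))
        return self.drain(now)

    def drain(self, now):
        """Release, in timestamp order, events older than the hold window."""
        ready = []
        while self._heap and self._heap[0][0] <= now - self.max_delay:
            ready.append(heapq.heappop(self._heap))
        return ready


buf = ReorderBuffer(max_delay=5)
released = []
released += buf.push(2, "delete A", now=3)  # arrives first, held back
released += buf.push(1, "create A", now=4)  # arrives second, held back
released += buf.drain(now=10)               # window elapsed, both released

print(released)  # [(1, 'create A'), (2, 'delete A')]
```

The cost is visible even in the sketch: every event waits at least `max_delay`, and an event that shows up later than the window still comes out in the wrong position.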

sumtechguy 2126 days ago

For a single process on one box with one thread you can use something like that.

If you involve more than 1 box, that goes out the window. Sometimes you can still get 'one timestamp' by making something else the owner of the timestamp. It also depends on your resolution of time and the process that does the ingesting. For example, if that ingest process has more than one thread to handle things, you can still get out-of-order events or events with the same timestamp if it is not coded correctly.

jedmeyers 2126 days ago

If events are generated by different processes you cannot really guarantee that time is exactly the same for them, unless you do something fancy to ensure that.

zaphirplane 2126 days ago

Interesting. Is the ordering here by when the event was generated or by when the event entered the queue? I think the latter, and so I think the examples here don’t apply without something on top and a trade-off.

jedmeyers 2126 days ago

The queue entrypoint is not always the same process either, especially in a system like Pub/Sub.

GordonS 2126 days ago

With a single producer and consumer, yes - but of course that's seldom the case.

With multiple producers and consumers, clock skew would be an issue, with the time on different machines being off from each other slightly.

One option is to use a single source for generating IDs, but that introduces another failure point, and comes at a hefty performance cost.
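
A small sketch of the clock-skew problem, with made-up numbers: two hypothetical machines, times in milliseconds, and the second machine's clock running 200 ms behind:

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    true_time_ms: int   # when the event really happened
    clock_skew_ms: int  # error of the producing machine's clock

    @property
    def stamped_ms(self):
        """Timestamp the producing machine would attach to the event."""
        return self.true_time_ms + self.clock_skew_ms


events = [
    Event("create A", true_time_ms=100_000, clock_skew_ms=0),
    Event("delete A", true_time_ms=100_050, clock_skew_ms=-200),
]

real_order = [e.name for e in sorted(events, key=lambda e: e.true_time_ms)]
stamped_order = [e.name for e in sorted(events, key=lambda e: e.stamped_ms)]

print(real_order)     # ['create A', 'delete A']
print(stamped_order)  # ['delete A', 'create A']
```

The delete really happened 50 ms after the create, but its timestamp says 150 ms before, so sorting by timestamp reverses the pair.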

dragonwriter 2126 days ago

> Does adding timestamps not handle this case?

If you have one message source (a single thread or some kind of coordination), and the messages have lower frequency than the timestamp resolution, yes.

The farther you get from that, the more the answer is no.

amelius 2126 days ago

Just make the source part of the timestamp?

(1,A) < (1,B) < (2,A) etc.

Also use serial numbers instead of true timestamps.
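
In Python that comparison falls out of tuple ordering. A minimal sketch, assuming each event carries a (serial number, source) pair; the event payloads are made up:

```python
# (serial, source, payload): serial numbers are per-source counters here.
events = [
    (2, "A", "delete x"),
    (1, "B", "create y"),
    (1, "A", "create x"),
]

# Tuples compare element by element, so the serial number decides first
# and the source name only breaks ties.
ordered = sorted(events, key=lambda e: (e[0], e[1]))

print(ordered)
# [(1, 'A', 'create x'), (1, 'B', 'create y'), (2, 'A', 'delete x')]
```

This gives every consumer the same total order, but the tie-break between (1,A) and (1,B) is arbitrary: it says nothing about which of the two actually happened first.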

rsynnott 2126 days ago

That way lies madness, and/or eventually accidentally writing your own Dynamo-paper database.

valleyjo 2126 days ago

That won’t work in all cases. For instance, if you get messages from devices which can be reimaged, they may have clock skew for a period of time before they’re synchronized again.

easytiger 2126 days ago

But in any case you can't rely on the order of message ingress to your system to represent anything meaningful either? It would have to ensure that the key for defining order would have some hard logical ordering purpose for which time is not relevant or useful.

jlokier 2126 days ago

The order of message ingress can still be meaningful even if device clocks are skewed or jump due to rebooting, reimaging, network time sync, frequency drift, etc.

A hard logical order arises from interactions. E.g. if the device receives a message, does something locally, goes through a clock change, and then sends a message dependent on one it fetched earlier, that's a logical order with an out-of-order clock.

Or if a device gets a message, processes, sends something to another device, that one processes too then sends another message back to the original source, there's a logical order but with three different clocks. Even if the clocks are synchronised, there will be some drift and the messages may be processed fast enough that the drift puts their timestamps out of order.
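
One well-known way to capture this kind of interaction-derived order is a Lamport logical clock. That technique is not named in the comment above, so treat this as an illustrative sketch with a hypothetical `LamportClock` class, not a description of any particular message bus:

```python
class LamportClock:
    """Logical clock: a counter that only moves forward and ignores wall time."""

    def __init__(self):
        self.time = 0

    def send(self):
        """Advance the counter and return the value to attach to a message."""
        self.time += 1
        return self.time

    def receive(self, message_time):
        """Jump past the sender's counter so the receipt orders after the send."""
        self.time = max(self.time, message_time) + 1
        return self.time


device_a = LamportClock()
device_b = LamportClock()

t1 = device_a.send()        # A sends a message
t2 = device_b.receive(t1)   # B receives it and processes
t3 = device_b.send()        # B sends a dependent message back
t4 = device_a.receive(t3)   # A receives the reply

print(t1, t2, t3, t4)  # 1 2 3 4
```

Each step in the causal chain gets a strictly larger number, whatever the devices' wall clocks say. The reverse does not hold: a larger number alone does not prove that one event caused another.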

baq 2126 days ago

every use case where you haven't proven it isn't.

human conscious thought is local and single-threaded. it takes a lot of experience and training to be able to intuitively reason about non-local multi-threaded computation. if you're smart and humble you can try to simplify the problem by making individual messages independent from each other by e.g. employing redundancy but you still have to be aware that it's even a problem to begin with.

kurttheviking 2126 days ago

I see it a lot when integrating legacy healthcare systems that effectively operate on state-transition queues with an assumption of in-order processing.

himoacs 2126 days ago

Financial transactions (withdrawing/depositing money or placing stock orders), credit card/payment use cases.

edoceo 2126 days ago

What did you pick? I'm interested in non-G solutions.

wngr 2126 days ago

Probably Kafka according to his comment.

himoacs 2126 days ago

If you are still looking, I would recommend Solace PubSub+: https://solace.com/products/event-broker/software/

Supports zero message loss, no headaches around topic partitions, in-order messaging, support for open APIs/protocols, in-memory AND persistent quality of service, support for event mesh, etc.
