Hacker News new | ask | show | jobs
by ydnaclementine 1656 days ago
> Adding a timestamp to each message is an easy way for consumers to discard any out-of-order messages.

Not correct, but it's very easy to think timestamps will solve this. Timestamps aren't good because system times aren't synced across different computers precisely. Meaning if Producer A creates the first event, and Producer B creates a second event 50ms after (imagine a single row gets updated very quickly twice), but the system time on Producer A is 100ms ahead of Producer B and the event from Producer B gets to the consumers first (variable network latency), the event from Producer A will look like the latest event from a timestamp perspective and overwrite the Producer B event.

One way to solve is it to use not use timestamps, but use a monotonically increasing version number associated with a row that gets updated for every event/update or whatever and is sent along with the event message payload. The book, Designing data intensive systems, goes into this problem a whole lot. Recommend it to anyone discussing architecture. Issues like this will seem obvious to you after reading

5 comments

Yep, and there's no straightforward way to know if you missed a message by looking at a timestamp alone. If I receive a message at time "1" then time "3", how will I know I was meant to have seen something at time "2"?

Sequence numbering is how FIX does it. And I think its quite neat that it does this at a _protocol_ level. This means that a FIX client/engine will typically take care of sequence numbering, out-of-order detection (can happen during re-send requests), buffering any ahead-of-time messages, requesting gapfills, etc. It will only present your application layer code with in-order messages.

I'm not aware of any universal pattern for dealing with poison pill messages. Completeness detection and dealing with messages that crash your system are 2 separate concerns.

The implication in the FIX use case is that you should have a single writer/consumer of these events, at least for sequencing purposes. Attempting to coordinate sequencing across multiple threads (or computers) will just slow you down. Consider that a CAS or volatile write is ~10x slower on modern x86 than a single thread tearing through the same items without any contention. A 10x slowdown is pretty much the best-case scenario if you must have a contended resource shared between multiple threads. You only lose more orders of magnitude the further you go into this rabbit hole. The LMAX Disruptor is a good example of how we can rethink these sorts of problems and solve them in novel ways. I have linked a document that I think provides a really good perspective on this: [0].

The poison pill case is a business logic thing on either side of the queue. You can either do validation up-front before sticking requests into the queue, or after the fact when processing in batches. Either way, you will ultimately need to be able to handle problems on both sides of the fence. Something that doesn't look like "poison" on the way into the queue could become pretty nasty with certain emergent state as events are processed on the other side.

[0]: https://lmax-exchange.github.io/disruptor/disruptor.html

> Not correct, but it's very easy to think timestamps will solve this. Timestamps aren't good because system times aren't synced across different computers precisely.

They can be. That's what Google Spanner does, using GPS and atomic clocks. It's not hard or expensive. An atomic clock + GPS will set you back under $1000 https://www.ebay.com/itm/174750548607 for one in a nice box, $200 for a PCB https://www.ebay.com/itm/353611628534. Apparently it gets you to with 1e-11, which is about 10 pico seconds (I think).

But that doesn't solve the problem. Lets say you have two event producers, and both produce two events at times t and t+1. Once they both arrive, it's trivial to process then in the right order. Your problem is there is an unreliable network connecting you to these producers. Those atomic clocks can guarantee those events t and t+1 are distinguishable events if they are just 10ps apart, but how long do you wait for t to arrive before you decide to process the event at t+1 because there was no event at t. I can absolutely guarantee you whatever time you decide is reasonable, the universe will at some point screw you over and drop the event occurring at on your doorstep just after you decided to process t1.

Your issue isn't that it's difficult, it's is you are living in a state of sin if you believe the problem solvable given the premises.

PS: Google Spanner doesn't attempt to do the impossible. In Spanners case there is a single event producer, and Spanner is "merely" trying to record the event consistently across multiple nodes. If there are multiple event producers then it will serialise them in some order, but if there are two disconnected, independent Spanners out there processing the same events from the same producers Google is not claiming there would decide on the same order. That would need a God more powerful than Google.

> One way to solve is it to use not use timestamps, but use a monotonically increasing version number associated with a row that gets updated for every event/update or whatever and is sent along with the event message payload.

It is a concept that is known as vector clock. I suggest take a look at vector and Lamport clocks (named after Leslie Lamport), very useful in distributed systems using messages.

FWIW GPS clocks and PTP can keep clocks synchronized well below the us.
Yes, this bit me recently. Had to add an atomic index field instead, as some messages were arriving out of order because their timestamps were the same as other messages.