by srazzaque 1656 days ago
Yep, and there's no straightforward way to know if you missed a message by looking at a timestamp alone. If I receive a message at time "1" then time "3", how will I know I was meant to have seen something at time "2"?

Sequence numbering is how FIX does it. And I think it's quite neat that it does this at a _protocol_ level. This means that a FIX client/engine will typically take care of sequence numbering, out-of-order detection (which can happen during re-send requests), buffering any ahead-of-time messages, requesting gap fills, etc. It will only present your application-layer code with in-order messages.
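
To make the gap-detection idea concrete, here is a rough sketch of what such a layer does. It is not FIX and not any real engine's API: `InOrderReceiver`, `deliver` and `request_resend` are names made up for illustration, and a real engine also deals with sessions, resets, and persistence.

```python
class InOrderReceiver:
    """Hands messages to the application strictly in sequence order.

    Illustrative only: `deliver` and `request_resend` are hypothetical
    callbacks supplied by the caller, not part of any FIX library.
    """

    def __init__(self, deliver, request_resend, first_seq=1):
        self.deliver = deliver
        self.request_resend = request_resend
        self.expected = first_seq          # next sequence number we may deliver
        self.pending = {}                  # seq -> message that arrived early
        self.covered_through = first_seq - 1  # highest seq received or already requested

    def on_message(self, seq, msg):
        if seq < self.expected:
            return  # duplicate of something already delivered; drop it

        if seq > self.expected:
            # Gap: hold the early message and ask only for the part of the
            # missing range that we have not already asked for.
            self.pending[seq] = msg
            start = max(self.expected, self.covered_through + 1)
            if start < seq:
                self.request_resend(start, seq - 1)
            self.covered_through = max(self.covered_through, seq)
            return

        # In order: deliver it, then drain anything buffered that is now contiguous.
        self.deliver(seq, msg)
        self.expected += 1
        while self.expected in self.pending:
            self.deliver(self.expected, self.pending.pop(self.expected))
            self.expected += 1
```

Feeding it sequence numbers 1, 3, 4, 2, 5 produces a single resend request for 2 and delivers 1 through 5 in order, which is the "time 1, then time 3" situation above, made detectable because the numbers are contiguous.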

I'm not aware of any universal pattern for dealing with poison pill messages. Completeness detection and dealing with messages that crash your system are 2 separate concerns.

1 comments

The implication in the FIX use case is that you should have a single writer/consumer of these events, at least for sequencing purposes. Attempting to coordinate sequencing across multiple threads (or computers) will just slow you down. Consider that a CAS or volatile write is ~10x slower on modern x86 than a single thread tearing through the same items without any contention. A 10x slowdown is pretty much the best-case scenario if you must have a contended resource shared between multiple threads. You only lose more orders of magnitude the further you go into this rabbit hole. The LMAX Disruptor is a good example of how we can rethink these sorts of problems and solve them in novel ways. I have linked a document that I think provides a really good perspective on this: [0].
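
A minimal sketch of the single-writer shape, with the caveat that it shows ownership only, not performance: the sequence counter lives in one thread, so it needs no lock or CAS. The hand-off here uses Python's `queue.Queue`, which is itself a lock-based contended structure (the Disruptor replaces exactly that part with a ring buffer). `sequencer`, `run_demo` and `STOP` are illustrative names.

```python
import queue
import threading

STOP = object()  # sentinel telling the sequencer thread to shut down


def sequencer(inbox, journal):
    """The only code that ever touches next_seq, so a plain int is enough."""
    next_seq = 1
    while True:
        event = inbox.get()
        if event is STOP:
            return
        journal.append((next_seq, event))
        next_seq += 1


def run_demo(producers=4, events_each=100):
    inbox = queue.Queue()
    journal = []
    seq_thread = threading.Thread(target=sequencer, args=(inbox, journal))
    seq_thread.start()

    def produce(pid):
        # Producers never assign sequence numbers; they only hand events over.
        for i in range(events_each):
            inbox.put((pid, i))

    workers = [threading.Thread(target=produce, args=(p,)) for p in range(producers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    inbox.put(STOP)  # every producer has finished, so this is the last item
    seq_thread.join()
    return journal
```

However the producers interleave, the journal comes out numbered 1..N with no gaps or duplicates, because only one thread ever increments the counter.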

The poison pill case is a business logic thing on either side of the queue. You can either do validation up-front before sticking requests into the queue, or after the fact when processing in batches. Either way, you will ultimately need to be able to handle problems on both sides of the fence. Something that doesn't look like "poison" on the way into the queue could become pretty nasty with certain emergent state as events are processed on the other side.
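
A toy example of the "fine on the way in, poison on the way out" case, assuming a made-up withdrawal event and a dead-letter list for quarantined events (`validate`, `apply_withdrawal` and `process_batch` are all hypothetical names):

```python
def validate(event):
    """Up-front, stateless check done before the event is enqueued."""
    amount = event.get("amount")
    return isinstance(amount, int) and amount > 0


def apply_withdrawal(event, state):
    """Business logic on the consumer side; depends on state built up so far."""
    if event["amount"] > state["balance"]:
        raise ValueError("insufficient balance")
    state["balance"] -= event["amount"]


def process_batch(events, state, dead_letters):
    """Apply each event; quarantine the ones that fail instead of crashing."""
    for event in events:
        try:
            apply_withdrawal(event, state)
        except Exception as exc:
            # Keep the event and the reason so it can be inspected or replayed.
            dead_letters.append((event, str(exc)))
```

With a starting balance of 100 and three withdrawals of 60, 60 and 30, every event passes `validate`, but the second one fails once the first has been applied. It lands in `dead_letters` and the batch carries on to the third.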

[0]: https://lmax-exchange.github.io/disruptor/disruptor.html