by majke 1330 days ago
I'll bite. Neither. Both. Depending on system.

When the "state" is large, or changes often, you obviously can't send the full state every time: that would be too much for end nodes to process on every event, in both CPU (deserialization) and bandwidth. Delta is the answer.

Delta though is hard, since there always is an inherent race between getting the first full snapshot, and subscribing to updates.

Because delta is hard, for simple, small things that are not updated often, fat events carrying all the state might be okay.

There is a linear tradeoff on the "data delivery" component:

- worse latency saves cpu and bandwidth (think: batching updates)

- better latency burns more cpu and bandwidth
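
To make the tradeoff concrete, here is a minimal sketch of a batcher that trades latency for fewer, smaller sends. `CoalescingBatcher`, `max_delay` and the `sent` list are hypothetical names for illustration, not any real library's API. It assumes the caller passes in the current time, and it only checks the latency budget when something is published; a real system would also flush from a timer.

```python
from typing import Any, Dict, List, Optional


class CoalescingBatcher:
    """Hypothetical helper: keeps only the latest value per key until flushed."""

    def __init__(self, max_delay: float) -> None:
        self.max_delay = max_delay              # latency budget, in seconds
        self._pending: Dict[str, Any] = {}      # latest value per key
        self._first_at: Optional[float] = None  # time of oldest unsent update
        self.sent: List[Dict[str, Any]] = []    # stand-in for the network

    def publish(self, key: str, value: Any, now: float) -> None:
        if not self._pending:
            self._first_at = now
        # A newer value for the same key overwrites the older one,
        # which is where the CPU and bandwidth saving comes from.
        self._pending[key] = value
        if now - self._first_at >= self.max_delay:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            self.sent.append(dict(self._pending))
            self._pending.clear()
            self._first_at = None
```

With `max_delay=0.0` every update goes out on its own (best latency, most messages). With a larger budget, repeated updates to the same key collapse into one message.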

Finally, the receiver system always requires some domain-specific API. In some cases passing a delta to the application is fine, in some cases passing a full object is better. For example, sometimes you can save a re-draw by just updating some value, in other cases the receiver will need to redraw everything, so passing the full object is totally fine.

I would like to see a pub/sub messaging system that solves these issues: one where you can "publish" an object, select a latency goal, and "subscribe" to the event on the receiver, and let the system choose the correct delivery method. For example, the system might choose pull vs push, or an appropriate delta algorithm. As a programmer, I really just want to get access to the "synchronized" object on multiple systems.

3 comments

There's a third type of event:

- Entire Object.

You send the entire state of the entire object that changed. Irrelevant fields and all.

This makes business logic and migrations easier in dependent services. You can easily roll back to earlier points in time without diffing objects to determine what state changed. You don't have to replay an entire history of events to repopulate caches and databases. You can even send "synthetic" events to reset the state of everything that is listening from a central point of control.
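
A rough sketch of why the consumer side stays simple, assuming each event carries a `token`, a monotonically increasing `version`, and the whole object under `body` (this flat shape and the `EntityCache` / `force` names are made up for illustration):

```python
from typing import Any, Dict, Optional


class EntityCache:
    """Hypothetical consumer: the newest full object simply replaces the old one."""

    def __init__(self) -> None:
        self._store: Dict[str, Dict[str, Any]] = {}

    def apply(self, event: Dict[str, Any], force: bool = False) -> bool:
        """Store the event's object. Returns True if the cache changed.

        Stale or duplicate events are ignored unless force=True, which is
        how a "synthetic" reset or rollback event could be handled.
        """
        current = self._store.get(event["token"])
        if (not force and current is not None
                and event["version"] <= current["version"]):
            return False
        self._store[event["token"]] = event
        return True

    def get(self, token: str) -> Optional[Dict[str, Any]]:
        return self._store.get(token)
```

There is no diffing and no replay: handling an event is one overwrite, and redelivered or out-of-order events are harmless because of the version check.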

I've dealt with all three types of system, and this is by far the easiest one to work with.

How does this differ from a "fat event"?
Thin event (1): person object XYZ changed.

Thin event (2): address object ABC changed.

Delta event (1): person object XYZ's name changed to "Bob"

Delta event (2): address object ABC's address line 2 was deleted and zip code was changed to "12345"

Fat event (1): person object XYZ changed, and here's everything we think person-consuming systems will care about

Fat event (2): address object ABC changed, and here's everything we think address-consuming systems will need

Entire object (1): { person: { token: "XYZ", name: "Bob", email: "bob@bob.com", age: 42, likes: ["ice cream", ...], ... }, updated_at: T, updated_by: U, version: 3, ... }

Entire object (2): { address: { token: "ABC", line_1: "1234 Some Place Rd.", line_2: null, city: "Everywhere", state: "NA", zip: "12345", ... }, updated_at: T, updated_by: U, version: 5, ... }

Since the "fat event" ones are vaguely defined here, they could be arbitrarily close to or far from the "Entire object" cases. How does it differ? Maybe it does not.
Your team decides what parts of the model to expose in its events and it becomes an API in its own right.

You might change the names of fields, move them to places that don't reflect where they live on a nested model, etc. It requires a lot more thought and maintenance.

That isn't to say the choice can't be correct. All of these approaches have pros and cons.

>Delta though is hard, since there always is an inherent race between getting the first full snapshot, and subscribing to updates.

Since the deltas include a version identifier for what they should be applied on top of, you should always be able to safely start by subscribing to the deltas, then ask for the object. Buffer the deltas until your full copy is received, then discard deltas for previous versions until the stream applies to yours, applying them thereafter to keep it up to date.
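
The procedure above, as a minimal sketch. It assumes each delta carries a `base_version` (the version it applies on top of), that applying it yields `base_version + 1`, and that deltas arrive in order. `Replica`, `on_delta` and `on_snapshot` are hypothetical names, not a real API.

```python
from typing import Any, Dict, List, Optional


class Replica:
    """Hypothetical receiver: subscribe to deltas first, then fetch the snapshot."""

    def __init__(self) -> None:
        self.state: Optional[Dict[str, Any]] = None
        self.version: Optional[int] = None
        self._buffer: List[Dict[str, Any]] = []

    def on_delta(self, delta: Dict[str, Any]) -> None:
        if self.state is None:
            # Snapshot not here yet: hold on to the delta.
            self._buffer.append(delta)
        else:
            self._apply(delta)

    def on_snapshot(self, state: Dict[str, Any], version: int) -> None:
        self.state = dict(state)
        self.version = version
        buffered, self._buffer = self._buffer, []
        for delta in buffered:
            self._apply(delta)

    def _apply(self, delta: Dict[str, Any]) -> None:
        base = delta["base_version"]
        if base < self.version:
            return  # already reflected in the snapshot, discard
        if base > self.version:
            # A delta was missed; the safe move is to resync from a snapshot.
            raise RuntimeError("gap in delta stream")
        self.state.update(delta["changes"])
        self.version = base + 1
```

The order matters: because the subscription starts before the snapshot request, any change made while the snapshot is in flight is either already in the snapshot (and discarded by version) or sitting in the buffer.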

This omits the issues with "thin events" - it may be fine most of the time, but as it usually involves a "get more details" call over HTTP or of some other kind, it has more moving parts, and is therefore more prone to failures and slowdowns due to the extra coupling. This can kick in when load goes up or some other issue affects reliability, and cause cascading failure.