Hacker News
by dqv 969 days ago
You can totally do that. As you read the events, you can store and sort them in date order, then produce the state from the sorted order of events when you've finished reading the stream. There's nothing wrong with storing intermediary state before producing your final aggregate.
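
A minimal Python sketch of the read, sort, then fold approach described above. The `Event` fields, the "set"/"add" event kinds, and the `apply`/`build_state` helpers are all made up for illustration; the only point is that events are buffered and sorted by decision date before any state is produced.

```python
import datetime as dt
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    stream_vsn: int          # position in the stream (append order)
    decision_date: dt.date   # when the event took effect in the domain
    kind: str                # hypothetical payload: "set" or "add"
    value: int


def apply(state: int, event: Event) -> int:
    """Fold one event into the running state (an integer balance here)."""
    if event.kind == "set":
        return event.value
    return state + event.value


def build_state(stream) -> int:
    """Read the whole stream, sort by decision date, then fold."""
    buffered = list(stream)  # intermediate state: events as they were read
    # sorted() is stable, so events sharing a decision date keep stream order.
    ordered = sorted(buffered, key=lambda e: e.decision_date)
    state = 0
    for event in ordered:
        state = apply(state, event)
    return state


events = [
    Event(1, dt.date(2023, 10, 1), "set", 100),
    Event(2, dt.date(2023, 10, 10), "add", 5),
    Event(3, dt.date(2023, 10, 5), "set", 50),  # recorded late, decided earlier
]
print(build_state(events))  # 55: set 100, set 50, add 5
```

Folding in plain stream order would end on the late "set 50" and give 50 instead, which is why the sort has to happen before the state is built.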

It might mean you can't do snapshotting unless you add additional logic though - checking for the date of the last seen event and triggering a new snapshot due to the out-of-orderness of the event entry.

This is what I was thinking. Thanks for confirming it makes sense. I don't know why it seems like the kind of thing I'm sure there must be a ton of existing work and knowledge, but it's quite disconcerting when I can't find any of it.

I did think the same with snapshotting. I was thinking in the system the addition of an event would have to invalidate all subsequent snapshots (can be done quickly), then asynchronously recalculate those snapshots again using the new history. Or perhaps using the transaction time of events and snapshots to invalidate the snapshots (ie. if a snapshot was created before the most recently recorded event, according to transaction time, then the snapshot is invalid).
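
The transaction-time rule in the last sentence could be sketched roughly like this in Python. The `Snapshot` shape, the `recorded_at` field, and `is_valid` are assumptions made for the sake of the example, not anything from an existing library.

```python
import datetime as dt
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    stream_vsn: int
    recorded_at: dt.datetime  # transaction time: when the snapshot was written


def is_valid(snapshot: Snapshot, last_event_recorded_at: dt.datetime) -> bool:
    """A snapshot written before the newest recorded event is treated as stale."""
    return snapshot.recorded_at >= last_event_recorded_at


snap = Snapshot(stream_vsn=100, recorded_at=dt.datetime(2023, 10, 10, 12, 0))
print(is_valid(snap, dt.datetime(2023, 10, 9, 8, 0)))   # True
print(is_valid(snap, dt.datetime(2023, 10, 22, 8, 0)))  # False
```

Taken literally, this rule is conservative: any newly recorded event makes older snapshots stale, even when it arrives in order.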

>I don't know why it seems like the kind of thing I'm sure there must be a ton of existing work and knowledge, but it's quite disconcerting when I can't find any of it.

Yeah, I hate to say it, but a lot of the writing about ES is trying to steer you toward paying consultants to think these things up for you. The truth is that everyone is doing it their own way - there isn't a correct way to do it, only trade-offs.

The nice thing is that you always have your event log and so you can optimize projection/state building.

>I did think the same with snapshotting. I was thinking in the system the addition of an event would have to invalidate all subsequent snapshots (can be done quickly), then asynchronously recalculate those snapshots again using the new history. Or perhaps using the transaction time of events and snapshots to invalidate the snapshots (ie. if a snapshot was created before the most recently recorded event, according to transaction time, then the snapshot is invalid).

Yes, well, you can mark a snapshot as invalid if it was built after the new event's decision time. What you can do is jump back to an earlier snapshot and start processing events as of that snapshot's version. This way you can do something like:

(regular dates used for ease of reading)

    Snapshot(stream_vsn=90,  date=Date(2023,10,1), latest_decision_date=Date(2023,10,1))
    Snapshot(stream_vsn=100, date=Date(2023,10,10), latest_decision_date=Date(2023,10,7))
    Snapshot(stream_vsn=110, date=Date(2023,10,21), latest_decision_date=Date(2023,10,21))

So you get a new event with a decision date of 2023-10-08. You can invalidate the last snapshot, build from the second snapshot (then invalidate it), and leave the first snapshot as is. You can do build_snapshot(Snapshot(stream_vsn=100), all_events_after_vsn_100) as an optimization, since no event up to version 100 has a decision date later than the new event's, so none of them needs re-sorting.
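
A rough Python sketch of that selection, using the three snapshots above. `plan_rebuild` is a hypothetical helper, and it assumes two things the example only implies: `date` is the point in time the snapshot represents, and `latest_decision_date` never decreases as `stream_vsn` grows.

```python
import datetime as dt
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    stream_vsn: int
    date: dt.date
    latest_decision_date: dt.date


def plan_rebuild(snapshots, decision_date):
    """Return (base, stale) for a late event with the given decision date.

    base:  newest snapshot containing no event decided after the new event,
           so it is safe to replay from (None means replay from the start).
    stale: snapshots whose date is on or after the decision date, so they
           should have included the new event and need rebuilding.
    """
    base = None
    for snap in sorted(snapshots, key=lambda s: s.stream_vsn):
        if snap.latest_decision_date <= decision_date:
            base = snap
    stale = [s for s in snapshots if s.date >= decision_date]
    return base, stale


snapshots = [
    Snapshot(90, dt.date(2023, 10, 1), dt.date(2023, 10, 1)),
    Snapshot(100, dt.date(2023, 10, 10), dt.date(2023, 10, 7)),
    Snapshot(110, dt.date(2023, 10, 21), dt.date(2023, 10, 21)),
]
base, stale = plan_rebuild(snapshots, dt.date(2023, 10, 8))
print(base.stream_vsn)                # 100
print([s.stream_vsn for s in stale])  # [100, 110]
```

How ties are handled (a snapshot whose date equals the decision date) is a choice made here, not something the example settles.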