| The latter. For short-lived workflows, you may not care about updating; just let it finish. For longer jobs, you want some way to replace the current logic and either resume from where the job left off or restart it idempotently, especially if your workflow spans months or years (which at least some of these systems are designed for). The challenge is that these systems shine when you manage the job state in-memory, but they don't "store" the data in a traditional sense. They just re-run your logic and replay the original I/O results. So if your logic changes, the replay breaks and your state goes bye-bye. (I think of it similarly to React's "rules of hooks": you can't do anything that makes the function call key APIs in a different order than previous executions.) So you either accept that you can never update an in-flight job (in a meaningful way, at least), or you track job state in some other system and throw away the distinguishing feature of these systems. I'm curious how people normally handle this. When I worked with Azure Durable Functions I couldn't find a way around this. |
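To make the replay problem from the quote concrete, here is a toy sketch of how I understand it. This is not how Azure Durable Functions or any real engine is implemented; `ReplayContext`, `NonDeterminismError`, and both workflow functions are made-up names for illustration. The idea is only that recorded results are matched to calls by position, so inserting a step shifts everything after it.

```python
from typing import Any, Callable, List, Tuple


class NonDeterminismError(Exception):
    """Raised when replayed code asks for a different step than the one recorded."""


class ReplayContext:
    """Hypothetical context: replays recorded results, runs and records new steps."""

    def __init__(self, history: List[Tuple[str, Any]]):
        self.history = history  # recorded (step name, result) pairs, in call order
        self.cursor = 0

    def call(self, name: str, fn: Callable[[], Any]) -> Any:
        if self.cursor < len(self.history):
            # Replaying: reuse the recorded result instead of running fn again.
            recorded_name, result = self.history[self.cursor]
            if recorded_name != name:
                raise NonDeterminismError(
                    f"history has {recorded_name!r} at position {self.cursor}, "
                    f"but the code asked for {name!r}"
                )
        else:
            # Past the end of the history: really run the step and record it.
            result = fn()
            self.history.append((name, result))
        self.cursor += 1
        return result


def workflow_v1(ctx: ReplayContext) -> str:
    user = ctx.call("fetch_user", lambda: {"id": 7})
    return ctx.call("send_email", lambda: f"emailed {user['id']}")


def workflow_v2(ctx: ReplayContext) -> str:
    # The "updated" logic inserts a step before the email.
    user = ctx.call("fetch_user", lambda: {"id": 7})
    ctx.call("check_consent", lambda: True)
    return ctx.call("send_email", lambda: f"emailed {user['id']}")
```

Running `workflow_v1` against an empty history records two steps, and replaying `workflow_v1` against that history just returns the recorded results. Replaying `workflow_v2` against the same history fails at the second call, because the history holds `send_email` where the new code expects `check_consent`.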
I wonder if there could be an approach where you have both versions live simultaneously, and introduce some sort of "checkpoint" into the old version that would act like a DB migration. When re-computing a workflow you could then start from the latest checkpoint, but any workflows that were created with the old version and haven't reached a checkpoint yet would continue to run the old code until they do.
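Roughly what I have in mind, as a sketch only. Everything here (`Run`, `MIGRATIONS`, `LATEST`, `pick_version`) is hypothetical and not taken from any existing workflow engine. It assumes each run is pinned to a code version and that a checkpoint is a plain state snapshot that a migration function can transform.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

State = Dict[str, Any]


@dataclass
class Run:
    version: int                        # code version this run is pinned to
    checkpoint: Optional[State] = None  # state snapshot; None until one is reached


LATEST = 2

# Hypothetical migrations, keyed by the version they upgrade *from*.
MIGRATIONS: Dict[int, Callable[[State], State]] = {
    1: lambda state: {**state, "consent_checked": False},
}


def pick_version(run: Run) -> Run:
    """Upgrade a run only if it has a checkpoint to migrate from.

    Runs without a checkpoint stay pinned to their old version and keep
    replaying against the old code.
    """
    while run.version < LATEST and run.checkpoint is not None:
        migrate = MIGRATIONS[run.version]
        run = Run(version=run.version + 1, checkpoint=migrate(run.checkpoint))
    return run
```

So `pick_version(Run(version=1))` stays on version 1, while `pick_version(Run(version=1, checkpoint={"user_id": 7}))` comes back as a version 2 run whose state has been migrated. The new code would then start from that state instead of replaying the old history, which is the part that sidesteps the replay mismatch.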