| Hi! I'm the author of the essay. Durable execution is meant to complement your application; you will never want to model everything with it. It removes the need to decide how often to manually snapshot some important state, because snapshotting becomes implicit. Workflows in flawless can still fail: you could call the `panic` function or divide by zero. In the end, it's arbitrary compute. "External" state is one of the textbook examples for using durable execution. If you are interacting with 5 different services and calling 5 different API endpoints, you sometimes want transactional behaviour: all 5 systems should be left in a consistent state after your interaction. You can't call only 2 and stop. Durable execution and patterns like saga [1] are among the most straightforward ways (for me) to solve this. In flawless specifically, I try to give the user enough context about why things failed. It's very easy to reconstruct the whole computation from the log, and the user can then decide whether to re-run the workflow. If you charge someone's credit card but the call to extend their subscription fails (service down), you can't safely just re-run this. You have two choices: either you continue progressing and roll back the charge, or you fail and have someone look at it manually. In general, you want to use flawless in scenarios where the "called exactly once" guarantee is important. If you can just throw away the state and it's safe to re-run from the start, then you don't need flawless for this part of the app. The less state you have to care about, the better.
EDIT: The alternative would be to manually construct a state machine with a database. "Check if the credit card was charged. Call Stripe. I finished charging the credit card, save this information. Call the subscription service, it failed, restart everything. Check if the credit card was charged ...". Depending on your workflow, this can be a very complicated process where 90% of your code is just dealing with possible failures. It becomes especially tricky when failures happen right at the edge of a call. [1]: https://medium.com/cloud-native-daily/microservices-patterns... |
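A minimal sketch of the hand-rolled state machine described in the EDIT above. It assumes a plain dict stands in for the database, and `charge_card` and `extend_subscription` are hypothetical stand-ins for the Stripe and subscription-service calls; none of these names come from flawless or from Stripe's API.

```python
class ServiceDown(Exception):
    """Raised by a (hypothetical) downstream service that is unavailable."""


def run_workflow(db, charge_card, extend_subscription):
    # "Check if the credit card was charged."
    if "charge_id" not in db:
        charge_id = charge_card()
        # A crash exactly here, after the charge but before the write,
        # is the "edge" failure: the next run would charge a second time.
        db["charge_id"] = charge_id  # "save this information"

    if not db.get("subscription_extended"):
        extend_subscription(db["charge_id"])  # may raise ServiceDown
        db["subscription_extended"] = True
    return db


# Demo with fake services (assumptions for illustration only).
charges = []


def charge_card():
    charges.append("ch_1")
    return "ch_1"


attempts = {"n": 0}


def extend_subscription(charge_id):
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise ServiceDown("subscription service unavailable")


db = {}
try:
    run_workflow(db, charge_card, extend_subscription)
except ServiceDown:
    pass  # "it failed, restart everything"

# Second run: the persisted charge_id prevents a double charge.
run_workflow(db, charge_card, extend_subscription)
```

Even in this two-step case, most of the lines are bookkeeping around failure, and the window between the charge and the write is still unprotected. That window is the part a durable execution log is meant to cover.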
- Application requests a JWT token. It then crashes and gets restarted. On replay it gets past the problematic point, but later, when trying to make a request, it crashes because the cached token has expired.
- Application interacts with the current time in a meaningful manner. Due to the log replay, it will always live in the past, and when it switches from the cache-sourced time to the current time, some issues might occur, like deltas being larger than expected.
- Application goes through a webshop checkout flow. After the restart, some of the items in its cart have already been sold, but the app doesn't check this, since it already went through a (cached) check and got the result.
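The first two scenarios above can be reproduced with a toy replay log. This is only a sketch under stated assumptions: `ReplayLog` is a hypothetical class, not flawless's API, and a fake clock replaces the real one so the run is deterministic.

```python
class ReplayLog:
    """Toy durable-execution log: records each step's result and
    replays it, without re-running the step, after a restart."""

    def __init__(self, entries=None):
        self.entries = list(entries or [])
        self.cursor = 0

    def step(self, fn):
        if self.cursor < len(self.entries):
            result = self.entries[self.cursor]  # replayed, fn is NOT called
        else:
            result = fn()
            self.entries.append(result)
        self.cursor += 1
        return result


real_time = {"now": 0}  # fake wall clock (assumption for illustration)


def fetch_token():
    # Hypothetical token that is valid for 60 time units.
    return {"value": "t1", "expires_at": real_time["now"] + 60}


def workflow(log):
    token = log.step(fetch_token)
    started = log.step(lambda: real_time["now"])
    return token, started


# First run: both steps execute and are recorded.
first = ReplayLog()
token, started = workflow(first)

# Crash, then a restart much later using the persisted entries.
real_time["now"] = 500
second = ReplayLog(first.entries)
token2, started2 = workflow(second)

# The replayed token is the old one and has expired in the meantime.
token_expired = token2["expires_at"] < real_time["now"]
# The replayed "current time" lives in the past.
delta = real_time["now"] - started2
```

After the restart, `token2` is the same cached token as before, and the gap between the logged time and the real clock is the "larger than expected" delta from the second bullet.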