
One way our team has dealt with this question in a highly serverless-oriented and infrastructure-as-code-driven (how's THAT for buzzword soup) environment is to explicitly separate stateful resources from stateless, while exposing reference hooks in a configuration store to cross from one to the other. We've found that doing so _greatly_ reduces the blast radius of mistakes and lets us move more quickly and confidently.

The stateless stacks generally have a lot of development activity going on, and rapidly iterate. This is where most of our code and logic lives. This is where the vast majority of our deployment (and related cloud configuration) activity happens.

All of that thrash is kept away from the stateful stacks - think S3 buckets or DynamoDB tables - where, if THOSE thrash, we potentially get an outage at best, or lose data at worst (backups notwithstanding).

We DO NOT WANT stateless-oriented stacks to own the lifecycle of stateful stacks. They inherently need to be treated differently. Or at least, the impact of mistakes is different.

The trick comes when you need to tie them together. To do this, we've added CloudFormation hooks and other deployment-time logic that publish ARNs and other connectivity info to our configuration store. The stateless services look up those config values either during deployment or at runtime, and that's how they find the details they need to reference the stateful resources they depend on.
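A minimal sketch of that publish/lookup handoff, in Python. This is not our actual tooling: the `ConfigStore` class is an in-memory stand-in for whatever configuration store you use, and the key name, function names, and ARN are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigStore:
    """In-memory stand-in for a real configuration store (hypothetical)."""

    _values: dict[str, str] = field(default_factory=dict)

    def publish(self, key: str, value: str) -> None:
        self._values[key] = value

    def lookup(self, key: str) -> str:
        try:
            return self._values[key]
        except KeyError:
            # Fail loudly: a missing reference means the stateful stack
            # has not been deployed (or has not published) yet.
            raise LookupError(f"no config value published for {key!r}") from None


def deploy_stateful_stack(store: ConfigStore) -> None:
    # Stateful side: once the table exists, publish its ARN.
    # The ARN below is a placeholder, not a real resource.
    store.publish(
        "orders/table_arn",
        "arn:aws:dynamodb:us-east-1:123456789012:table/orders",
    )


def deploy_stateless_service(store: ConfigStore) -> dict[str, str]:
    # Stateless side: only resolves a reference to the table.
    # It never creates, replaces, or deletes the table itself.
    return {"ORDERS_TABLE_ARN": store.lookup("orders/table_arn")}


store = ConfigStore()
deploy_stateful_stack(store)
service_env = deploy_stateless_service(store)
```

The point of the shape is that the stateless deploy can be torn down and redeployed as often as you like, and the only thing it ever touches on the stateful side is a string it looked up.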

We've poked at toolsets like Amplify that lump everything together and have already been bitten numerous times. We've found that the difference between stateful and stateless resources should not be papered over, but instead emphasized and supported explicitly by tooling.

... all of this being one team's experience over the years, of course.

Very curious to see how this paradigm evolves here!

[edit]… Riffing on this just a little bit further… as I’m thinking about it here, it comes down to abstraction level. In a deployment or resource management domain, a generic “this is a cloud resource” isn’t very useful. What’s way _more_ useful is something like “this is a stateful resource” or “this is a stateless resource”, because that level describes resource behavior more clearly, AND how to interface with or manage those resources.
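As a rough illustration of what that abstraction level could look like in tooling (a sketch only; the enum, the kind names, and the policy function are hypothetical, and classifying Lambda functions as stateless is my assumption for the example):

```python
from enum import Enum


class Statefulness(Enum):
    STATEFUL = "stateful"
    STATELESS = "stateless"


# Hypothetical classification table. S3 buckets and DynamoDB tables are the
# stateful examples from above; the stateless entry is an assumed example.
RESOURCE_KINDS = {
    "s3_bucket": Statefulness.STATEFUL,
    "dynamodb_table": Statefulness.STATEFUL,
    "lambda_function": Statefulness.STATELESS,
}


def may_replace_on_deploy(kind: str) -> bool:
    """Only stateless resources may be torn down and recreated by a deploy."""
    return RESOURCE_KINDS[kind] is Statefulness.STATELESS


replaceable = [kind for kind in RESOURCE_KINDS if may_replace_on_deploy(kind)]
```

Once the classification is explicit, the management rule hangs off the category rather than off each individual resource type.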

There are echoes of code development principles here intentionally - robust cloud infrastructure management mirrors software dev practices as much as infrastructure management ones!