| In other words: it wasn't built to be resilient. Is an exploding car safe because it was built by an early-stage startup? Deciding that implementing resiliency isn't a good business decision for an early-stage startup doesn't magically make the product resilient; it just isn't, and that may be OK. There are many ways to implement resiliency: running on multiple providers concurrently, having a plan for restoring service with a different provider in case one provider fails, setting up a contract with a sufficiently solvent provider so that they pay for your damages if they fail to provide the resiliency that you need, whatever. But if you fail to consider an obvious failure mode of a central component of your system in your planning, then you are obviously not building a resilient system. Edit: One more thing: > There's no indication anywhere from GCP themselves that a project could be a domain of failure. If asked, I doubt they would consider it as such. Then you are asking the wrong question, which is still your failure if you are responsible for designing a resilient system. If you ask them "Is a complete project expected to fail at once?", of course they will say "no". That's why you ask them "Will you pay me 10 million bucks if my complete project goes offline with less than one month advance warning?", and you can be sure you will get an answer to the problem that you are actually trying to solve. |