by zenexer 2539 days ago
As someone downstream of providers like Stripe who is on call for issues like this, that term is actually quite helpful to me. It tells me that I should be expecting delays and timeouts, and that some percentage of operations are likely to complete, whereas a complete outage likely means requests are failing immediately or failing to connect. This is important information when reviewing our options. During a full outage, aside from failover (when possible and not automated), we usually don’t need to take any action. When dealing with greatly increased error rates, it may be beneficial for us to disable the API on our end in order to avoid a lot of hung open connections and delayed responses for our users. We’d rather that operations fail immediately and completely instead of forcing users to wait around for operations that are unlikely to complete anyway.
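For anyone wondering what "disable the API on our end" can look like in practice, here is a rough sketch of a circuit breaker in Python. It is not our actual code, and every name in it (`CircuitBreaker`, `CircuitOpenError`, `max_failures`, `reset_after`) is made up for illustration. It assumes the wrapped call raises an exception on a timeout or upstream error. After enough consecutive failures it stops calling the provider and rejects requests immediately, then lets a single trial call through once a cool-down has passed.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and a call is rejected without being attempted."""


class CircuitBreaker:
    """Hypothetical fail-fast wrapper around calls to a degraded upstream API."""

    def __init__(self, max_failures=5, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds before a trial call
        self.clock = clock                # injectable so the breaker can be tested
        self.failures = 0
        self.opened_at = None             # None means closed (calls flow normally)

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still cooling down: fail immediately instead of hanging.
                raise CircuitOpenError("upstream disabled; failing fast")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the breaker.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success: close the breaker and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

You would wrap each upstream request in `breaker.call(...)`, with a short client-side timeout on the request itself, and turn `CircuitOpenError` into an immediate error for the user. Counting consecutive failures is the simplest trigger; a real setup would more likely look at the error rate over a time window, or just be a manual kill switch that whoever is on call flips.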