From @koopajah on Discord: "We don't really drop nines if one specific feature is having issues". Pretty convenient that payment processing is just a "feature" of a payment infrastructure SaaS.
This is a bullshit and untenable position. This “one specific feature” takes out the core feature. Shameful, deceitful and reputation-tarnishing position.
In an OKR culture, the way to make your KR good is to add lots of useless and very simple microservices, which are always up, so the overall metric is 100%.
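To make the dilution concrete, here is a rough sketch assuming the overall metric is an unweighted mean of per-service availability. That is my assumption for illustration, not a claim about how any real company computes it, and the numbers and the name `mean_availability` are made up.

```python
def mean_availability(availabilities):
    """Unweighted mean of per-service availability (each value in 0.0..1.0)."""
    if not availabilities:
        raise ValueError("need at least one service")
    return sum(availabilities) / len(availabilities)


# Hypothetical: the one service customers actually care about is at 99%.
core = 0.99

# Reported figure when the core service is the only thing measured.
core_only = mean_availability([core])

# Reported figure after adding 99 trivial services that never go down.
padded = mean_availability([core] + [1.0] * 99)

print(f"core only: {core_only:.4%}")
print(f"padded:    {padded:.4%}")
```

The core service is exactly as broken in both cases; only the denominator changed.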
I guess 99.999% uptime is no longer an engineer's well-earned badge of honor at Stripe. Now it's just sales puffery, like a LOWEST PRICES IN THE CITY!! sign at a discount store.
Sorry, but true class isn't just elegant rectangles decorated by various subtle shades of gray sans serif.
Yes, you either need to have a per-customer uptime number (that is visible to those customers) or you have one unified uptime number that takes a hit if any customers experience downtime. You can't have it both ways.
Why can't you maintain and report an average uptime across all service usage? So if you have an outage that affects 1% of traffic, it moves your figure 1/10th as much as one that affects 10% of traffic?
That's what I'd expect a reported number to be, since that's what a client experiences on average.
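A minimal sketch of that kind of traffic-weighted figure, assuming traffic is spread evenly over the reporting period and each outage is recorded as a duration plus the fraction of traffic it affected. The function name `weighted_availability` and the sample numbers are hypothetical.

```python
def weighted_availability(period_minutes, outages):
    """Availability weighted by the share of traffic each outage affected.

    outages: iterable of (duration_minutes, fraction_of_traffic_affected).
    Assumes traffic is uniform over the period (a simplification).
    """
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    # An hour-long outage hitting 1% of traffic counts as 0.6 "full" minutes.
    weighted_downtime = sum(minutes * fraction for minutes, fraction in outages)
    return 1 - weighted_downtime / period_minutes


MONTH_MINUTES = 30 * 24 * 60  # 30-day reporting period

# Same one-hour outage, affecting 1% versus 10% of traffic.
small = weighted_availability(MONTH_MINUTES, [(60, 0.01)])
large = weighted_availability(MONTH_MINUTES, [(60, 0.10)])

print(f"1% of traffic:  {small:.5%}")
print(f"10% of traffic: {large:.5%}")
```

Under these assumptions the 1% outage costs one tenth of what the 10% outage costs, which is the proportionality described above. With real request logs you would more likely count failed requests over total requests rather than assume uniform traffic.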