Good point! I'm just a humble Linux sysadmin dubbed "SRE" who slept through Stats for Engineers and now pays the price every week dealing with SWEs eager to blame me for their mistakes.
You were right; that was a case of Simpson's paradox. Every category saw its latency improve, but the overall statistic worsened. Jevons paradox is what caused the induced demand, but once the new usage data was gathered, the initial review was an example of Simpson's paradox.
Effect of the change -> Jevons paradox.
Measurement of Jevons paradox -> Simpson's paradox (in this case; that isn't a general rule).
The fact that the two are easily linked is one of the reasons the statistical paradox is so common in practice.
Latency improved for everyone, but overall average latency increased because usage grew faster in high-latency areas. That's Simpson's paradox. Simpson's paradox doesn't care where the subpopulations you're measuring came from.
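To make that concrete, here is a rough sketch with made-up numbers (the group names, latencies, and view counts are all hypothetical, not from the story). Both groups get faster, yet the traffic-weighted mean gets worse because the slow group's share of traffic grows:

```python
# Hypothetical numbers, chosen only to illustrate the effect.
# Each entry: group -> (mean latency in ms, number of page views)
before = {"fast": (100, 900), "slow": (1000, 100)}
after = {"fast": (80, 900), "slow": (600, 600)}

def overall_mean(groups):
    """Traffic-weighted mean latency across all groups."""
    total_time = sum(latency * views for latency, views in groups.values())
    total_views = sum(views for _, views in groups.values())
    return total_time / total_views

mean_before = overall_mean(before)  # 190.0 ms
mean_after = overall_mean(after)    # 288.0 ms

# Every group improved, but the overall mean went up.
for name in before:
    print(name, before[name][0], "->", after[name][0])
print("overall", mean_before, "->", mean_after)
```

The only thing that changed besides the per-group latencies is the mix: the slow group goes from 10% of views to 40%.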
If I recall the YouTube slow-internet optimisation case correctly, I think it is an example of Simpson's paradox. They made it faster for countries with fast internet, and faster for countries with slow internet, and then the average performance across all users/countries was slower, because now the countries with slow internet used YouTube much more than before.
It would be Simpson's paradox if Google services in Indonesia were initially slow because Indonesians tend to use YouTube more often than lighter services.
There wasn't an error in the conclusions of the initial measurement. It was the solution that had problems.
How does "Average and p95 latency actually increased after shipping the work to production. How does an objectively good change make things worse?" relate to Simpson's paradox again?
That's exactly it. After "shipping the work to production" (making it faster for everybody), the overall average and p95 got worse. Each sub-population experienced improvement: countries with fast internet got faster YouTube, and countries with slow internet got faster YouTube. But across all users YouTube was slower on average, because more users from the second sub-population now drag the overall average speed down (or latency up). That's Simpson's paradox.
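The p95 half of that can be sketched the same way. The request counts and latencies below are invented for illustration, and every request within a group is given the same latency to keep the arithmetic obvious. While slow connections make up less than 5% of requests, they sit above the 95th percentile; once they grow past that share, they are the 95th percentile:

```python
import math

# Hypothetical request latencies in ms, not real data.
before = [100] * 960 + [1000] * 40   # slow connections: 4% of requests
after = [80] * 960 + [600] * 240     # slow connections: 20% of requests

def p95(samples):
    """Nearest-rank 95th percentile of a list of latencies."""
    ordered = sorted(samples)
    rank = math.ceil(95 * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]

p95_before = p95(before)  # 100: the slow 4% are hidden above the cutoff
p95_after = p95(after)    # 600: slow requests now define the p95

print("p95", p95_before, "->", p95_after)
```

Both groups are faster in `after` (100 -> 80 and 1000 -> 600), but the p95 jumps because the slow group is no longer hidden in the top 5%.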
Ah, you may be right. It's not clear from the story that "Average and p95 latency actually increased after shipping the work to production." means the average across Indonesia and ex-Indonesia, and not just the Indonesian average.