We (Netflix) begged them for years to create a Chaos Monkey that we could pay for. There were things we just couldn't do ourselves, like simulate a power pull or just drop all network packets on the bare metal. I guess not enough people asked.
CMaaS sounds amazing for resiliency engineering. There's so much I want to be doing to perturb our stack, but I don't know all the ways stuff can go wrong. Sure I can ddos it, kick services and servers offline, etc, but that's what, a few dozen failure modes? Expertise in chaos would be valuable by itself. Not to mention being able to shake parts of the system I normally can't touch.
Side note: terraform is pretty good for causing various kinds of chaos, deliberately or otherwise.