Like SSML? See Azure TTS, Google Cloud TTS, IBM Watson, or even old-school system TTS like the SAPI voices on Windows. But I hear you. In a typical VITS-based system, SSML isn't standard. Piper TTS does have it on the roadmap.
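For anyone who hasn't used it, here's a rough sketch of what SSML prosody markup looks like when you build it in Python. The `build_ssml` helper and its segment format are made up for illustration and aren't part of any engine's SDK. The `<speak>` and `<prosody>` elements and the `rate`/`pitch` attributes come from the W3C SSML spec, but engines differ in which attributes and values they actually honor, so check your engine's docs.

```python
import xml.etree.ElementTree as ET

# Namespace from the W3C SSML specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(segments, lang="en-US"):
    """Build an SSML string from (text, prosody_attrs) pairs.

    Hypothetical helper: a segment with an empty attrs dict is emitted
    as plain text, anything else is wrapped in a <prosody> element.
    """
    speak = ET.Element(
        "speak", {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang}
    )
    last = None  # most recent <prosody> element, used to attach trailing text
    for text, attrs in segments:
        if attrs:
            last = ET.SubElement(speak, "prosody", attrs)
            last.text = text
        elif last is None:
            # Plain text before any <prosody> element.
            speak.text = (speak.text or "") + text
        else:
            # Plain text after a <prosody> element goes in its tail.
            last.tail = (last.tail or "") + text
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml([
    ("Well, ", {}),
    ("that is", {"rate": "slow"}),
    (" not what I ", {}),
    ("expected", {"pitch": "+15%"}),
    (".", {}),
])
print(ssml)
```

You'd then hand the resulting string to whatever synthesis call your engine exposes for SSML input instead of plain text.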
I just want programmable prosody. Prosodic controls would allow much more believable TTS. Apple used to have them in its earlier TTS models. These new TTS models sound so natural at the phoneme level, but the prosody is often jacked up enough that the speech is easily identifiable as artificial.