	by oidar 649 days ago
	The voice models for this are very good. I'd love to have granular control over the output of a model like this locally.

1 comments

willwade 649 days ago

Like SSML? See Azure TTS, Google Cloud TTS, IBM Watson, or even old-school system TTS like the SAPI voices on Windows. But I hear you. In a typical VITS-based system, SSML isn't standard. Piper TTS does have it on the roadmap.


oidar 649 days ago

I just want programmable prosody. Prosodic controls would allow much more believable TTS. Apple used to have it on the earlier TTS models. These new TTS models sound so natural at the phoneme level, but the prosody is often jacked up, so the output is easily identifiable as artificial.
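
For anyone wondering what SSML-style prosody control looks like in practice, here is a rough sketch in Python. It only builds the markup with the standard library; it does not call any TTS engine. The `build_ssml` helper and its tuple format are made up for illustration, and which `prosody` attributes an engine actually honors varies, so treat the attribute values as assumptions to check against your engine's docs.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(segments, lang="en-US"):
    """Build an SSML string from (text, prosody_attrs, pause_ms) tuples.

    Hypothetical helper for illustration only.
    prosody_attrs is a dict such as {"rate": "slow", "pitch": "+10%"}, or None.
    pause_ms is a pause to insert after the sentence (0 for none).
    """
    speak = ET.Element("speak", {
        "version": "1.0",
        "xmlns": SSML_NS,
        "xml:lang": lang,
    })
    for text, prosody_attrs, pause_ms in segments:
        # Wrap every segment in <s> so plain and styled text are handled alike.
        sentence = ET.SubElement(speak, "s")
        if prosody_attrs:
            styled = ET.SubElement(sentence, "prosody", prosody_attrs)
            styled.text = text
        else:
            sentence.text = text
        if pause_ms:
            ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml([
        ("I just want programmable prosody.", {"rate": "slow", "pitch": "+10%"}, 300),
        ("This sentence uses the engine defaults.", None, 0),
    ]))
```

The resulting string would then be passed to whatever engine accepts SSML input. Per-sentence `rate`, `pitch`, and `break` are about as granular as SSML gets, which is coarser than true phoneme-level prosody control.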
