| HN Mirror


	by oezi 481 days ago
	Text-to-speech models still aren't trained on data rich enough to capture all the nuances needed for fully expressive speech. For example, most models can't change accent independently of language (e.g. English with a slight French accent), and most can't set emotional states such as excitement or sleepiness. And that's before we even get to laughing, singing, rapping, or beatboxing.