| HN Mirror

	by IshKebab 412 days ago
	Impressive! I guess the speech synthesis quality is the best available in open source at the moment? Surely the endgame is a continuously running wave-to-wave model with no text tokens at all, or at least none in the main path?

1 comment

koljab 412 days ago

This is Coqui XTTSv2, because it can be tuned to deliver the first token in under 100 ms. It currently gives the best balance between quality and speed, imho. If it's only about quality, I'd say there are better models out there.
