


	by andberx 107 days ago
	This is really cool. Voice cloning + translation in one pipeline is something a lot of content creators would pay for right now, especially for YouTube dubbing, where you want to keep the speaker's original personality. Are you handling speech-to-text, translation, and voice synthesis as separate steps, or is it more of an end-to-end model? Curious how you deal with things like pacing and intonation, which don't always carry over between languages.