HN Mirror


	by danielbln 1052 days ago
	OP also clearly hasn't used ElevenLabs or similar tools. If you clone a professional narrator, the result already sounds incredibly good and is effectively indistinguishable from a human. Giving the model acting directions to steer the output (much like ControlNet does for Stable Diffusion) seems like a logical next step.

1 comment

But in this case they want to avoid human input. So I guess it would instead work by reading the source voice and copying its intonation.