by Moldoteck 1052 days ago
"AI dubbing would never reach that level of acting, sounds awkward." - you should consider being more conservative in this direction)) I said the same thing about image generation 5 years ago, but the recent progress of Midjourney/DALL-E proved me wrong. We should not underestimate what heights human progress can reach.
1 comments

OP also clearly hasn't used ElevenLabs or similar tools. If you clone a professional narrator, it already sounds incredibly good and effectively indistinguishable from a human. Giving acting directions to the model to steer the output (kind of like ControlNet does for Stable Diffusion) seems like a logical next step.
But in this case, they want to avoid human input. So I guess it would more likely work by reading and copying the intonation of the source voice.