by atypicalstudio 1717 days ago
I'm a software engineer, but I work on a lot of projects that require recorded voice narration. I've already been able to replace the recorded narration on some small, low-budget projects with synthesised speech, but it definitely lacks the depth and nuance that a real actor brings. Is there anything specific you have seen that impressed you? The one that really stood out to me is 'deep fake' voice synthesis (though it relies on hours of captured performance first).
1 comment

Not yet. As you identify, it really is about the nuance, and that is why people like me get paid for what we do. There are some systems that try to add generic emotional states using a matrix of activation versus deactivation (how energetic the delivery is) and pleasantness, to generate that nuance. So you can end up with emotional states like "guilt" (unpleasant with moderate deactivation) and "elation" (pleasant with high activation). But as far as I know, the application of those states is not determined through machine learning; there is still human involvement to determine the emotional state and then try to generate an inflection that matches it.
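
To make the matrix idea concrete, here is a minimal sketch in Python of how such a two-axis lookup might be represented. It is not taken from any real speech system: the class name, the function name, the -1.0 to 1.0 scale and the specific coordinates are all assumptions chosen for illustration. Only the two labels and their rough positions ("guilt" as unpleasant with moderate deactivation, "elation" as pleasant with high activation) come from the description above.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class EmotionalState:
    # Both axes use an assumed scale of -1.0 to 1.0.
    pleasantness: float  # -1.0 = unpleasant, 1.0 = pleasant
    activation: float    # -1.0 = deactivated (low energy), 1.0 = activated


# Hypothetical coordinates, picked only to match the rough descriptions:
# guilt = unpleasant, moderately deactivated; elation = pleasant, highly activated.
STATES = {
    "guilt": EmotionalState(pleasantness=-0.7, activation=-0.4),
    "elation": EmotionalState(pleasantness=0.8, activation=0.9),
}


def nearest_label(pleasantness: float, activation: float) -> str:
    """Return the label whose point is closest to the requested one."""
    return min(
        STATES,
        key=lambda name: math.hypot(
            STATES[name].pleasantness - pleasantness,
            STATES[name].activation - activation,
        ),
    )


if __name__ == "__main__":
    # A human still picks the target point; the table only names it.
    print(nearest_label(0.9, 0.7))
    print(nearest_label(-0.5, -0.5))
```

Note that the sketch only names a point that somebody has already chosen. Deciding which point a given line of script should sit at is the part that still needs a person.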

Narration is as much about acting as having a nice voice. We've only just begun to get the computer voice to sound mostly human; getting it to act is another thing entirely.