> I told her then that the industry would be disrupted by AI before she retired.
Yes. I just discovered there is a text-to-speech addon [1] (now a few months old) for World of Warcraft that adds voices for every NPC in the game... It is so impressive and such a game changer (pun intended) that I naively asked in the chat of the Twitch stream I was watching "when did Blizzard add voices to the NPCs??". For an instant I really thought Blizzard had contracted actors, but no, someone like you and me just used AI to generate realistic voices for every character in the game. I don't think it's ready yet to completely replace actors in video games (surely it will be in the near future tho), but voice acting is so expensive that I can see studios and developers in 2024 already using this tech for all the optional dialogue and secondary characters' voices.
I've wondered at what point this would happen. I think it could now, but from what I've read the voice actor unions are able to prevent it currently (at least for AAA games or non-indie devs). Many of them have agreements/contracts in place for the foreseeable future, and being the first big company to replace them is a heap of terrible press that nobody is going to want to touch. I think it's the same reason Hollywood reached the AI agreement recently too.
It won't replace high-end talent; I don't think models will be able to replicate the nuance for a long time. However, the entire low-to-mid end of the market is going to get nuked from low earth orbit.
It won’t replace it, but it’s very likely to supplant it, just about destroying the segment by reducing demand by being good enough and so much cheaper, especially as people get more used to it.
Typesetting. Music engraving. Bookbinding. The quality of all these fields has been materially harmed by advancements.
Computer typesetting has, by and large, been a significant regression, though the gap has largely been made up now if you make the right choices.
Published music scores used to be set by experts. Now they’re set by novices using software that is mechanical in method and generally quite insipid. Most are atrocious compared to the old masters, and mediocre at best compared to the typical published scores from a hundred years ago; and very few popular scores are really good (… and if they are, there’s a reasonably high chance they’ve used GNU LilyPond, which has focused on this problem). But the barrier for entry is so much lower, and people have got used to the inferior results, so I don’t know if anyone engraves music the old way, and even people that know better largely just shrug and make do with the new. Like with computer typesetting, there is hope because things have slowly improved. But most will continue to be mediocre.
Books used to be bound with cold glue. It takes time to set, but the results are very good, supple and long-lasting. Then along came hot-melt glue, and it’s just so much friendlier for cheap manufacturing because books are finished within a few minutes instead of a day or two, that I don’t think anyone produces books the old way any more, even though the results are abysmal in comparison (compare the binding and reading experience of a paperback from the ’40s or ’50s with one from the turn of the century; no one after tasting the old will desire the new; for he says, the old is good). But they’re just (barely) good enough. Unlike the other two, I don’t think there’s any hope here—the regressive advancement crowded out the superior but dearer option so that no place was found for it.
You can still get relatively good published music scores from a few of the old German shops (Schirmer, Henle, etc.), but they are very expensive. They are a joy to use when playing, though, since the music is very clearly laid out and page turns are in the perfect place, etc. Finale and Sibelius are controllable enough that you can use them to do fantastic layout, but many people either do not understand how to make a score readable or don't care enough.
That, and what GP describes, is what I see as the overall trend of the market to hollow out the middle. It's not just about technology (though it plays a big role), but about all optimization coming from competitive pressure: materials, processes, business models, marketing.
What seems to universally happen is that the market bifurcates: one part is in a race to the bottom, while the other (much smaller) aims for the super-premium tier (overpriced quality), because only those two positions are sustainable once the race-to-the-bottom side drags all the economies of scale with it. So as a consumer, you get to choose between cheap low-quality garbage that's barely fit for purpose, and rare, super-expensive, professional/elite high-end products. There is no option for "good value for reasonable price".
This has been happening to everything - software, furniture, construction, electronics, vehicles, food, you name it.
I'm using AI for training videos for my startup. Never going back to voice actors outside of primary marketing videos. The sheer convenience of the write/listen/tweak cycle on scripts is insane. In minutes you can do a voiceover which would previously have taken hours, plus days of delay.
Sure, the final result sounds slightly robotic. 99% of people wouldn't care, and you can get more training videos done, faster, for a fraction of the cost.
[Edit] And I'll add that the difference between 6 months ago and today is noticeable. I imagine every 6 months we can just re-download updated voiceovers, and each time they will sound just slightly more polished.
I wonder which will happen first - AI evolves to work well at the high-end, or high-end humans retire and there’s nobody left in the low-to-mid end to fill their shoes…
It will absolutely replace high-end talent. Anything that a human can do will be able to be done 10x better by a model -- especially in such a narrow and well defined domain.
Did you hear the output examples? Yeah, I think not. I mean, definitely on the way, but there's no way if you need quality acting in your dub that you're going with this.
These are models specially tuned and sized for near real-time, instant translation. It would be naive to think that there aren't technical creatives building and training models tuned for expressiveness and nuance in a more controlled environment.
Maybe not in the current state of the model, but judging by the rate of improvement we’re all seeing it’s just a matter of time (and data+compute+research obv).
[1] https://www.curseforge.com/wow/addons/voiceover