Hacker News
by quantumfissure 660 days ago
This was fascinating and led me down the "Oldest Companies" Wikipedia rabbit hole.

However, I was watching the second video linked in the article, made by Endevr, and it just sounds like it was written and read by ChatGPT. The pauses are off, and the inflections are exactly the same on some words, which sounds unnatural. Is it me, or does anyone else feel that way about it? This isn't a new trend, is it?

3 comments

AI-created voices do natural conversation pretty well, but many of the voices sold are developed from intentionally unnatural announcer speech. Online marketplaces became strong before the pandemic (and exploded during) and this meant many voice actors could get work and train themselves, and many of them tended toward a variety of unnatural announcer styles. This was partly because video narration was one of the largest (and lowest-paying) areas of work, and many of the buyers sought out inexpensive announcer styles.

When you buy an AI voice, there isn't (yet) much you can do to get it to vary its read, and many of the least-expensive voices you can buy are from this early, mostly self-taught voice talent. When I used to hire agents to train for one of the 78 Voice Acting Expo events we put on, they would tell me that most voice actors "lose their talent when they get an agent". Too many trainees feel that acting is not something you do; it is something used to train them, and once they are trained and graduate by getting an agent, they simply need to "open their mouth and gold will fall out". A similar reversion to the mean happens in other areas of training. People return to just "making an effort" after training. I remember my parents (father was an Army doctor) going to tennis camps, and being good for a few weeks afterwards, but not having much of what they learned "stick".

It's interesting to see AI immortalize this odd self-taught speech.

It's not just you. YouTube is littered with videos just like this one, with exactly the properties you describe. My guess is they are cheap to produce from stock footage, and the scripts rarely offer any deep insights but are mostly just regurgitated blurbs from Wikipedia. If narrated, it's almost always by the same voice, presumably synthesized so there are no voice actors to pay.

I guess it's good business from the ad income sharing, and at the very least it's slightly more interesting than people posting reactions to reaction videos.

Sadly, it does seem to be a trend, maybe produced with this kind of stuff: https://invideo.io/make/youtube-video-editor/