by kvasserman
12 days ago
I think of it this way. LLMs are supposed to be good at generating text/writing, right? Well, they are not very good at it. They generate plausible content that superficially makes sense. Most people can easily tell AI-generated slop from human writing. I suspect that mimicking the human voice is multiple levels more difficult for LLMs than mimicking human content. The level of nuance that humans produce in their speech is probably staggering. So I may be completely wrong, but I see no evidence so far to support the idea that either LLMs' writing or speaking is going to get much better any time soon.
|
Text is just human thoughts in their simplest form. Writing is about expressing ideas, and there is an almost infinite number of ways to express them. That is an extremely difficult task, and LLMs only "imitate" it to the best of their training.
This is not at all true for voice. There are an infinite number of possible voices, but a finite number of tones and phonemes you can use to express the text.
It's a much easier technical problem; it's just that it's much harder to gather proper data (you cannot just scrape Reddit and hope for the best, as is done for LLMs). And voice gets something like 1/100th of LLMs' funding.