by danielbln
1052 days ago
OP also clearly hasn't used ElevenLabs or similar tools. If you clone a professional narrator's voice, the result already sounds incredibly good and is effectively indistinguishable from a human. Giving the model acting directions to steer the output (kind of like ControlNet does for Stable Diffusion) seems like a logical next step.