by IanCal 495 days ago
I think the key distinction is that there is no training data specific to that speaker. You can view the input as just the voice to clone, not as training examples.

It would be more like training examples if you had to give it specific phrases.