by turnsout 1154 days ago
So the history prompts are collections of text/audio pairs?

The history is the semantic, coarse, and fine tokens. So it's essentially the same thing that gets generated, just used as an input before the generation.
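Rough sketch of that idea, in case it helps. It assumes three stages (semantic, coarse, fine) that each emit a token list. All the names here (`HistoryPrompt`, `generate_stage`, `generate`) are made up for illustration, and the "model" is just a seeded random generator, not the real thing:

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HistoryPrompt:
    # Hypothetical container for the three token streams of one generation.
    semantic: List[int]
    coarse: List[int]
    fine: List[int]


def generate_stage(prefix: List[int], n_new: int, seed: int) -> List[int]:
    # Stand-in for one model stage. A real model would condition on the
    # prefix tokens; here the prefix only shifts the seed of a random generator.
    rng = random.Random(seed + sum(prefix))
    return [rng.randrange(1000) for _ in range(n_new)]


def generate(history: Optional[HistoryPrompt], n_new: int) -> HistoryPrompt:
    # With no history, every stage starts from an empty prefix.
    if history is None:
        history = HistoryPrompt([], [], [])
    return HistoryPrompt(
        semantic=generate_stage(history.semantic, n_new, seed=1),
        coarse=generate_stage(history.coarse, n_new, seed=2),
        fine=generate_stage(history.fine, n_new, seed=3),
    )


first = generate(None, n_new=4)    # plain generation, no history
second = generate(first, n_new=4)  # same call, earlier output fed back in as history
```

The point is only that the output type and the history type are the same: whatever a generation produces can be saved and passed in as the prefix for the next one.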
So how do you clone an existing speaker's voice? That's the part I don't get.