by tehsauce 1239 days ago
How did they get to use the Joe Rogan voice, though? It seems that one isn’t public.
1 comment

It uses the TorToiSe TTS model for generation. It's simple to generate conditioning voice latents from a few short audio samples. Transcribed JRE episodes were likely part of the TorToiSe training data, which would explain why it's so good at recreating his voice characteristics in particular.
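
For anyone curious, here is a rough sketch of what computing latents from short clips could look like with the open-source tortoise-tts package. I'm not claiming this is what the project actually does. The helper names (`find_reference_clips`, `compute_voice_latents`) and the folder of `.wav` clips are my own assumptions for illustration; `TextToSpeech`, `get_conditioning_latents`, and `load_audio` come from the tortoise-tts package as I understand it.

```python
from pathlib import Path


def find_reference_clips(voice_dir):
    """Return sorted .wav paths found in voice_dir (hypothetical folder layout)."""
    return sorted(str(p) for p in Path(voice_dir).glob("*.wav"))


def compute_voice_latents(clip_paths):
    """Compute TorToiSe conditioning latents from a few short reference clips."""
    if not clip_paths:
        raise ValueError("need at least one reference clip")

    # Imported lazily: requires the tortoise-tts package, and constructing
    # TextToSpeech downloads model weights on first use.
    from tortoise.api import TextToSpeech
    from tortoise.utils.audio import load_audio

    # Load the reference audio resampled to 22.05 kHz, as in the project's README.
    clips = [load_audio(path, 22050) for path in clip_paths]

    tts = TextToSpeech()
    latents = tts.get_conditioning_latents(clips)
    return tts, latents
```

The returned latents can then be passed to something like `tts.tts_with_preset("some text", conditioning_latents=latents, preset="fast")` instead of the raw clips, so the voice only needs to be encoded once.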