Hacker News
by FactualOrion 1073 days ago
You will probably get better results using voice2voice models like RVC or Bark; that way the output stays on beat, for instance. I believe this is how most AI covers of songs are made. It's also easy to train an RVC model and use it on Google Colab. Training takes about 1-2 hours, and a 16-minute ebook can then be generated in about 170 seconds.
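To put those numbers in perspective, here is a rough back-of-the-envelope sketch of the implied speed. The function names are made up for illustration, and the projection for a longer book assumes generation time scales linearly with audio length, which is only a guess and not something measured here.

```python
def realtime_speedup(audio_minutes: float, generation_seconds: float) -> float:
    """How many seconds of audio are produced per second of compute."""
    return (audio_minutes * 60) / generation_seconds


def estimated_generation_seconds(audio_minutes: float, speedup: float) -> float:
    """Projected generation time, ASSUMING the speedup stays constant (linear scaling)."""
    return (audio_minutes * 60) / speedup


# Figures quoted in the comment: 16 minutes of audio in about 170 seconds.
speedup = realtime_speedup(16, 170)
print(f"{speedup:.1f}x faster than real time")  # 5.6x faster than real time

# Hypothetical 10-hour audiobook under the same (assumed) rate.
seconds = estimated_generation_seconds(10 * 60, speedup)
print(f"about {seconds / 60:.0f} minutes for a 10-hour book")  # about 106 minutes
```

So the quoted figures work out to roughly 5.6x real time, not counting the 1-2 hours of training up front.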