Hacker News
by cloudking 1156 days ago
Wow, this is impressive. Any ideas why the first demo sounds a bit "hollow", like there is an echo? Thanks for sharing.
1 comment

Thanks! The model learns a lot from unsupervised (as well as supervised) audio, so technically low-quality and high-quality audio are both just as likely to the model as music, background sounds, or really anything else, including echoes or bad microphones :) It will be interesting to learn how to control for these things, either through prompting or other switches during training/inference.