Hacker News
by ascorbic 703 days ago
For those as confused as I was: it's not trying to match the accent of the target speech in those samples, just the timbre. To quote the paper:

> Voice conversion refers to altering the style of a speech signal while preserving its linguistic content. While style encompasses many aspects of speech, such as emotion, prosody, accent, and whispering, in this work we focus on the conversion of speaker timbre only while keeping the linguistic and para-linguistic information unchanged.