HN Mirror

by dwohnitmok 288 days ago

From the article:

> By clicking or tapping on a point, you will hear a standardized version of the corresponding recording. The reason for voice standardization is two-fold: first, it anonymizes the speaker in the original recordings in order to protect their privacy. Second, it allows us to hear each accent projected onto a neutral voice, making it easier to hear the accent differences and ignore extraneous differences like gender, recording quality, and background noise. However, there is no free lunch: it does not perfectly preserve the source accent and introduces some audible phonetic artifacts.

> This voice standardization model is an in-house accent-preserving voice conversion model.

2 comments

hencq 288 days ago

I'm kind of curious whether it could use my own voice but decoupled from the accent. I.e., could it convert a recording of my voice to a different accent while still sounding like me? If so, I wonder if hearing yourself say things in a different accent would make accent training easier.

glandium 288 days ago

That would be interesting for sure, but considering you don't hear yourself the same way someone else or a mic does, I'm not sure it would have the benefit you're expecting.

hencq 288 days ago

Haha yeah not sure how useful it would be in practice, but mostly curious.

pinkmuffinere 288 days ago

Ah thanks, missed that somehow
