


	by sailingparrot 255 days ago
	> This voice standardization model is an in-house accent-preserving voice conversion model.

	Not sure this model works really well. As a native French/Spanish speaker, I can immediately recognize an actual French or Spanish person speaking in English, but the examples here are completely foreign to me. If I had to guess where the "French" accent was from, I would have guessed something like Nigeria. For example, Spanish speakers have a very distinct way of pronouncing "r" in English that is just not present here. I would have been unable to correctly guess French or Spanish for the ~10 examples present in each language (mayyybe 1 for French).

2 comments

vintermann 254 days ago

It's probably an artifact of them lumping together all varieties/dialects of a given language. I don't speak Spanish, but I know that the R is one of the things that's different in e.g. Argentina.


suddenlybananas 254 days ago

I wonder if they have a large population of African French speakers in the dataset?


ilyausorov 254 days ago

For sure the voice standardization model is not perfect, but it was important for us to do, especially for voice privacy. It’s still pretty early tech.
