


	by tmshapland 249 days ago
	Fascinating! How did you decouple the speaker-specific vocal characteristics (timbre, pitch range) from the accent-defining phonetic and prosodic features in the latent space?

1 comment

oscarfree 249 days ago

We didn't, at least not explicitly. Because we fine-tuned this model for accent classification, the later transformer layers appear to ignore non-accent vocal characteristics. I verified this for gender, for example.
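
For anyone wanting to run a similar check, one common approach is a probe: fit a simple classifier on a layer's embeddings and see whether it can recover the attribute on held-out data. The sketch below is only an illustration of that idea, not the procedure used for this model. It uses a nearest-centroid probe on synthetic vectors, and the names `nearest_centroid_probe` and `synthetic_layer`, the 8-dimensional embeddings, and the `gender_shift` parameter are all hypothetical stand-ins for real per-layer activations.

```python
import random


def nearest_centroid_probe(embeddings, labels, train_fraction=0.5):
    """Fit a nearest-centroid probe on a train split; return held-out accuracy."""
    n_train = int(len(embeddings) * train_fraction)
    train_x, test_x = embeddings[:n_train], embeddings[n_train:]
    train_y, test_y = labels[:n_train], labels[n_train:]

    # Mean embedding per class, computed from the training split only.
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        # Pick the class whose centroid is closest in squared Euclidean distance.
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    correct = sum(predict(x) == y for x, y in zip(test_x, test_y))
    return correct / len(test_y)


def synthetic_layer(rng, labels, gender_shift, dim=8):
    """Toy embeddings (assumption): unit Gaussian noise, plus a label-dependent
    offset on dimension 0 that stands in for speaker-gender information."""
    return [
        [rng.gauss(0, 1) + (gender_shift * y if d == 0 else 0.0) for d in range(dim)]
        for y in labels
    ]


rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(400)]  # hypothetical gender labels

# Hypothetical "early" layer that still encodes gender, and a "late" layer that does not.
early = synthetic_layer(rng, labels, gender_shift=6.0)
late = synthetic_layer(rng, labels, gender_shift=0.0)

early_acc = nearest_centroid_probe(early, labels)
late_acc = nearest_centroid_probe(late, labels)
print(f"early-layer probe accuracy: {early_acc:.2f}")
print(f"late-layer probe accuracy:  {late_acc:.2f}")
```

With real data you would replace the synthetic vectors with pooled activations from each transformer layer. Probe accuracy near chance at a given layer suggests the attribute is not easily recoverable there, though a weak probe can miss information that a stronger one would find.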
