by petrochukm
2580 days ago
Doubt it. Generative adversarial networks (GANs) have had a lot of success in image generation, but the same cannot be said for speech synthesis. Unless they have figured out a new technique, they are probably using Tacotron 2 (https://ai.googleblog.com/2017/12/tacotron-2-generating-huma...). Google's Tacotron 2 already reached near-human quality, as measured by mean opinion score (MOS), without any adversarial training.