While searching for an answer, I came across this list of papers, which has a section on speech synthesis: https://github.com/zzw922cn/awesome-speech-recognition-speec...