The neat thing about this particular singing synthesizer is that it used a surprisingly sophisticated (especially for the 60s) physical model of the human vocal tract [1], and was perhaps the first use of physical modeling sound synthesis. Vowel shapes were obtained through physical measurements of an actual vocal tract via x-rays. In this case, they were Russian vowels, but were close enough for English.

While this particular kind of speech synthesis [2] isn't really used anymore, it's still fun to play around with. Pink Trombone [3] is a good example of a fun toy that uses a waveguide physical model, similar to the Kelly-Lochbaum model above. I've adapted some of the DSP in Pink Trombone a few times [4][5][6], and used it in some music [7] and projects [8] of mine.

For more in-depth information about specifically doing singing synthesis (as opposed to general speech synthesis) using waveguide physical models, Perry Cook's dissertation [9] is still considered to be a seminal work.

In the early 2000s, there were a handful of follow-ups to physically-based singing synthesis being done at CCRMA. Hui-Ling Lu's dissertation [10] on glottal source modelling for singing purposes comes to mind.

1: https://ccrma.stanford.edu/~jos/pasp/Singing_Kelly_Lochbaum_...

2: https://en.wikipedia.org/wiki/Articulatory_synthesis

3: https://dood.al/pinktrombone/

4: https://pbat.ch/proj/voc/

5: https://pbat.ch/sndkit/tract/

6: https://pbat.ch/sndkit/glottis/

7: https://soundcloud.com/patchlore/sets/looptober-2021

8: https://pbat.ch/wiki/vocshape/

9: https://www.cs.princeton.edu/~prc/SingingSynth.html

10: https://web.archive.org/web/20080725195347/http://ccrma-www....
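For anyone wondering what the Kelly-Lochbaum idea boils down to in code, here is a rough sketch of the core step: the vocal tract is treated as a chain of short tube sections, and at every boundary between two sections part of the travelling wave is reflected and part is transmitted, depending on the change in cross-sectional area. This assumes pressure waves and the textbook sign convention; the function names are made up for illustration, and this is not the actual code from Pink Trombone or any of the projects linked above.

```python
def reflection_coefficients(areas):
    """Reflection coefficient at each junction between adjacent tube sections.

    Assumes pressure waves, so k = (A_left - A_right) / (A_left + A_right).
    Equal areas give k = 0 (no reflection).
    """
    return [
        (a_left - a_right) / (a_left + a_right)
        for a_left, a_right in zip(areas, areas[1:])
    ]


def scatter(right_in, left_in, k):
    """One Kelly-Lochbaum scattering junction (one-multiply form).

    right_in: right-going wave arriving from the left section
    left_in:  left-going wave arriving from the right section
    Returns (right_out, left_out): the wave continuing to the right
    and the wave sent back to the left.
    """
    w = k * (right_in - left_in)
    return right_in + w, left_in + w


# Hypothetical three-section tube that widens at the second junction.
ks = reflection_coefficients([1.0, 1.0, 4.0])
right_out, left_out = scatter(1.0, 0.0, ks[1])
```

A full tract model repeats this at every junction on every sample, with a delay between junctions and some extra handling at the glottis and lips, but the per-junction arithmetic is about this small.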