My app [0] currently uses a mildly customized version of FastSpeech 2 [1] with the LPCNet [2] vocoder, which I consider "good quality" at 16kHz. It runs faster than realtime on a mobile CPU (at least on anything from a mid-range 2017 device upwards; I can stream practically instantly on my iPhone 11). Using a different vocoder on a mobile GPU could probably be even faster (which I don't want to do, for various reasons), and desktop CPUs are usually faster still.
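For anyone wanting to check the "faster than realtime" claim on their own setup, here is a minimal sketch of measuring the real-time factor (synthesis time divided by the duration of the audio produced; below 1.0 means faster than realtime). The `fake_synthesize` function is a hypothetical stand-in, not the app's actual API: swap in whatever call returns your model's output samples.

```python
import time

SAMPLE_RATE = 16_000  # Hz, matching the 16kHz output mentioned above


def real_time_factor(synthesize, text, sample_rate=SAMPLE_RATE):
    """Return synthesis time divided by the duration of the generated audio."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    if len(samples) == 0:
        raise ValueError("synthesizer returned no audio")
    audio_seconds = len(samples) / sample_rate
    return elapsed / audio_seconds


def fake_synthesize(text):
    # Hypothetical stand-in for a real TTS call: emits silence,
    # 800 samples (50 ms at 16kHz) per input character.
    return [0.0] * (len(text) * (SAMPLE_RATE // 20))


rtf = real_time_factor(fake_synthesize, "Hello, world")
print(f"RTF = {rtf:.4f} ({'faster' if rtf < 1 else 'slower'} than realtime)")
```

With a real model you would want to average over several sentences and discard the first run, since warm-up tends to skew a single measurement.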
There are various other flavours that can deliver faster synthesis (NixTTS comes to mind), but IMO they sacrifice quality even further.
"Good quality" is subjective, obviously. To me, it's perfectly audible, but there's definitely a noticeable difference in quality compared to the heavier diffusion-based models. It's much less crisp and loses some of the more subtle inflections, plosives, etc. For my purposes (language learning), it's fine for the time being but eventually it would be nice to move to a higher-end model.
I used to work at Resemble.ai and we used models that did real-time synthesis. I don’t think it’s particularly difficult anymore, even without sacrificing quality.
If this text were in an ebook, my phone could read it aloud in real time. I'm using Cool Reader and Samsung's voices. They sound like TTS, but that's OK.
I'm sure there are ways to select any text and have my phone read it in any app, but I don't need that and I haven't investigated. Actually, I don't need it in ebooks either, but I know it's there and I checked that it works.
[0] https://polyvox.app
[1] https://arxiv.org/abs/2006.04558
[2] https://github.com/xiph/LPCNet/