I recently searched for a TTS service and found WellSaid Labs. It’s a SaaS product, but the quality is astonishing. It’s also fast: rendering takes approximately 2 times the length of the audio file.
Here is an article from MIT Technology Review about it:
https://www.technologyreview.com/2021/07/09/1028140/ai-voice...